docs: add Phase 4 implementation plan (vector retrieval + branching + polish)
15 tasks across 8 waves landing the Phase 4 deliverables per requirements doc §13 + §14:

- Vector retrieval via sqlite-vec (new external dependency)
- Branching UI (event log forks)
- Drawer-edit on every field (significance review, hide-from-view, surgical delete with cascade preview, branching affordances)
- Backup tooling (snapshot UX surface)
- Cross-chat search

Plus the 3 Phase 3.6 carry-over fixes (T90 bundle).

Wave structure:

- W1 (parallel 3-way): schema foundation + carry-overs
- W2 (parallel 3-way): embedding/search services
- W3 (parallel 2-way): branching + delete services
- W4 (single): combined retrieval ranking
- W5 (single): memory write hook + backfill
- W6 (single): drawer Phase 4 bundle (5 sub-features)
- W7 (parallel 2-way): snapshot UX + cross-chat search UX
- W8 (parallel 2-way): integration tests + docs

External dependency: sqlite-vec must be installed BEFORE Wave 1. Embedding model choice (384-dim default) pinned in T91 before dispatch since the migration hardcodes the dimension.

Schema baseline: 11 -> 13 (adds 0012_embeddings.sql + 0013_branches.sql). Task ids T88-T102 to avoid collision with prior phases.
@@ -0,0 +1,832 @@
# Roleplay Engine — Phase 4 Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task. Use the parallel-dispatch pattern documented under "Parallel-Execution Strategy" for parallel waves.

**Goal:** Land Phase 4 polish per requirements doc §13 + §14: vector retrieval, branching UI, drawer-edit on every field, backup tooling, significance review UI, surgical delete with cascade preview, hide-from-view soft delete, plus cross-chat search and the small Phase 3.6 carry-over fixes.

**Architecture:** Builds on Phase 3.5's stable base. Two new tables (`embeddings`, `branches`) and one external dependency (sqlite-vec extension). Embedding generation runs as a deferred async job — NOT inline with turns — so the play loop stays fast even when the embedding endpoint is slow. Branching is data-model-only at first (events + selectors); UI grafts on top. Surgical delete + cascade preview reuses the existing rewind-and-supersede plumbing. Cross-chat search piggybacks on the existing FTS5 + (now) vector retrieval.

**Tech Stack:**

- **NEW dependency: `sqlite-vec`** (or `sqlite-vss` — Phase 4 picks; recommended `sqlite-vec` for simpler load semantics and active maintenance). Add to `pyproject.toml`.
- **Embedding model selection** is part of T91 spec. Recommended default: a small model on Featherless (e.g., `BAAI/bge-small-en-v1.5` if available) or a local CPU-friendly model via `sentence-transformers`. Document choice in CLAUDE.md.
- Same as Phase 3 otherwise (Python 3.11+, FastAPI, HTMX, SQLite).

**Source-of-truth references:**

- Phase 4 scope: requirements doc §13 "Phase 4 — polish" + §14 "Open / Deferred Decisions".
- Behavioral details: §6 (prompt assembly + retrieval), §10 (rewind / regenerate / reset), §11 (compression + significance), §12 (snapshots).
- Conventions: [`CLAUDE.md`](../../CLAUDE.md) §"Behavioral defaults" + §"Phase 3 status" + §"Phase 3.5 status".
- Phase 3.5 cleanup plan (style, file-bundling pattern): [2026-04-26-v3.5-phase3.5-cleanup.md](2026-04-26-v3.5-phase3.5-cleanup.md).

---
## Pre-flight

**Branch:** create `phase-4` from the latest `main` after Phase 3.5 has merged (it has — main is at `1b66a28`):

```bash
git checkout main && git pull && git checkout -b phase-4
```

**Schema baseline:** Phase 3.5 leaves the DB at version 11. Phase 4 adds two migrations: `0012_embeddings.sql` and `0013_branches.sql`. Final schema version: 13.

**External dependency setup (BEFORE T88 dispatch):**

The controlling agent should add `sqlite-vec` to `pyproject.toml` and run `pip install -e .` (or equivalent) so all worktrees pick up the new dependency. Confirm `sqlite_vec` imports cleanly:

```bash
python -c "import sqlite_vec; print(sqlite_vec.__version__)"
```

If `sqlite-vec` can't be installed from PyPI when this plan executes, fall back to `sqlite-vss` and adapt T88/T92 accordingly. Both expose vector-search SQL via a loadable extension.

**Pinned non-negotiables (carried forward):**

- State changes go through the event log. Use `append_and_apply(conn, kind, payload)` for the live path; `apply_event` only after a fresh `append_event` returning the new id.
- Witness filter every memory read at SQL level (hard `WHERE` constraint; never a soft signal).
- Per-POV scene summaries — never write omniscient narration.
- TDD: every task starts with a failing test (or a regression test pinning existing contract before refactor).
- One commit per task minimum. Tasks that bundle multiple sub-features SHOULD split commits internally.

**Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion` — run the test command, paste actual output. Don't assume green.

---
## Phase 3.6 carry-overs folded in

Three small items from Phase 3.6 backlog are bundled into Phase 4's Wave 1 trivial-fixes task (T90):

1. `read_recent_dialogue` chat-id pushdown into SQL (T80 review nit)
2. Lifecycle warning wording in regenerate (T83.4 — "at-or-after turn X" tightening)
3. Legacy single-bot `record_turn_memory` consolidation (T84 review nit)

Three items remain DEFERRED beyond Phase 4 (Phase 4.5 if needed):

- Scene-close-on-cancel UX revisit (no action unless real play surfaces a regression).
- Cross-feature canned-queue brittleness (structured fixture builder for tests — not blocking).
- Full lifecycle-rollback in regenerate (warning log already shipped in T83.4; proper rollback needs schema-level back-references, deferred indefinitely).

---
## Parallel-Execution Strategy

Same pattern as Phase 3.5. Eight waves: parallel within each wave (file-disjoint), serial across waves.

### How to dispatch a wave in parallel

Use the **Agent tool with `isolation: "worktree"`** so each subagent gets its own git worktree. (If the controlling session's working directory is **not** the chat repo, create worktrees manually with `git worktree add .worktrees/<wave>-<task> -b <wave>/<task> phase-4` from inside the chat repo.)

Dispatch all tasks in a wave in a single message:

```
Agent({ description: "Wave 1 — T88 embeddings table", prompt: "...", isolation: "worktree" })
Agent({ description: "Wave 1 — T89 branches table", ... })
Agent({ description: "Wave 1 — T90 phase 3.6 carry-overs", ... })
```

### After a wave completes

1. Each subagent returns its worktree path and commit SHA(s).
2. **Run a spec + code-quality reviewer subagent on each completed task.** Combined review acceptable for trivial tasks (T90 carry-overs); separate spec + quality reviewers for vector-retrieval tasks (T91, T92, T96, T97) since the integration surface is wider.
3. **Merge the wave into `phase-4`** in any order (file-disjointness guarantees no conflict). Use `--no-ff`.
4. **Run the full test suite** on the merged `phase-4`. If red, the wave's mutual-independence assumption was violated — bisect, fix, re-merge.
5. **Push `phase-4`** to gitea.
6. Optionally clean up worktrees.

### Conflict prevention checklist

For each parallel wave, verify the **Files** sections of all tasks have **no overlapping paths**. Hot files in this plan: `chat/web/drawer.py` + `chat/templates/_drawer.html` (T98 only — bundled), `chat/state/memory.py` (T96 only), `chat/services/memory_write.py` (T90 + T97 — sequential), `chat/web/turns.py` (T98 only via delete affordance — sequential after T96).
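
The overlap check in this checklist is mechanical enough to script. The sketch below is one possible way to do it, not part of the plan's tooling: `find_overlaps` and the `wave_1` dictionary are hypothetical names, and the paths are a partial transcription of the Wave 1 **Files** sections for illustration.

```python
from itertools import combinations


def find_overlaps(wave_files: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (task_a, task_b, shared_paths) for every task pair sharing a path."""
    overlaps = []
    for (task_a, files_a), (task_b, files_b) in combinations(sorted(wave_files.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps.append((task_a, task_b, shared))
    return overlaps


# Illustrative input only: transcribe the real paths from each task's Files section.
wave_1 = {
    "T88": {
        "chat/db/migrations/0012_embeddings.sql",
        "chat/state/embeddings.py",
        "pyproject.toml",
    },
    "T89": {
        "chat/db/migrations/0013_branches.sql",
        "chat/state/branches.py",
    },
    "T90": {
        "chat/services/turn_common.py",
        "chat/services/regenerate.py",
        "chat/services/memory_write.py",
    },
}

for task_a, task_b, shared in find_overlaps(wave_1):
    print(f"CONFLICT {task_a} vs {task_b}: {sorted(shared)}")
```

An empty result means the wave can be merged in any order; any reported pair should be moved to separate waves or bundled into one task.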
### Why each wave is parallel-safe

| Wave | Tasks | Hot files touched | Disjoint? |
|------|-------|-------------------|-----------|
| 1 | T88, T89, T90 | new migrations + new state modules; T90 touches `turn_common.py` + `regenerate.py` + `memory_write.py` (additive only) | ✅ |
| 2 | T91, T92, T93 | new service modules (embeddings, vector_search, cross_chat_search) | ✅ |
| 3 | T94, T95 | new service modules (branching, delete_impact) | ✅ |
| 4 | T96 | `chat/state/memory.py` (combined retrieval ranking) | (single task) |
| 5 | T97 | `chat/services/memory_write.py` + new backfill script | (single task) |
| 6 | T98 | `chat/web/drawer.py` + `chat/templates/_drawer.html` (drawer Phase 4 bundle) | (single task) |
| 7 | T99, T100 | new files: `chat/web/snapshots.py` + `chat/templates/snapshots.html` (T99); `chat/web/search.py` + `chat/templates/search.html` + small chat.html top-bar addition (T100) | ✅ (disjoint) |
| 8 | T101, T102 | new test file (T101); CLAUDE.md + design doc (T102) | ✅ |

---
## Task overview

```
Wave 1 ─┬─ T88: embeddings table + projector handlers
        ├─ T89: branches table + projector handlers
        └─ T90: Phase 3.6 carry-overs trio (chat-id SQL pushdown + lifecycle wording + legacy-fn consolidation)

Wave 2 ─┬─ T91: embedding generation service (Featherless or local)
        ├─ T92: vector search service via sqlite-vec
        └─ T93: cross-chat search service (FTS over all owners)

Wave 3 ─┬─ T94: branch_from_event service (event-log fork, branch metadata)
        └─ T95: delete-impact computation service (cascade preview)

Wave 4 ─── T96: combined FTS + vector retrieval ranking in search_memories

Wave 5 ─── T97: memory_write enqueues embedding job + backfill script for existing memories

Wave 6 ─── T98: drawer Phase 4 bundle — branching UI + significance review + hide-from-view + surgical delete + remaining v1 edits

Wave 7 ─┬─ T99: snapshot UX (manual trigger, retention display, restore-from-snapshot UI)
        └─ T100: cross-chat search UX (top-bar input + search results page)

Wave 8 ─┬─ T101: cross-feature integration tests (vector × branching × delete × snapshot × search)
        └─ T102: Phase 4 documentation update
```

Critical path: 8 sequential merge points. Total tasks: 15. Parallelism: Waves 1, 2, 3, 7, and 8 each dispatch their tasks concurrently (3-way or 2-way). Waves 4, 5, 6 are single-task by hot-file constraint.

---
## Wave 1 — Schema foundation + Phase 3.6 carry-overs (parallel)

### Task 88: Embeddings table + projector handlers

**Files:**

- Create: `chat/db/migrations/0012_embeddings.sql`
- Create: `chat/state/embeddings.py`
- Create: `tests/test_embeddings_state.py`
- Modify: `pyproject.toml` (add `sqlite-vec` dependency — controlling agent should pre-install before dispatch; the worktree commits the dependency declaration)

**Spec:**

Adds the `embeddings` table that stores per-memory embedding vectors for vector retrieval. Uses `sqlite-vec` virtual-table syntax for cosine-similarity search. `vec0` ranks by L2 distance unless the column declares another metric, so check the installed sqlite-vec version for the `distance_metric=cosine` column option if true cosine ranking is required. Schema:

```sql
-- Load sqlite-vec extension at connection time (handled in chat/db/connection.py).
-- Embeddings are stored as blobs in a vec0 virtual table for fast similarity search.

CREATE VIRTUAL TABLE embeddings USING vec0(
    memory_id INTEGER PRIMARY KEY,
    embedding FLOAT[384]  -- 384-dim default; adjust per chosen model
);

-- Sidecar table for non-vector metadata (model used, dim, indexed_at).
CREATE TABLE embeddings_meta (
    memory_id INTEGER PRIMARY KEY,
    model TEXT NOT NULL,
    dim INTEGER NOT NULL,
    indexed_at TEXT NOT NULL DEFAULT (datetime('now')),
    FOREIGN KEY (memory_id) REFERENCES memories(id)
);
```

(If `sqlite-vss` is chosen instead, replace `vec0` with `vss0` and adapt the dim declaration. Both have similar Python loading semantics.)

**`chat/state/embeddings.py`:**

- `@on("embedding_indexed")` payload `{memory_id, model, dim, vector: list[float]}`. Inserts into both `embeddings` and `embeddings_meta`. Idempotent via `INSERT OR REPLACE` (re-indexing a memory replaces the prior vector). Verify that `vec0` accepts `INSERT OR REPLACE`; if it does not, delete the old row before inserting.
- `@on("embedding_deindexed")` payload `{memory_id}`. Deletes from both tables. Used when a memory is purged via reset/cascade.
- Reader `get_embedding_meta(conn, memory_id) -> dict | None` returns the meta row.
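
The `embedding_indexed` handler has to turn `vector: list[float]` into something the vector table can store, and it is a natural place to reject vectors whose length disagrees with the 384 hardcoded in the migration. The sqlite-vec Python package ships a `serialize_float32` helper for the packing step; the stdlib-only sketch below shows the same idea so it can run without the extension. `pack_float32`, `unpack_float32`, and `validate_payload` are hypothetical names, not functions the plan defines.

```python
import struct

EXPECTED_DIM = 384  # must match the FLOAT[384] declaration in 0012_embeddings.sql


def pack_float32(vector: list[float]) -> bytes:
    """Pack floats into a little-endian float32 BLOB (4 bytes per element)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_float32(blob: bytes) -> list[float]:
    """Inverse of pack_float32; values come back at float32 precision."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def validate_payload(payload: dict, expected_dim: int = EXPECTED_DIM) -> bytes:
    """Check an embedding_indexed payload and return the packed vector.

    Raises ValueError when the declared dim or the actual vector length
    disagrees with the dimension the migration was created with.
    """
    vector = payload["vector"]
    if payload["dim"] != expected_dim or len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim}-dim vector, got dim={payload['dim']} len={len(vector)}"
        )
    return pack_float32(vector)
```

Failing fast here keeps a model swap from silently writing vectors the table cannot compare.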

The `chat/db/connection.py` `open_db` helper needs to load the sqlite-vec extension on each connection. Add:

```python
import sqlite_vec

# Inside open_db, after connection is opened:
conn.enable_load_extension(True)
sqlite_vec.load(conn)
conn.enable_load_extension(False)
```

This is a small modification to `connection.py`. Include it in T88's diff.

**Tests:** 3 minimum.

1. `test_embedding_indexed_inserts_row`: append `bot_authored`, `chat_created`, `memory_written` (creates a memory), then `embedding_indexed` with `vector=[0.1] * 384`. Project. Assert `embeddings_meta` row exists for that memory_id with the right model.
2. `test_embedding_deindexed_removes_row`: same setup; index then de-index; assert row is gone.
3. `test_vector_similarity_search_returns_nearest`: index two memories with distinct vectors; query for nearest neighbor of one vector; assert correct memory_id returned. Uses `sqlite-vec`'s `MATCH '...'` syntax (verify against actual sqlite-vec docs; adapt if needed).

If running tests requires sqlite-vec to be loaded, the test fixture may need to skip / xfail when the extension isn't installed. Use `pytest.importorskip("sqlite_vec")` at the top of the test file.

**Commit:** `feat: embeddings table + projector handlers via sqlite-vec (T88)`.

**Notes:**

- Schema version after migration alone: 12. T89 adds 0013, taking final to 13. The schema_version assertion in `tests/test_world.py` updates to 13 in the wave-merge step.
- The `connection.py` change is small but cross-cutting — affects every `open_db` call. Verify the existing 343 tests still pass after the change.

---
### Task 89: Branches table + projector handlers

**Files:**

- Create: `chat/db/migrations/0013_branches.sql`
- Create: `chat/state/branches.py`
- Create: `tests/test_branches_state.py`

**Spec:**

Adds the `branches` table that records named alternate event-log forks. A branch is metadata: a name, an `origin_event_id` (the event we forked from), and a `head_event_id` (the latest event in this branch). The event log itself is unchanged — the branch table just **labels** linear ranges of event ids.

```sql
CREATE TABLE branches (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    origin_event_id INTEGER NOT NULL,
    head_event_id INTEGER NOT NULL,
    chat_id TEXT,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    is_active INTEGER NOT NULL DEFAULT 0
);

-- At most one row may have is_active = 1 at any time.
CREATE UNIQUE INDEX branches_active_idx ON branches(is_active) WHERE is_active = 1;
```

The "main" branch is implicit and bootstrapped by the migration: `INSERT INTO branches (name, origin_event_id, head_event_id, is_active) VALUES ('main', 0, 0, 1);`. Subsequent branches reference an `origin_event_id` (the event that the branch forked from).

`chat/state/branches.py`:

- `@on("branch_created")` payload `{name, origin_event_id, chat_id?, head_event_id}`. Inserts a new row with `is_active=0`. Idempotent re-insertion via `INSERT OR IGNORE`.
- `@on("branch_switched")` payload `{name}`. Sets `is_active=1` on the named branch and `is_active=0` on all others. Atomic via a single UPDATE (if the partial unique index rejects the intermediate state, clear the old flag and set the new one inside one transaction instead).
- `@on("branch_head_updated")` payload `{name, head_event_id}`. Updates `head_event_id` on the named branch. Used by the orchestrator when new events extend the branch.
- Readers: `get_branch(conn, name)`, `list_branches(conn, chat_id=None)`, `active_branch(conn)`.

**Tests:** 3 minimum.

1. `test_branch_created_inserts_row`: append `branch_created` with name="experiment", origin_event_id=42; project; assert `get_branch(conn, "experiment")` returns the row.
2. `test_branch_switched_atomic`: seed two branches; switch from one to the other; assert exactly one is active.
3. `test_main_branch_bootstrapped_by_migration`: open a fresh DB, apply migrations; assert `active_branch(conn)["name"] == "main"`.

**Commit:** `feat: branches table + projector handlers (T89)`.

**Notes:**

- Schema version after this migration: 13 (T88's 0012 brings it to 12; T89's 0013 stacks on top). Wave-merge bumps `tests/test_world.py` schema_version assertion to 13.
- This task does NOT yet teach the orchestrator to consult `is_active` — the existing event_log queries assume a single timeline. T98 (drawer branching UI) will enable user-driven switches, but the actual "follow only the active branch" filter on event reads is a follow-up (Phase 4.5 nit; document in T102 docs sweep).

---
### Task 90: Phase 3.6 carry-overs trio

**Files:**

- Modify: `chat/services/turn_common.py` (push chat_id filter into SQL)
- Modify: `chat/services/regenerate.py` (lifecycle warning wording tightening)
- Modify: `chat/services/memory_write.py` (consolidate legacy `record_turn_memory` into the unified API or delete it)
- Modify: `tests/test_turn_common.py`, `tests/test_regenerate.py`, `tests/test_memory_write.py`

**Spec:** Three small Phase 3.6 carry-over fixes bundled because each is 1-line + 1-test.

#### 90.1 — `read_recent_dialogue` chat-id SQL pushdown

Per T80 review nit. Currently `read_recent_dialogue` filters chat_id post-fetch in Python. Push into SQL for tighter LIMIT semantics:

```sql
SELECT id, kind, payload_json
FROM event_log
WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn')
  AND superseded_by IS NULL
  AND hidden = 0
  AND json_extract(payload_json, '$.chat_id') = ?
ORDER BY id DESC
LIMIT ?
```

Then the post-fetch loop becomes a simple reverse + slice — no chat_id check needed.
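
To make the LIMIT semantics concrete, the following sketch runs the query above from Python with the stdlib `sqlite3` module. It assumes the SQLite build includes the JSON functions (true for most modern Python distributions) and that `event_log` has the columns the query names. `read_recent_dialogue_rows` is a hypothetical stand-in, not the real `read_recent_dialogue` signature.

```python
import json
import sqlite3

RECENT_DIALOGUE_SQL = """
SELECT id, kind, payload_json
FROM event_log
WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn')
  AND superseded_by IS NULL
  AND hidden = 0
  AND json_extract(payload_json, '$.chat_id') = ?
ORDER BY id DESC
LIMIT ?
"""


def read_recent_dialogue_rows(conn: sqlite3.Connection, chat_id: str, limit: int) -> list[tuple]:
    """Return up to `limit` rows for one chat, oldest first.

    Because the chat filter runs inside SQL, LIMIT counts only this chat's
    rows; no Python-side chat_id check is needed afterwards.
    """
    rows = conn.execute(RECENT_DIALOGUE_SQL, (chat_id, limit)).fetchall()
    rows.reverse()  # the query returns newest first; callers want chronological order
    return rows
```

With two interleaved chats of 60 turns each, a `limit=50` call returns exactly 50 rows of the requested chat, which is the property `test_read_recent_dialogue_limit_respects_chat_scope` pins.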

**Test added:** `test_read_recent_dialogue_limit_respects_chat_scope` — seed two chats with 60 turns each; query chat_a with `limit=50`; assert returned rows are exactly 50 chat_a rows (not 50 cross-chat rows that filter down to <50 after Python).

**Commit:** `perf: read_recent_dialogue pushes chat-id filter into SQL (T90.1)`.

#### 90.2 — Lifecycle warning wording tightening

Per T83.4 review nit. Current warning lists "lifecycle transitions from superseded turn are NOT being rolled back". When user regenerates an OLDER turn (T29 supports this), the warning lists intervening-turn transitions that legitimately stand. Tighten wording to "lifecycle transitions at-or-after turn X" so operators reading logs aren't misled.

Change is one log message string. Test asserts the new wording appears.

**Commit:** `chore: clarify regenerate lifecycle warning wording (T90.2)`.

#### 90.3 — Legacy `record_turn_memory` consolidation

Per T84 review nit. The original Phase 1 single-bot `record_turn_memory` function still exists alongside the unified `record_turn_memory_for_present`. Either:

- (a) Remove the legacy function entirely; update any remaining callers to use the unified API.
- (b) Convert it to a thin wrapper for backward compat.

Pick (a) if there are zero remaining callers; (b) if any callers exist. Read the codebase to confirm. The mock-data seed scripts may still use the legacy fn.

**Commit:** `refactor: consolidate legacy record_turn_memory into unified API (T90.3)`.

**TDD process for T90:**

1. Read all 3 affected files + their tests.
2. Implement 90.1 with test; commit.
3. Implement 90.2 with test; commit.
4. Implement 90.3 with test; commit.
5. Run full suite — should be 343 + 3 = 346 (or +2 if 90.3 had no behavioral change).

---
## Wave 2 — Embedding & search services (parallel)

Three new service modules. Fully file-disjoint.

### Task 91: Embedding generation service

**Files:**

- Create: `chat/services/embeddings.py`
- Create: `tests/test_embeddings.py`

**Spec:** Wraps the embedding API call. Signature:

```python
class EmbeddingResult(BaseModel):
    vector: list[float]
    model: str
    dim: int

async def generate_embedding(
    client: LLMClient,  # or a separate embedding-specific client
    *,
    text: str,
    model: str,
    timeout_s: float = 30.0,
) -> EmbeddingResult:
    """Generate an embedding vector for the given text. Falls back to a
    zero-vector with model='fallback' on failure (so callers get a deterministic
    sentinel they can detect and skip indexing)."""
```

**Implementation:** call the embedding endpoint (Featherless OpenAI-compatible `/v1/embeddings`, or a local `sentence-transformers` model). Add a new method `client.embed(text, model)` to `LLMClient` Protocol (and to `MockLLMClient` and `FeatherlessClient`).
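
A minimal sketch of the fallback contract described in the docstring, under a few stated assumptions: `client.embed` is async and returns a plain `list[float]`, a stdlib dataclass stands in for the pydantic `BaseModel` so the example runs without dependencies, and the `fallback_dim` parameter is an addition for illustration (the plan's signature does not include it).

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol

FALLBACK_MODEL = "fallback"


@dataclass
class EmbeddingResult:  # stand-in for the plan's pydantic model
    vector: list[float]
    model: str
    dim: int


class EmbeddingClient(Protocol):  # hypothetical slice of the LLMClient Protocol
    async def embed(self, text: str, model: str) -> list[float]: ...


async def generate_embedding(
    client: EmbeddingClient,
    *,
    text: str,
    model: str,
    timeout_s: float = 30.0,
    fallback_dim: int = 384,  # assumption: matches the migration's dimension
) -> EmbeddingResult:
    """Return the embedding, or a zero-vector sentinel on timeout or error."""
    try:
        vector = await asyncio.wait_for(client.embed(text, model), timeout=timeout_s)
    except Exception:  # includes asyncio timeouts; task cancellation still propagates
        return EmbeddingResult([0.0] * fallback_dim, FALLBACK_MODEL, fallback_dim)
    return EmbeddingResult(vector=list(vector), model=model, dim=len(vector))
```

Callers can then test `result.model == "fallback"` and skip emitting `embedding_indexed`, so a zero vector never reaches the index.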

**Embedding model choice:**

Default to a small CPU-friendly model accessible through the existing Featherless setup:

- If Featherless has `BAAI/bge-small-en-v1.5` or similar 384-dim model: use that.
- If not: fall back to local `sentence-transformers/all-MiniLM-L6-v2` (384-dim, runs CPU). Add `sentence-transformers` to `pyproject.toml`.
- Document choice in CLAUDE.md (T102 docs sweep).

The 384 dim is hardcoded in T88's migration. If a different model with different dim is chosen, update T88's schema accordingly BEFORE T88 dispatches.

**Tests:** 3 minimum.

1. `test_generate_embedding_returns_vector_of_correct_dim`: mock embedding response with a 384-element vector; assert returned `vector` length is 384.
2. `test_generate_embedding_returns_correct_model_metadata`: assert `result.model` matches the input.
3. `test_generate_embedding_falls_back_on_failure`: mock the client to raise; assert the result is a 384-element zero vector with `model="fallback"`.

**Commit:** `feat: embedding generation service (T91)`.

---
### Task 92: Vector search service via sqlite-vec

**Files:**

- Create: `chat/services/vector_search.py`
- Create: `tests/test_vector_search.py`

**Spec:** Wraps sqlite-vec's `MATCH` syntax for cosine-similarity search over the `embeddings` virtual table. Witness-filter aware (joins through `memories` table for the witness check).

```python
def vector_search(
    conn,
    *,
    owner_id: str,
    witness_role: str,  # "you" | "host" | "guest"
    query_vector: list[float],
    k: int = 4,
) -> list[dict]:
    """Return top-K memories by cosine similarity to query_vector,
    witness-filtered for the requesting bot's POV. Returns same row
    shape as state.memory.search_memories for combined-ranking
    compatibility."""
```

SQL pattern (sqlite-vec):

```sql
SELECT m.id, m.text, m.pov_summary, m.significance, e.distance
FROM embeddings e
JOIN memories m ON m.id = e.memory_id
WHERE e.embedding MATCH ?
  AND k = ?
  AND m.owner_id = ?
  AND m.witness_<role> = 1
ORDER BY e.distance ASC
LIMIT ?
```

(Adapt to actual sqlite-vec syntax — use `vec0` MATCH semantics. The `witness_<role>` interpolation needs the same allowlist guard pattern as Phase 2.5 T72.3. Also confirm how `k` interacts with the joined filters: if sqlite-vec selects the `k` nearest rows before the `memories` join is applied, over-fetch and trim so the owner and witness filters cannot starve the result set.)
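
Because a column name cannot be bound as a SQL parameter, the `witness_<role>` fragment has to be built by string interpolation, and the allowlist is what keeps that safe. The sketch below is a guess at the shape of such a guard, not a copy of the T72.3 code. The column names `witness_you`, `witness_host`, and `witness_guest` are inferred from the role list in the signature, and both helper names are hypothetical.

```python
# Assumed mapping from role to column; anything not listed here is rejected.
WITNESS_COLUMNS = {
    "you": "witness_you",
    "host": "witness_host",
    "guest": "witness_guest",
}


def witness_column(role: str) -> str:
    """Map a witness role to its column name, refusing unknown input."""
    try:
        return WITNESS_COLUMNS[role]
    except KeyError:
        raise ValueError(f"unknown witness role: {role!r}") from None


def build_vector_search_sql(role: str) -> str:
    """Interpolate only the allowlisted column; every value stays a bound parameter."""
    column = witness_column(role)
    return f"""
        SELECT m.id, m.text, m.pov_summary, m.significance, e.distance
        FROM embeddings e
        JOIN memories m ON m.id = e.memory_id
        WHERE e.embedding MATCH ?
          AND k = ?
          AND m.owner_id = ?
          AND m.{column} = 1
        ORDER BY e.distance ASC
        LIMIT ?
    """
```

Looking the column up in a fixed dictionary means caller-supplied text never reaches the SQL string, even if `witness_role` arrives from a request parameter.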

**Tests:** 3 minimum.

1. `test_vector_search_returns_nearest_neighbors`: index 5 memories with synthetic vectors; query for nearest 3; assert correct order.
2. `test_vector_search_respects_witness_filter`: index a memory with witness `[1, 1, 0]`; query with `witness_role="guest"`; assert empty result.
3. `test_vector_search_respects_owner_filter`: index memories for two owners; assert query for owner_a doesn't return owner_b's memories.

**Commit:** `feat: vector search service via sqlite-vec (T92)`.

---
### Task 93: Cross-chat search service

**Files:**

- Create: `chat/services/cross_chat_search.py`
- Create: `tests/test_cross_chat_search.py`

**Spec:** FTS5-based search across ALL chats and all owners (admin-style search; no witness filter). For "where did I last see this person mention X?" queries.

```python
def search_all_memories(
    conn,
    *,
    query: str,
    k: int = 20,
) -> list[dict]:
    """Search FTS across all owners and chats. Returns rows with
    {memory_id, owner_id, chat_id, text, pov_summary, scene_id,
    significance, ts}. Sorted by FTS rank."""
```

This is intentionally NOT witness-filtered — it's a power-user search surface. The UI (T100) prompts the user to acknowledge they're seeing memories across POVs.

**Tests:** 3 minimum.

1. `test_search_all_memories_returns_matches_across_owners`: seed 2 owners with overlapping keyword; search; assert both owners' matches appear.
2. `test_search_all_memories_orders_by_fts_rank`: seed memories with varying FTS-match strength; assert order.
3. `test_search_all_memories_respects_k_limit`.

**Commit:** `feat: cross-chat search service (FTS5 over all owners) (T93)`.

---
## Wave 3 — Branching + delete services (parallel)

Two new service modules. Fully file-disjoint.

### Task 94: branch_from_event service

**Files:**

- Create: `chat/services/branching.py`
- Create: `tests/test_branching.py`

**Spec:**

```python
def branch_from_event(
    conn,
    *,
    name: str,
    origin_event_id: int,
    chat_id: str | None = None,
) -> int:
    """Create a new named branch forking from origin_event_id.
    Emits a branch_created event. Returns the new branch's row id.
    Raises ValueError if name already exists."""

def switch_active_branch(conn, *, name: str) -> None:
    """Make the named branch active. Emits branch_switched. Subsequent
    event reads should consult is_active to filter."""

def list_branches_with_metadata(conn, chat_id: str | None = None) -> list[dict]:
    """List branches with: name, origin_event_id, head_event_id, is_active,
    event_count (number of events between origin and head, inclusive),
    created_at."""
```

Tests cover: basic create, duplicate-name raises, switch updates `is_active` exclusively, list returns metadata.

**Commit:** `feat: branching service (T94)`.

---
|
||||||
|
|
||||||
|
### Task 95: Delete-impact computation service
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Create: `chat/services/delete_impact.py`
|
||||||
|
- Create: `tests/test_delete_impact.py`
|
||||||
|
|
||||||
|
**Spec:** Computes the cascade impact of deleting a single event_log row (or a turn group: user_turn + assistant_turn + interjection if any). Returns a structured `ImpactReport` for the UI to render.
|
||||||
|
|
||||||
|
```python
|
||||||
|
class DeletedItem(BaseModel):
|
||||||
|
kind: str # "memory" | "edge_update" | "scene_close" | etc.
|
||||||
|
description: str # human-readable
|
||||||
|
target_id: int | str | None
|
||||||
|
|
||||||
|
class ImpactReport(BaseModel):
|
||||||
|
target_event_id: int
|
||||||
|
cascading: list[DeletedItem]
|
||||||
|
notes: list[str] # warnings, e.g. "this turn opened scene_X which has 3 subsequent turns"
|
||||||
|
|
||||||
|
def compute_delete_impact(conn, *, target_event_id: int) -> ImpactReport:
|
||||||
|
"""Walk the event log forward from target_event_id and identify
|
||||||
|
everything that depends on this event: child memory_written events,
|
||||||
|
edge_update events with this turn as source, scene_closed events
|
||||||
|
triggered by this turn, etc. Also identify subsequent turns that
|
||||||
|
REFERENCE this event (regenerated_from chains, etc.).
|
||||||
|
|
||||||
|
Does NOT mutate the database. Pure computation for preview."""
|
||||||
|
```

The actual delete (truncate + supersede) is the existing rewind path from Phase 1 T31. T95 just builds the preview.

**Tests:** 4 minimum.

1. `test_impact_for_simple_turn_lists_memory_and_edges`: seed a chat with a turn that wrote 1 memory + 2 edge_updates. Compute impact. Assert the 3 items appear in `cascading`.
2. `test_impact_for_scene_opening_turn_warns_about_subsequent_turns`: seed a turn that opened a scene + 5 subsequent turns. Assert `notes` mentions the dependency.
3. `test_impact_for_regenerated_turn_lists_supersede_chain`: seed a turn that's been regenerated (has `superseded_by`). Compute impact for the original. Assert the chain appears.
4. `test_impact_does_not_mutate_database`: snapshot event_log before + after; assert byte-identical.
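
The non-mutation check in test 4 could be written as a row-level comparison. This is a minimal sketch, not the project's test helper: `snapshot_event_log` and `assert_read_only` are hypothetical names, and the `id`, `kind`, `payload` columns are assumed for illustration rather than taken from the real schema.

```python
import sqlite3


def snapshot_event_log(conn: sqlite3.Connection) -> list[tuple]:
    """Return every event_log row in id order.

    Column names are assumptions; two snapshots compare equal only
    when the table content is unchanged.
    """
    return conn.execute(
        "SELECT id, kind, payload FROM event_log ORDER BY id"
    ).fetchall()


def assert_read_only(conn: sqlite3.Connection, fn) -> None:
    """Run fn(conn) and fail if it modified event_log or wrote any row."""
    before_rows = snapshot_event_log(conn)
    before_changes = conn.total_changes  # counts rows inserted/updated/deleted
    fn(conn)
    assert snapshot_event_log(conn) == before_rows
    assert conn.total_changes == before_changes
```

In the real test, `fn` would wrap `compute_delete_impact(conn, target_event_id=...)`. Checking `total_changes` as well as the rows also catches writes to tables other than `event_log`.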

**Commit:** `feat: delete-impact computation service (T95)`.

---

## Wave 4 — Combined retrieval ranking (single)

### Task 96: Combined FTS + vector retrieval ranking

**Files:**

- Modify: `chat/state/memory.py` — extend `search_memories` to optionally include vector hits
- Modify: `tests/test_memory_search.py` — add 4 tests

**Spec:**

`search_memories` currently does FTS5 + Python-side significance/recency re-rank. Phase 4 adds:

- An optional `query_vector: list[float] | None = None` kwarg.
- When `query_vector` is provided, run `vector_search` (T92) for top-K-vector candidates.
- Merge with FTS top-K candidates via reciprocal-rank fusion (RRF) or a simpler sum-of-ranks scheme — implementer's choice. Document the merge formula.
- Final result is top-K from the fused set, with the existing significance + recency boosts applied as a final pass.

When `query_vector` is None: existing behavior unchanged. Phase 1/2/3 callers that don't pass `query_vector` see no change.
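
If RRF is the chosen merge, the fusion step could look like the sketch below. `rrf_fuse` is a hypothetical helper, not an existing function in `chat/state/memory.py`, and `k=60` is only the constant commonly used for RRF, not a value the plan specifies. The significance and recency boosts would still run afterwards on the fused list.

```python
from collections import defaultdict


def rrf_fuse(
    fts_ids: list[int],
    vector_ids: list[int],
    *,
    k: int = 60,
    top_k: int = 10,
) -> list[int]:
    """Reciprocal-rank fusion over two ranked id lists.

    score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
    Ids present in both lists accumulate both terms.
    """
    scores: dict[int, float] = defaultdict(float)
    for ranked in (fts_ids, vector_ids):
        for rank, memory_id in enumerate(ranked, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    ordered = sorted(scores, key=lambda mid: (-scores[mid], mid))
    return ordered[:top_k]
```

An empty `vector_ids` list degrades to the FTS order, which matches the empty-vector-results case the tests call for.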

**Implementation note:** the embedding for the query (the speaker's recent context) must be generated by the caller (Wave 5 T97 wires the prompt-assembly pipeline to call `generate_embedding` on the dialogue tail). T96 only handles the search side — assumes the vector is pre-computed.

**Tests:** 4 added.

1. `test_search_memories_without_query_vector_uses_fts_only`: regression — call without `query_vector`; assert the existing FTS+rerank behavior.
2. `test_search_memories_with_query_vector_includes_vector_hits`: index 5 memories where 1 is FTS-only-matching, 1 is vector-only-matching, 3 are unrelated. Pass both `query=...` and `query_vector=...`. Assert both the FTS hit and the vector hit appear in results.
3. `test_search_memories_fusion_significance_bias_still_applies`: confirm the existing significance bias rerank still works on top of fused results.
4. `test_search_memories_fusion_handles_empty_vector_results`: pass a query vector when no embeddings are indexed; assert FTS-only results still come back.

**Commit:** `feat: combined FTS + vector retrieval ranking (T96)`.

---

## Wave 5 — Memory write hook + backfill (single)

### Task 97: Embedding generation hook + backfill script

**Files:**

- Modify: `chat/services/memory_write.py` — after each `memory_written` event, enqueue a background embedding job
- Create: `chat/services/embedding_worker.py` — async worker that consumes the queue and emits `embedding_indexed` events
- Create: `scripts/backfill_embeddings.py` — one-time script that walks all existing memories and embeds them
- Modify: `chat/app.py` — wire the embedding worker into the lifespan startup
- Modify: `tests/test_memory_write.py` — add 2 tests for the enqueue hook
- Create: `tests/test_embedding_worker.py` — 3 tests for the worker drain logic

**Spec:**

After each successful `memory_written` event, enqueue an embedding job. The worker dequeues and:

1. Reads the memory text (via `get_memory(conn, memory_id)`).
2. Calls `generate_embedding(client, text=memory.text, model=settings.embedding_model)`.
3. Appends `embedding_indexed` event with the result. (Skip if `result.model == "fallback"` — leave the memory un-indexed; will retry later via backfill.)

The worker pattern mirrors Phase 1's `chat/services/significance.py` SignificanceWorker. Reuse its queue + lifecycle pattern.
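
The drain loop described above could be sketched as follows. It does not reproduce the `SignificanceWorker` pattern, whose code is not shown here: `EmbeddingResult`, `drain`, and the `embed` / `emit` callables are hypothetical stand-ins for `generate_embedding` and the event-log append.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class EmbeddingResult:
    """Assumed result shape: the plan only states that `model` can be "fallback"."""
    model: str
    vector: list[float]


async def drain(queue: asyncio.Queue, embed, emit) -> int:
    """Process queued memory ids one at a time until the queue is empty.

    Returns the number of embedding_indexed events emitted.
    """
    emitted = 0
    while not queue.empty():
        memory_id = queue.get_nowait()
        result = await embed(memory_id)  # sequential: one embed call in flight
        if result.model != "fallback":
            emit({"kind": "embedding_indexed", "memory_id": memory_id, "model": result.model})
            emitted += 1
        # Fallback results are skipped; the memory stays un-indexed for backfill.
        queue.task_done()
    return emitted
```

Awaiting each `embed` call before dequeuing the next job keeps the worker serial, which is the behavior `test_worker_handles_concurrent_jobs_serially` is meant to pin.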

**Backfill script:**

```bash
.venv/bin/python scripts/backfill_embeddings.py [--limit N] [--dry-run]
```

Walks all memories where no `embeddings_meta` row exists. For each, generates an embedding and emits `embedding_indexed`. Useful for the initial migration after Phase 4 lands AND for periodic re-runs if an embedding model changes.

**Tests:**

`tests/test_memory_write.py`:
1. `test_record_turn_memory_enqueues_embedding_job`: monkeypatch the worker's enqueue method; record_turn_memory_for_present; assert the worker received a job per memory.

`tests/test_embedding_worker.py`:
1. `test_worker_drains_jobs_and_emits_indexed_events`: enqueue 3 jobs with mock embeddings; run worker; assert 3 `embedding_indexed` events landed.
2. `test_worker_skips_fallback_results`: mock the embedding service to return a fallback result; assert NO `embedding_indexed` event landed for that job.
3. `test_worker_handles_concurrent_jobs_serially`: pin the Featherless 2-conn cap behavior (worker calls embed sequentially under the existing semaphore).

**Commit (split):**

- `feat: embedding worker drains queue and emits embedding_indexed events (T97.1)`
- `feat: memory_write enqueues embedding job after each memory_written (T97.2)`
- `feat: backfill_embeddings script for existing memories (T97.3)`

**Verification gates:**

- All Phase 1/2/3/3.5 memory tests still pass (regression critical).
- New tests pass.
- Manual smoke: run `scripts/backfill_embeddings.py --dry-run` against a seeded DB and verify expected count.

---

## Wave 6 — Drawer Phase 4 bundle (single task)

### Task 98: Drawer Phase 4 features

**Files:**

- Modify: `chat/web/drawer.py` (add many new POST routes and GET extensions)
- Modify: `chat/templates/_drawer.html` (add 5 new sections)
- Create: `tests/test_drawer_phase4.py`

**Spec:** Drawer affordances for 5 Phase 4 features. Single task by hot-file constraint; split into 5 commits internally.

#### 98.1 — Branching UI

GET drawer extension: `list_branches_with_metadata(conn)` → render in a "Branches" section (active branch highlighted + count of events).

POST routes:
- `/drawer/branch/create` — form `{name, origin_event_id}` → `branch_from_event` service.
- `/drawer/branch/switch` — form `{name}` → `switch_active_branch`.
- `/drawer/branch/from-turn/{event_id}` — convenience: branch from a specific turn (used by per-turn UI affordance).

#### 98.2 — Significance review panel

GET extension: significance distribution per chat (`SELECT significance, COUNT(*) GROUP BY significance`) → render histogram.

POST route:
- `/drawer/memory/significance/{memory_id}` — form `{new_value}` (already supported via T22 `manual_edit` `target_kind=memory_significance`); just add the UI form.

Bulk re-rate is a Phase 4.5 polish — not in scope here. Just per-memory edit + distribution display.
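
The distribution query is written in shorthand above (no `FROM` clause). A fuller version might look like this sketch, assuming a `memories` table with `chat_id` and `significance` columns; the table name, the `chat_id` filter, and `significance_histogram` itself are illustrative assumptions.

```python
import sqlite3


def significance_histogram(conn: sqlite3.Connection, chat_id: int) -> dict[int, int]:
    """Map each significance value to the number of memories in one chat."""
    rows = conn.execute(
        "SELECT significance, COUNT(*) FROM memories "
        "WHERE chat_id = ? GROUP BY significance ORDER BY significance",
        (chat_id,),
    ).fetchall()
    return {significance: count for significance, count in rows}
```

The drawer template can iterate over the returned mapping to draw one bar per significance value.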

#### 98.3 — Hide-from-view toggle

POST route:
- `/drawer/turn/hide/{event_id}` — form `{hidden: bool}` → emits a `manual_edit` with `target_kind="turn_hidden"`.

NEW `manual_edit` projector branch for `turn_hidden`: sets `event_log.hidden = ?` for the target event. Reuses the existing `hidden` column.

UI affordance: per-turn checkbox in the chat surface or drawer (per-turn list with hide toggle).
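
The projector branch and the read-side filter it relies on could be sketched as below. `apply_turn_hidden` and `visible_turn_ids` are hypothetical names, and an integer `id` primary key on `event_log` is assumed; only the `hidden` column and the `hidden = 0` filter come from the plan.

```python
import sqlite3


def apply_turn_hidden(conn: sqlite3.Connection, *, event_id: int, hidden: bool) -> None:
    """Projector step for manual_edit target_kind="turn_hidden"."""
    conn.execute(
        "UPDATE event_log SET hidden = ? WHERE id = ?",
        (1 if hidden else 0, event_id),
    )


def visible_turn_ids(conn: sqlite3.Connection) -> list[int]:
    """Read side: the same hidden = 0 filter that dialogue readers apply."""
    return [
        row[0]
        for row in conn.execute("SELECT id FROM event_log WHERE hidden = 0 ORDER BY id")
    ]
```

Because the toggle only flips a flag, unhiding is the same call with `hidden=False`, and no rows are removed.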

#### 98.4 — Surgical delete with cascade preview

GET extension:
- `/drawer/turn/delete-preview/{event_id}` → returns the `ImpactReport` (T95) rendered as a modal.

POST route:
- `/drawer/turn/delete/{event_id}` — invokes the rewind-and-truncate path (Phase 1 T31's `rewind_to_turn`) restricted to the target turn group.

Important: this reuses the existing pre-rewind snapshot path so the action is undoable.

#### 98.5 — Remaining v1 edits

Audit: are any v1 fields STILL not editable from the drawer? Phase 2.5 T72.1 added edge_trust/edge_summary/memory_pov_summary/edge_knowledge_facts. T72.3 added witness flags. Anything left?

Likely candidates: scene `narrative_anchor`, scene `weather`, container `properties` JSON. Add edit forms for any that surface during the audit. If none, this sub-task is a no-op.

**Tests:** 8+ in `tests/test_drawer_phase4.py` (one per sub-feature × happy path; plus 1 for the cascade-preview rendering).

**Commits (5):**

- `feat: drawer branching UI (T98.1)`
- `feat: drawer significance review panel (T98.2)`
- `feat: drawer hide-from-view toggle + manual_edit turn_hidden branch (T98.3)`
- `feat: drawer surgical delete with cascade preview (T98.4)`
- `feat: drawer remaining v1 field edits (T98.5)` (or "no-op audit" if nothing left)

---

## Wave 7 — Snapshot + cross-chat search UX (parallel)

### Task 99: Snapshot UX

**Files:**

- Create: `chat/web/snapshots.py` (new route module)
- Create: `chat/templates/snapshots.html` (snapshot list page)
- Modify: `chat/templates/layout.html` (add "Snapshots" nav link)
- Create: `tests/test_snapshot_ux.py`

**Spec:** Surface the existing snapshot infrastructure (Phase 1 T20 wrote snapshots; Phase 4 makes them visible).

GET `/snapshots` — list all snapshots (periodic + pre-rewind) with metadata: kind, created_at, event_log_size, file_size_bytes.

POST `/snapshots/take` — manually trigger a snapshot now.

POST `/snapshots/restore/{snapshot_id}` — restore from snapshot (with hard confirmation).

GET `/snapshots/{snapshot_id}/preview` — show what's in the snapshot vs. current state.

**Tests:** 4 minimum (list, take, restore, preview).

**Commit:** `feat: snapshot UX (manual trigger, list, restore) (T99)`.

---

### Task 100: Cross-chat search UX

**Files:**

- Create: `chat/web/search.py` (new route module)
- Create: `chat/templates/search.html` (search results page)
- Modify: `chat/templates/layout.html` (add top-bar search input)
- Create: `tests/test_search_ux.py`

**Spec:** Top-bar search box submits to `/search?q=...`. Results page shows up to 50 matches across all chats and all owners (uses T93's `search_all_memories`). Each result shows: chat name, owner bot name, scene context, memory text excerpt with FTS highlight, "Open chat at this turn" link.

**Tests:** 3 minimum.
1. Search returns results from multiple chats.
2. Empty query returns empty result set.
3. Result links navigate to the right chat anchor.

**Commit:** `feat: cross-chat search UX (top-bar input + results page) (T100)`.

---

## Wave 8 — Polish (parallel)

### Task 101: Cross-feature integration tests

**Files:**

- Create: `tests/test_phase4_integration.py`

**Spec:** End-to-end multi-feature flows. 5 tests minimum.

1. **Vector retrieval feedback loop**: write a memory → embedding worker indexes it → search retrieves it via vector path.
2. **Branch + diverge**: create branch B from turn 10 → switch to B → play 3 new turns → switch back to main → assert main's turn 11+ are still intact.
3. **Surgical delete**: compute impact for a turn → confirm → assert event log truncated correctly + pre-rewind snapshot saved.
4. **Hide + retrieval**: hide a turn → assert it doesn't appear in `read_recent_dialogue` (existing `hidden = 0` filter) → unhide → assert it reappears.
5. **Cross-chat search**: write memories in 3 chats → search for keyword present in all 3 → assert all 3 appear in results.

**Commit:** `test: phase 4 cross-feature integration coverage (T101)`.

---

### Task 102: Phase 4 documentation update

**Files:**

- Modify: `CLAUDE.md` (add "Phase 4 status" section; update behavioral defaults; add "Phase 4.5 / 5 backlog" with carry-overs)
- Modify: `docs/plans/2026-04-26-v1-requirements-design.md` (annotate §13 Phase 4 as **Status: shipped 2026-04-27**)

**Spec:**

Mirror the Phase 3 / 3.5 status sections. Document:

- **Vector retrieval**: sqlite-vec virtual table, embedding worker async pipeline, combined FTS + vector ranking via the merge formula chosen in T96 (RRF or sum-of-ranks).
- **Branching**: forks the event log; UI in drawer; `is_active` flag plus orchestrator filter (caveat — see backlog if filter not yet wired into all readers).
- **Drawer-edit on every field**: branching, significance review, hide-from-view, surgical delete with preview, plus any audit findings.
- **Backup tooling**: snapshots panel surfaces existing infra.
- **Significance review UI**: distribution + per-memory edit.
- **Surgical delete + cascade preview**: piggybacks on rewind path; impact report from T95.
- **Hide-from-view soft delete**: `manual_edit` `turn_hidden` branch.
- **Cross-chat search**: top-bar + results page over T93's service.

**Phase 4.5 / 5 backlog candidates** (reflect any discovered during execution):

- Branching read-side filter — if T89's `is_active` isn't yet consulted by every event reader, this is the work to do.
- Bulk significance re-rate (per T98.2 deferral).
- Snapshot retention policy UI controls (per Phase 1 T19 deferred).
- Auto-pin override UI (per Phase 2 design).
- Embedding model swap migration tooling (when changing embedding model, need to re-embed everything).
- Vector index optimization (HNSW vs flat — Phase 5 if needed).
- Carry-overs that remained deferred from Phase 3.6: scene-close-on-cancel UX revisit, canned-queue brittleness fixture builder, full lifecycle rollback in regenerate.

**Commit:** `docs: phase 4 status, behavioral defaults, deferred items (T102)`.

---

## Wrap-up

After Wave 8 lands:

1. **Run full suite** on `phase-4`: should be ~390+ tests passing (343 from Phase 3.5 + ~50 new).
2. **Manual smoke** (recommended before opening the PR):
   - Run `scripts/backfill_embeddings.py` against a seeded DB to verify vector indexing works.
   - Search for a phrase that's substring-distinct but semantically similar to a memory; verify vector path returns it (FTS would miss).
   - Create a branch from an old turn; switch; play a few turns; switch back.
   - Trigger surgical delete on a turn; verify the impact preview matches what actually gets removed.
   - Hide a turn; verify it disappears from the chat surface; unhide.
   - Use top-bar search to find a phrase; verify cross-chat results appear.
   - Click the "Snapshots" nav link; trigger a manual snapshot; verify it appears.
3. **Push `phase-4`** to gitea.
4. **Open PR** `phase-4 → main`.

---

## Notes for the controller running this plan

- **External dependency**: `sqlite-vec` (or `sqlite-vss`) MUST be added to `pyproject.toml` and installed BEFORE Wave 1 dispatches. The migration in T88 expects the extension to be loadable.
- **Embedding model choice**: pin in T91 spec before dispatch. The 384 dim is hardcoded in T88's migration; if a different dim is used, update T88 first.
- **After each parallel wave**, run a code-review subagent. Combined spec+quality acceptable for trivial tasks (T90 carry-overs); separate spec + quality reviewers for vector-retrieval and integration tasks (T91, T96, T97, T98, T101) — surface area is larger.
- **Don't dispatch Wave 5 until Wave 4 merged green.** T97 (memory_write enqueue) calls into the embedding-aware worker; the worker uses T91's `generate_embedding`. Both must be merged into `phase-4` first.
- **Don't dispatch Wave 6 until Wave 5 merged green.** T98 (drawer) wires UI affordances over services from earlier waves.
- **Token-spend rough estimate**: Phase 4 should be ~70-80% the size of Phase 3 (similar scope, larger per-task because vector + branching are non-trivial). Per-task spend similar to Phase 3's larger tasks (T59, T64).
- **DO NOT break existing v1/v2/v3/v3.5 surface contracts.** Every test file that was green at the start of Phase 4 must stay green at the end. The cross-feature integration tests from Phase 3 (`tests/test_phase3_integration.py`) are particularly load-bearing.
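
The wave-ordering rules above can be checked mechanically against the task tracker before dispatch. The sketch below assumes tasks shaped like the tracker entries (`id`, `wave`, optional `blockedBy`); `wave_order_violations` is a hypothetical helper, not part of the plan's tooling.

```python
def wave_order_violations(tasks: list[dict]) -> list[tuple[int, int]]:
    """Return (task_id, blocker_id) pairs where a blocker is not in an earlier wave."""
    wave_of = {task["id"]: task["wave"] for task in tasks}
    violations = []
    for task in tasks:
        for blocker in task.get("blockedBy", []):
            # A task may only depend on work from strictly earlier waves.
            if wave_of[blocker] >= task["wave"]:
                violations.append((task["id"], blocker))
    return violations
```

An empty result means every `blockedBy` entry points at a strictly earlier wave, so each wave can dispatch as soon as the previous one has merged green.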

@@ -0,0 +1,22 @@
{
  "planPath": "docs/plans/2026-04-27-v4-phase4-implementation.md",
  "tasks": [
    {"id": 88, "subject": "T88: embeddings table + projector handlers (sqlite-vec)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 89, "subject": "T89: branches table + projector handlers", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 90, "subject": "T90: phase 3.6 carry-overs (chat-id pushdown + lifecycle wording + legacy fn consolidation)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 91, "subject": "T91: embedding generation service", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [88]},
    {"id": 92, "subject": "T92: vector search service via sqlite-vec", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [88]},
    {"id": 93, "subject": "T93: cross-chat search service (FTS5 over all owners)", "status": "pending", "wave": 2, "parallelGroup": "wave-2"},
    {"id": 94, "subject": "T94: branch_from_event service", "status": "pending", "wave": 3, "parallelGroup": "wave-3", "blockedBy": [89]},
    {"id": 95, "subject": "T95: delete-impact computation service", "status": "pending", "wave": 3, "parallelGroup": "wave-3"},
    {"id": 96, "subject": "T96: combined FTS + vector retrieval ranking in search_memories", "status": "pending", "wave": 4, "parallelGroup": null, "blockedBy": [91, 92]},
    {"id": 97, "subject": "T97: memory_write enqueues embedding job + backfill script", "status": "pending", "wave": 5, "parallelGroup": null, "blockedBy": [91, 96]},
    {"id": 98, "subject": "T98: drawer Phase 4 bundle (branching + sig review + hide + surgical delete + remaining edits)", "status": "pending", "wave": 6, "parallelGroup": null, "blockedBy": [94, 95, 97]},
    {"id": 99, "subject": "T99: snapshot UX (manual trigger + list + restore + preview)", "status": "pending", "wave": 7, "parallelGroup": "wave-7"},
    {"id": 100, "subject": "T100: cross-chat search UX (top-bar + results page)", "status": "pending", "wave": 7, "parallelGroup": "wave-7", "blockedBy": [93]},
    {"id": 101, "subject": "T101: cross-feature integration tests (vector × branching × delete × snapshot × search)", "status": "pending", "wave": 8, "parallelGroup": "wave-8", "blockedBy": [98, 99, 100]},
    {"id": 102, "subject": "T102: Phase 4 documentation update", "status": "pending", "wave": 8, "parallelGroup": "wave-8", "blockedBy": [98, 99, 100]}
  ],
  "lastUpdated": "2026-04-27T00:00:00Z",
  "notes": "15 tasks across 8 waves. Adds vector retrieval (sqlite-vec), branching UI, drawer-edit on every field, backup tooling, significance review UI, surgical delete with cascade preview, hide-from-view, and cross-chat search. Phase 3.6 carry-overs (3 small fixes) bundled into T90. External dependency: sqlite-vec must be installed BEFORE Wave 1 dispatch. Embedding model choice (default: 384-dim small model) pinned in T91 spec before dispatch — schema 0012 hardcodes 384 dim. Two new schema migrations (0012 embeddings, 0013 branches), final schema version 13. Uses task ids T88-T102."
}