docs: phase 4 status, behavioral defaults, deferred items (T102)

This commit is contained in:
Joseph Doherty
2026-04-27 03:56:45 -04:00
parent 3b4c7b9cef
commit b6119879e5
2 changed files with 87 additions and 0 deletions
+85
View File
@@ -287,3 +287,88 @@ New follow-ups discovered during Phase 3.5 reviews and execution. None are block
- **Scene-close-on-cancel UX revisit** (Phase 2.5 carry-over): T74.3 pinned the existing behavior; revisit if real play-testing surfaces a regression.
- **Cross-feature canned-queue brittleness**: meanwhile-scene close test required a canned response for T65's digest call after T64+T65 merge. Future close-path additions will keep extending the queue. Consider a structured fixture builder rather than positional canned arrays. NOT addressed in Phase 3.5.
- **Lifecycle-transition rollback in regenerate**: T83.4 added a warning log; actual rollback (with proper schema linkage from lifecycle event back to producing turn) is Phase 4 work.
## Phase 4 status
Phase 4 polish shipped end-to-end across 15 tasks (T88T102). Vector retrieval is functional via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions). Branching is data-model + drawer UI. Surgical delete with cascade preview, hide-from-view soft delete, significance review panel, snapshot UX, and cross-chat search all surface from the drawer or top-bar. Test count grew from 343 (Phase 3.5) to ~413 (+70 new tests).
- **Wave 1 — schema + Phase 3.6 carry-overs (parallel)**:
- **T88** `embeddings` table + projector handlers (pure-Python cosine, JSON-blob storage; sqlite-vec deferred).
- **T89** `branches` table + handlers (main bootstrapped; `is_active` flag; partial unique index).
- **T90** Phase 3.6 carry-overs trio — `read_recent_dialogue` chat-id SQL pushdown, lifecycle warning wording tightening, legacy `record_turn_memory` removed.
- **Wave 2 — services (parallel)**:
- **T91** embedding generation service (Phase 4 ships a deterministic SHA-256-derived pseudo-embedding; real model swap is Phase 4.5+).
- **T92** vector search service via pure-Python cosine.
- **T93** cross-chat search service (FTS5 across all owners, no witness filter — admin-style).
- **Wave 3 — services (parallel)**:
- **T94** branching service (`branch_from_event`, `switch_active_branch`, `list_branches_with_metadata`).
- **T95** delete-impact computation service (cascade preview, no DB mutation).
- **Wave 4 — combined retrieval (single)**:
- **T96** combined FTS + vector retrieval ranking via reciprocal-rank fusion (RRF, `RRF_CONST=60`); existing significance/recency boost applied as final pass.
- **Wave 5 — memory write hook + backfill (single)**:
- **T97** `EmbeddingWorker` drains queue and emits `embedding_indexed` events; `memory_write` enqueues per `memory_written`; `backfill_embeddings` script for existing memories; ALL 4 production call sites wired (turns, regenerate, meanwhile, drawer).
- **Wave 6 — drawer Phase 4 bundle (single, 5 sub-features)**:
- **T98.1** branching UI (Branches panel + 3 routes).
- **T98.2** significance review panel (distribution bar chart + per-memory edit).
- **T98.3** hide-from-view toggle + `turn_hidden` `manual_edit` branch.
- **T98.4** surgical delete with cascade preview (reuses existing rewind path; pre-rewind snapshot preserved).
- **T98.5** remaining v1 edits — `narrative_anchor` + weather drawer affordances + 2 new `manual_edit` branches.
- **Wave 7 — UX surfaces (parallel)**:
- **T99** snapshot UX (manual trigger, list, restore with hard-confirm, preview).
- **T100** cross-chat search UX (top-bar form + results page).
- **Wave 8 — polish (parallel)**:
- **T101** cross-feature integration tests (5 multi-feature scenarios).
- **T102** documentation (this section).
### Phase 4.5 / 5 backlog
New follow-ups discovered during Phase 4 reviews and execution. None are blocking; pick up at any time.
#### From T88 review
- **`embeddings` FK lacks `ON DELETE CASCADE`**: deindex events are the only deletion path; if memories ever get deleted directly (raw SQL), embedding rows orphan. Defensible since projector model uses explicit deindex events, but worth a comment or `ON DELETE CASCADE` addition.
#### From T89 review
- **`list_branches(chat_id=...)` filter leaks global branches** (`chat_id IS NULL`) into every chat scope. Intentional? Document.
- **Branch-switch to nonexistent silently leaves zero active branches** — log a warning when this would happen.
#### From T91 review
- **Real embedding model swap**: Phase 4 ships pseudo-embedding (deterministic SHA-256 hash). Phase 4.5+ should swap to a real model (Featherless `bge-small-en-v1.5` if available; or local `sentence-transformers/all-MiniLM-L6-v2`). The 384-dim is hardcoded in `0012_embeddings.sql`; if dim changes, migrate first.
- **`timeout_s` unused on pseudo path** — fine, but log when non-default model falls through to fallback so misconfigured callers don't silently degrade.
#### From T96 review
- **Duplicate `MAX(id)` lookup** between `_composite_rerank` and the fused-path tail — DRY follow-up.
- **`fts_rank=None` for vector-only rows** — document downstream contract.
#### From T98 review
- **`event_id <= 0` guard in `delete_turn`** — currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: 400`.
- **`html.escape()` on `compute_delete_impact` output rendered into the modal** — defense in depth (currently model-controlled strings, but if event payload fields ever appear in descriptions, autoescape needed).
- **Extract delete-impact modal HTML to a Jinja partial** — testability + autoescape inheritance.
#### From T99 review
- **Hoist `datetime`/`timezone` imports to module level** in `chat/web/snapshots.py`.
- **`kind` defaulting in restore/preview** — reject missing `kind` rather than silent 404.
- **`created_at` from file mtime** vs filename-encoded timestamp — small drift if files copied; document.
#### From T100 review
- **Hardcoded `k=50`** — extract to module constant.
- **N+1 lookups (`get_bot`/`get_chat`/`get_scene` per row)** — fine at `k=50`, revisit if `k` grows.
- **FTS highlighting via `snippet()`** — Phase 4 skipped this; UX nice-to-have.
- **Result links chat-level only** — `memories` table has no `event_id` column; deep-linking to specific turn requires schema addition.
#### Deferred items
- **sqlite-vec swap** when host Python supports `enable_load_extension`.
- **Real embedding model** with proper semantic similarity.
- **Branching read-side filter**: T89 ships data-model + UI but event readers don't yet consult `is_active`. Each branch is metadata-only labeled ranges. Consult-on-read is Phase 4.5+ work.
- **Bulk significance re-rate** in drawer (T98.2 deferred — only per-memory edit shipped).
- **Vector index optimization** (HNSW) — only relevant if memory counts grow past pure-Python feasibility.
- **`scene-close-on-cancel` UX revisit** (Phase 2.5 carry-over).
- **Cross-feature canned-queue brittleness fixture builder** (Phase 3 carry-over).
- **Full lifecycle-rollback in regenerate** — Phase 3.5 T83.4 shipped a warning log; proper rollback needs schema-level back-references (`triggered_by_assistant_turn_id` payload field).
@@ -520,6 +520,8 @@ Written per witness when a scene closes. Different details, different interpreta
### Phase 4 — polish
**Status: shipped 2026-04-27** (T88T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
- Vector retrieval (sqlite-vss or sqlite-vec).
- Branching UI.
- Drawer-edit on every field.