From b6119879e593bc5bdd8873e2dddb24aa571037bc Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Mon, 27 Apr 2026 03:56:45 -0400
Subject: [PATCH] docs: phase 4 status, behavioral defaults, deferred items (T102)

---
 CLAUDE.md                                | 85 +++++++++++++++++++
 .../2026-04-26-v1-requirements-design.md |  2 +
 2 files changed, 87 insertions(+)

diff --git a/CLAUDE.md b/CLAUDE.md
index 8d80cd5..ab0a5dc 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -287,3 +287,88 @@ New follow-ups discovered during Phase 3.5 reviews and execution. None are block
 - **Scene-close-on-cancel UX revisit** (Phase 2.5 carry-over): T74.3 pinned the existing behavior; revisit if real play-testing surfaces a regression.
 - **Cross-feature canned-queue brittleness**: meanwhile-scene close test required a canned response for T65's digest call after T64+T65 merge. Future close-path additions will keep extending the queue. Consider a structured fixture builder rather than positional canned arrays. NOT addressed in Phase 3.5.
 - **Lifecycle-transition rollback in regenerate**: T83.4 added a warning log; actual rollback (with proper schema linkage from lifecycle event back to producing turn) is Phase 4 work.
+
+## Phase 4 status
+
+Phase 4 polish shipped end-to-end across 15 tasks (T88–T102). Vector retrieval is functional via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions). Branching is data-model + drawer UI. Surgical delete with cascade preview, hide-from-view soft delete, significance review panel, snapshot UX, and cross-chat search all surface from the drawer or top-bar. Test count grew from 343 (Phase 3.5) to ~413 (+70 new tests).
+
+- **Wave 1 — schema + Phase 3.6 carry-overs (parallel)**:
+  - **T88** `embeddings` table + projector handlers (pure-Python cosine, JSON-blob storage; sqlite-vec deferred).
+  - **T89** `branches` table + handlers (main bootstrapped; `is_active` flag; partial unique index).
+  - **T90** Phase 3.6 carry-overs trio — `read_recent_dialogue` chat-id SQL pushdown, lifecycle warning wording tightening, legacy `record_turn_memory` removed.
+- **Wave 2 — services (parallel)**:
+  - **T91** embedding generation service (Phase 4 ships a deterministic SHA-256-derived pseudo-embedding; real model swap is Phase 4.5+).
+  - **T92** vector search service via pure-Python cosine.
+  - **T93** cross-chat search service (FTS5 across all owners, no witness filter — admin-style).
+- **Wave 3 — services (parallel)**:
+  - **T94** branching service (`branch_from_event`, `switch_active_branch`, `list_branches_with_metadata`).
+  - **T95** delete-impact computation service (cascade preview, no DB mutation).
+- **Wave 4 — combined retrieval (single)**:
+  - **T96** combined FTS + vector retrieval ranking via reciprocal-rank fusion (RRF, `RRF_CONST=60`); existing significance/recency boost applied as final pass.
+- **Wave 5 — memory write hook + backfill (single)**:
+  - **T97** `EmbeddingWorker` drains queue and emits `embedding_indexed` events; `memory_write` enqueues per `memory_written`; `backfill_embeddings` script for existing memories; ALL 4 production call sites wired (turns, regenerate, meanwhile, drawer).
+- **Wave 6 — drawer Phase 4 bundle (single, 5 sub-features)**:
+  - **T98.1** branching UI (Branches panel + 3 routes).
+  - **T98.2** significance review panel (distribution bar chart + per-memory edit).
+  - **T98.3** hide-from-view toggle + `turn_hidden` `manual_edit` branch.
+  - **T98.4** surgical delete with cascade preview (reuses existing rewind path; pre-rewind snapshot preserved).
+  - **T98.5** remaining v1 edits — `narrative_anchor` + weather drawer affordances + 2 new `manual_edit` branches.
+- **Wave 7 — UX surfaces (parallel)**:
+  - **T99** snapshot UX (manual trigger, list, restore with hard-confirm, preview).
+  - **T100** cross-chat search UX (top-bar form + results page).
+- **Wave 8 — polish (parallel)**:
+  - **T101** cross-feature integration tests (5 multi-feature scenarios).
+  - **T102** documentation (this section).
+
+### Phase 4.5 / 5 backlog
+
+New follow-ups discovered during Phase 4 reviews and execution. None are blocking; pick up at any time.
+
+#### From T88 review
+
+- **`embeddings` FK lacks `ON DELETE CASCADE`**: deindex events are the only deletion path; if memories ever get deleted directly (raw SQL), embedding rows orphan. Defensible since projector model uses explicit deindex events, but worth a comment or `ON DELETE CASCADE` addition.
+
+#### From T89 review
+
+- **`list_branches(chat_id=...)` filter leaks global branches** (`chat_id IS NULL`) into every chat scope. Intentional? Document.
+- **Branch-switch to nonexistent silently leaves zero active branches** — log a warning when this would happen.
+
+#### From T91 review
+
+- **Real embedding model swap**: Phase 4 ships pseudo-embedding (deterministic SHA-256 hash). Phase 4.5+ should swap to a real model (Featherless `bge-small-en-v1.5` if available; or local `sentence-transformers/all-MiniLM-L6-v2`). The 384-dim is hardcoded in `0012_embeddings.sql`; if dim changes, migrate first.
+- **`timeout_s` unused on pseudo path** — fine, but log when non-default model falls through to fallback so misconfigured callers don't silently degrade.
+
+#### From T96 review
+
+- **Duplicate `MAX(id)` lookup** between `_composite_rerank` and the fused-path tail — DRY follow-up.
+- **`fts_rank=None` for vector-only rows** — document downstream contract.
+
+#### From T98 review
+
+- **`event_id <= 0` guard in `delete_turn`** — currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: 400`.
+- **`html.escape()` on `compute_delete_impact` output rendered into the modal** — defense in depth (currently model-controlled strings, but if event payload fields ever appear in descriptions, autoescape needed).
+- **Extract delete-impact modal HTML to a Jinja partial** — testability + autoescape inheritance.
+
+#### From T99 review
+
+- **Hoist `datetime`/`timezone` imports to module level** in `chat/web/snapshots.py`.
+- **`kind` defaulting in restore/preview** — reject missing `kind` rather than silent 404.
+- **`created_at` from file mtime** vs filename-encoded timestamp — small drift if files copied; document.
+
+#### From T100 review
+
+- **Hardcoded `k=50`** — extract to module constant.
+- **N+1 lookups (`get_bot`/`get_chat`/`get_scene` per row)** — fine at `k=50`, revisit if `k` grows.
+- **FTS highlighting via `snippet()`** — Phase 4 skipped this; UX nice-to-have.
+- **Result links chat-level only** — `memories` table has no `event_id` column; deep-linking to specific turn requires schema addition.
+
+#### Deferred items
+
+- **sqlite-vec swap** when host Python supports `enable_load_extension`.
+- **Real embedding model** with proper semantic similarity.
+- **Branching read-side filter**: T89 ships data-model + UI but event readers don't yet consult `is_active`. Each branch is metadata-only labeled ranges. Consult-on-read is Phase 4.5+ work.
+- **Bulk significance re-rate** in drawer (T98.2 deferred — only per-memory edit shipped).
+- **Vector index optimization** (HNSW) — only relevant if memory counts grow past pure-Python feasibility.
+- **`scene-close-on-cancel` UX revisit** (Phase 2.5 carry-over).
+- **Cross-feature canned-queue brittleness fixture builder** (Phase 3 carry-over).
+- **Full lifecycle-rollback in regenerate** — Phase 3.5 T83.4 shipped a warning log; proper rollback needs schema-level back-references (`triggered_by_assistant_turn_id` payload field).
diff --git a/docs/plans/2026-04-26-v1-requirements-design.md b/docs/plans/2026-04-26-v1-requirements-design.md
index f84b2cb..5db1623 100644
--- a/docs/plans/2026-04-26-v1-requirements-design.md
+++ b/docs/plans/2026-04-26-v1-requirements-design.md
@@ -520,6 +520,8 @@ Written per witness when a scene closes. Different details, different interpreta
 
 ### Phase 4 — polish
 
+**Status: shipped 2026-04-27** (T88–T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
+
 - Vector retrieval (sqlite-vss or sqlite-vec).
 - Branching UI.
 - Drawer-edit on every field.