merge: T118 phase 4.5 docs sweep — Phase 4.5 status + Phase 5 backlog

merge: T117 phase 4.5 cross-feature integration tests
merge: T116 CannedQueue test fixture builder
2026-04-27 07:04:01 -04:00 · 2026-04-27 07:04:01 -04:00 · 2026-04-27 07:04:01 -04:00 · 2026-04-27 07:03:56 -04:00 · 2026-04-27 07:03:20 -04:00 · 2026-04-27 06:56:20 -04:00
46 changed files with 4645 additions and 207 deletions
@@ -322,53 +322,48 @@ Phase 4 polish shipped end-to-end across 15 tasks (T88–T102). Vector retrieval
 ### Phase 4.5 / 5 backlog
-New follow-ups discovered during Phase 4 reviews and execution. None are blocking; pick up at any time.
+All items shipped or deferred to Phase 5 (see "Phase 5 backlog" below). Final schema version: 14.
-#### From T88 review
+## Phase 4.5 status
- **`embeddings` FK lacks `ON DELETE CASCADE`**: deindex events are the only deletion path; if memories ever get deleted directly (raw SQL), embedding rows orphan. Defensible since projector model uses explicit deindex events, but worth a comment or `ON DELETE CASCADE` addition.
+Phase 4.5 cleanup shipped 14 of 15 planned tasks (T103–T117 with T115 deferred; T118 is this docs sweep). Two CLAUDE.md backlogs (Phase 3.6/4, Phase 4.5/5) are now empty; deferred follow-ups discovered during execution are tracked in a new "Phase 5 backlog" section below. Schema baseline advanced from version 13 to **14** (migration 0014: `memories.event_id`). Test count grew from ~413 (Phase 4) to ~457 (+~44 new tests across the wave).
-#### From T89 review
+- **Wave 1 — trivial polish (parallel)**:
  - **T103** branches polish — global-branch (`chat_id IS NULL`) leak documented in `list_branches`; branch-switch to nonexistent name now logs a warning.
  - **T104** `memory.py` DRY — `MAX(id)` helper extracted; `fts_rank=None` contract documented for vector-only rows.
  - **T105** `snapshots.py` polish — `datetime`/`timezone` imports hoisted to module level; strict `kind` validation in restore/preview (rejects missing); `created_at` from file mtime documented.
  - **T106** `search.py` polish — `k=50` extracted to module constant; N+1 `get_bot`/`get_chat`/`get_scene` lookups batched.
  - **T107** `embeddings.py` — `timeout_s` fallback-path warning when non-default model misconfigured.
 - **Wave 2 — scene-close-on-cancel (single)**:
  - **T108** strengthened the T74.3 regression test + documented rationale in `turns.py`. **Surfaced a deferred bug**: existing pin only passes because `asyncio` isn't imported in the test module (NameError caught instead of CancelledError). When CancelledError fires for real, `post_turn`'s end-of-function re-raise causes `open_db`'s dependency teardown to skip `conn.commit()`, rolling back ALL post-cancel writes. Documented and deferred to Phase 5 triage.
 - **Wave 3 — schema 0014 (single)**:
  - **T109** `memories.event_id` column (foundation for T111 deep-link). FK CASCADE on `embeddings.memory_id` deferred (memories rows are never deleted today; defensive constraint can't fire — saved for broader migration cleanup in Phase 5).
 - **Wave 4 — drawer Phase 4.5 bundle (single)**:
  - **T110** `event_id <= 0` guard in `delete_turn` + `html.escape()` on delete-impact modal + Jinja partial extraction + bulk significance re-rate per chat (one `manual_edit` event per memory).
 - **Wave 5 — search UX (single)**:
  - **T111** FTS snippet highlighting via `snippet()` + deep-link to turn via `memories.event_id`.
 - **Wave 6 — real embedding model swap (single)**:
  - **T112** `LLMClient.embed()` Protocol + Mock impl with `canned_embeddings` + `FeatherlessClient.embed()` (raises `NotImplementedError` — Featherless OAI-compat doesn't expose embeddings, gap documented) + `generate_embedding` routes non-default models through `client.embed()` with fallback + `--re-embed-all` backfill flag.
 - **Wave 7 — branching read-side filter (single)**:
  - **T113** `active_branch_event_ids(conn)` helper + applied to `read_recent_dialogue`, `scene_summarize._read_recent_dialogue`, `search_memories`, and `meanwhile._read_recent_meanwhile_dialogue`. Cross-chat search and projector queries deliberately NOT filtered (cross-chat is by design; projectors must see full log). Bootstrap "main" branch (origin=0, head=0) detected as the no-clamp sentinel.
 - **Wave 8 — regenerate lifecycle rollback (single)**:
  - **T114** `triggered_by_assistant_turn_id` payload back-reference on `event_started`/`event_completed`/`event_cancelled` + new `event_status_reverted` event kind + projector handler in `chat/state/events.py` + regenerate flow emits revert events for affected lifecycle transitions.
 - **Wave 9 — final polish + integration (parallel)**:
  - **T115** sqlite-vec swap — **DEFERRED to Phase 5**. Pre-flight failed: host Python build doesn't expose `sqlite3.Connection.enable_load_extension` (raises `AttributeError`). Requires either Python rebuild with `--enable-loadable-sqlite-extensions` or migration to `apsw`. Phase 4 pure-Python cosine remains in production.
  - **T116** structured `CannedQueue` test fixture builder + 2–3 POC test migrations (Phase 5 to migrate the rest).
  - **T117** Phase 4.5 cross-feature integration tests (5 minimum: real embedding swap, branching read-side filter, lifecycle rollback, search deep-link, bulk significance re-rate).
  - **T118** documentation (this section).
- **`list_branches(chat_id=...)` filter leaks global branches** (`chat_id IS NULL`) into every chat scope. Intentional? Document.
+### Phase 5 backlog
 - **Branch-switch to a nonexistent name silently leaves zero active branches** — log a warning when this would happen.
-#### From T91 review
+New follow-ups discovered during Phase 4.5 reviews and execution, plus carry-over deferrals. None are blocking; pick up at any time.
- **Real embedding model swap**: Phase 4 ships pseudo-embedding (deterministic SHA-256 hash). Phase 4.5+ should swap to a real model (Featherless `bge-small-en-v1.5` if available; or local `sentence-transformers/all-MiniLM-L6-v2`). The 384-dim is hardcoded in `0012_embeddings.sql`; if dim changes, migrate first.
+- **T115 sqlite-vec swap** (environmental blocker): host Python's `sqlite3.Connection` does not expose `enable_load_extension` — `python -c "import sqlite3; sqlite3.connect(':memory:').enable_load_extension(True)"` raises `AttributeError`. Fix requires either a Python rebuild with `--enable-loadable-sqlite-extensions` or migration to `apsw`. Pure-Python cosine remains in production until then.
- **`timeout_s` unused on pseudo path** — fine, but log when non-default model falls through to fallback so misconfigured callers don't silently degrade.
+- **T108 follow-up: cancel-path commit bug** — `post_turn`'s re-raised `CancelledError` causes `open_db` dependency teardown to skip `conn.commit()`, rolling back all post-cancel writes. The existing T74.3 regression test passes only because `asyncio` isn't imported in the test module (NameError masks the real cancel path). Triage required — either commit before re-raise, or restructure the route to never re-raise after the close-detection branch.
-
+- **`embeddings` FK CASCADE on `memory_id`** — deferred from T109; do as part of a broader migration consolidation in Phase 5.
-#### From T96 review
+- **`CannedQueue` fixture migration** — T116 shipped the builder + POC migrations; remaining tests still use positional canned arrays. Migrate in Phase 5.
-
+- **Vector index optimization (HNSW)** — currently scales to a few thousand memories on the flat-index pure-Python cosine path; revisit when counts grow past flat-index feasibility.
- **Duplicate `MAX(id)` lookup** between `_composite_rerank` and the fused-path tail — DRY follow-up.
+- **Branch-isolated `event_log`** — each branch has its own physical `event_log` range vs the current shared id space + head filter; full branch isolation is Phase 5+.
- **`fts_rank=None` for vector-only rows** — document downstream contract.
+- **Embedding model swap migration tooling** — T112 added `--re-embed-all`; a more orchestrated swap (drain old worker, re-seed all memories, swap config) is Phase 5+.
-
+- **Real-time collaborative branching** (multi-user) — out of scope for v1.
-#### From T98 review
+- **Avatars / portraits** (multimodality) — deferred indefinitely per design §14.
 - **`event_id <= 0` guard in `delete_turn`** — currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: 400`.
 - **`html.escape()` on `compute_delete_impact` output rendered into the modal** — defense in depth (currently model-controlled strings, but if event payload fields ever appear in descriptions, autoescape needed).
 - **Extract delete-impact modal HTML to a Jinja partial** — testability + autoescape inheritance.
 #### From T99 review
 - **Hoist `datetime`/`timezone` imports to module level** in `chat/web/snapshots.py`.
 - **`kind` defaulting in restore/preview** — reject missing `kind` rather than silent 404.
 - **`created_at` from file mtime** vs filename-encoded timestamp — small drift if files copied; document.
 #### From T100 review
 - **Hardcoded `k=50`** — extract to module constant.
 - **N+1 lookups (`get_bot`/`get_chat`/`get_scene` per row)** — fine at `k=50`, revisit if `k` grows.
 - **FTS highlighting via `snippet()`** — Phase 4 skipped this; UX nice-to-have.
 - **Result links chat-level only** — `memories` table has no `event_id` column; deep-linking to specific turn requires schema addition.
 #### Deferred items
 - **sqlite-vec swap** when host Python supports `enable_load_extension`.
 - **Real embedding model** with proper semantic similarity.
 - **Branching read-side filter**: T89 ships data-model + UI but event readers don't yet consult `is_active`. Each branch is a metadata-only labeled range. Consult-on-read is Phase 4.5+ work.
 - **Bulk significance re-rate** in drawer (T98.2 deferred — only per-memory edit shipped).
 - **Vector index optimization** (HNSW) — only relevant if memory counts grow past pure-Python feasibility.
 - **`scene-close-on-cancel` UX revisit** (Phase 2.5 carry-over).
 - **Cross-feature canned-queue brittleness fixture builder** (Phase 3 carry-over).
 - **Full lifecycle-rollback in regenerate** — Phase 3.5 T83.4 shipped a warning log; proper rollback needs schema-level back-references (`triggered_by_assistant_turn_id` payload field).
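
The T115 pre-flight failure recorded above can be checked on any host without touching a database file. A minimal sketch using only the standard library; the helper name `can_load_sqlite_extensions` is hypothetical and not part of this repository:

```python
import sqlite3


def can_load_sqlite_extensions() -> bool:
    """Return True if this Python build can load SQLite extensions.

    Builds compiled without loadable-extension support omit the
    ``enable_load_extension`` method, so calling it raises
    ``AttributeError`` (the failure mode recorded for T115).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.enable_load_extension(True)
    except (AttributeError, sqlite3.NotSupportedError):
        return False
    else:
        conn.enable_load_extension(False)  # restore the safe default
        return True
    finally:
        conn.close()


if __name__ == "__main__":
    print("sqlite extensions loadable:", can_load_sqlite_extensions())
```

A check like this could gate the sqlite-vec path at startup so the pure-Python cosine path stays the default wherever the method is missing.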
@@ -94,9 +94,15 @@ async def lifespan(app: FastAPI):
    # Phase 4's pseudo-embedding path is local so the worker doesn't need
    # an LLM client; we still pass one so the Phase 4.5 swap to a real
    # model is a one-line change.
    # T112 (Phase 4.5): the embedding model is now configurable via
    # ``Settings.embedding_model``. Default ``"pseudo-sha256-384"``
    # keeps the local-only path; swapping to a real model routes
    # through ``client.embed(...)`` and falls back to a zero vector
    # plus warning if the provider doesn't support embeddings.
    embedding_worker = EmbeddingWorker(
        conn_factory=lambda: open_db(settings.db_path),
        client=_factory(),
        model=settings.embedding_model,
    )
    await embedding_worker.start()
    app.state.embedding_worker = embedding_worker
@@ -39,6 +39,14 @@ class Settings(BaseModel):
    data_dir: Path = REPO_ROOT / "data"
    bind_host: str = "127.0.0.1"
    bind_port: int = 8000
    # T112 (Phase 4.5): embedding model identifier. Default is the
    # deterministic local pseudo (semantically meaningless but keeps the
    # vector pipeline structurally valid). Swap to a real model name
    # (e.g. "bge-small-en-v1.5") once the LLMClient implementation
    # supports embed() — currently FeatherlessClient does NOT, so a
    # non-default value will trigger the zero-vector fallback path
    # plus a T107 warning until a different provider is wired in.
    embedding_model: str = "pseudo-sha256-384"
 def load_settings() -> Settings:
    config_path = Path(os.environ.get("CHAT_CONFIG_PATH", DEFAULT_CONFIG))
@@ -0,0 +1,25 @@
 -- 0014_phase45_schema.sql — Phase 4.5 Wave 3 schema bump (T109).
 --
 -- Two schema concerns were considered for this migration; only the
 -- second is implemented here:
 --
 -- 1. ``embeddings.memory_id`` FK should ideally carry ``ON DELETE CASCADE``
 --    (T88 review nit). DEFERRED to Phase 5: ``embeddings`` rows are only ever
 --    deleted when the parent ``memories`` row is deleted, and ``memories``
 --    rows are never deleted today (memory hide is a soft flag; the surgical
 --    ``deindex_event`` path operates on ``event_log`` and does NOT cascade
 --    to projection rows). The CASCADE constraint therefore can't fire under
 --    current usage — adding the SQLite table-rebuild dance (rename, recreate,
 --    copy, drop, reindex) for a defensive constraint is unwarranted bloat
 --    in a polish wave. Revisit during the broader Phase 5 migration cleanup
 --    when other table reshapes make the rebuild worthwhile.
 --
 -- 2. Add ``memories.event_id`` (NULLABLE INTEGER, references ``event_log.id``)
 --    so cross-chat search results can deep-link back to the originating
 --    turn (foundation for T111). The column is nullable so historical
 --    memory rows projected before 0014 ran continue to round-trip cleanly;
 --    new rows are populated by the ``memory_written`` projector handler
 --    from the projecting event's id. This is a pure additive change — no
 --    backfill is performed. Older rows simply read NULL until/unless a
 --    later migration backfills them; T111 surfaces are coded to accept
 --    NULL gracefully (no deep-link rendered).
 ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id);
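
The additive behavior described in the migration comment can be exercised against an in-memory database. This is a sketch under stated assumptions: the two `CREATE TABLE` statements are pared-down, hypothetical stand-ins for the real `event_log` and `memories` schemas, and only the `ALTER TABLE` statement is taken from the migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, pared-down stand-ins for the real tables.
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT)")

# A row projected before the migration runs.
conn.execute("INSERT INTO memories (pov_summary) VALUES ('old memory')")

# The migration statement itself, as in 0014.
conn.execute(
    "ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id)"
)

# A row projected afterwards carries the originating event id.
event_id = conn.execute(
    "INSERT INTO event_log (kind) VALUES ('memory_written')"
).lastrowid
conn.execute(
    "INSERT INTO memories (pov_summary, event_id) VALUES (?, ?)",
    ("new memory", event_id),
)

rows = conn.execute(
    "SELECT pov_summary, event_id FROM memories ORDER BY id"
).fetchall()
print(rows)  # the pre-migration row reads NULL (None) for event_id
```

SQLite accepts `ADD COLUMN ... REFERENCES` with foreign keys enabled only when the new column defaults to NULL, which is why the nullable column needs no table rebuild.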
@@ -12,3 +12,11 @@ class Message:
 class LLMClient(Protocol):
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str: ...
    def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]: ...
    # T112 (Phase 4.5): real-embedding seam. Implementations either call a
    # provider's ``/v1/embeddings`` endpoint or, when the provider doesn't
    # expose embeddings (e.g. Featherless today), raise ``NotImplementedError``
    # so ``generate_embedding`` can catch it and degrade to the zero-vector
    # fallback. The Protocol is structural, so this method only needs to
    # exist on implementations; existing callers that don't use it are
    # unaffected.
    async def embed(self, text: str, *, model: str) -> list[float]: ...
@@ -53,3 +53,26 @@ class FeatherlessClient:
                delta = chunk.choices[0].delta.content or ""
                if delta:
                    yield delta
    async def embed(self, text: str, *, model: str) -> list[float]:
        """Embeddings via Featherless — currently unsupported.
        T112 (Phase 4.5) extends the LLMClient Protocol with ``embed()``
        for a future real-embedding swap. Featherless's OpenAI-compatible
        surface does NOT expose ``/v1/embeddings`` at the time of writing,
        so this implementation raises ``NotImplementedError`` rather than
        attempting a request that would 404. The
        :func:`chat.services.embeddings.generate_embedding` wrapper
        catches this and degrades to the existing zero-vector fallback
        (with the T107 warning), so misconfigured callers fail loudly in
        logs but the request path keeps working.
        If Featherless ships embeddings, swap the body for an
        ``self._client.embeddings.create(model=..., input=...)`` call
        guarded by ``self._sem()`` (mirrors ``generate``/``stream``).
        """
        raise NotImplementedError(
            "Featherless does not expose /v1/embeddings; "
            "configure a different embedding provider or stick with "
            "the default pseudo-sha256-384 model."
        )
@@ -4,8 +4,23 @@ from .client import Message
 class MockLLMClient:
-    def __init__(self, canned: list[str]):
+    """In-memory LLMClient for tests.
    ``canned`` feeds ``generate``/``stream`` (one entry per call, popped
    from the front). ``canned_embeddings`` (T112, Phase 4.5) feeds
    ``embed`` the same way — each call pops the next vector. An empty
    queue raises ``IndexError`` so misconfigured tests fail loudly
    rather than returning ``None`` or hanging.
    """
    def __init__(
        self,
        canned: list[str],
        *,
        canned_embeddings: list[list[float]] | None = None,
    ):
        self._canned = list(canned)
        self._canned_embeddings: list[list[float]] = list(canned_embeddings or [])
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
        return self._canned.pop(0)
@@ -14,3 +29,8 @@ class MockLLMClient:
        text = self._canned.pop(0)
        for ch in text:
            yield ch
    async def embed(self, text: str, *, model: str) -> list[float]:
        # Mirrors the canned-queue pattern; empty queue raises so
        # misconfigured tests surface clearly instead of returning None.
        return self._canned_embeddings.pop(0)
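
The canned-queue contract in the docstring (each call pops one vector, an empty queue raises `IndexError`) can be seen in isolation. A minimal sketch; `CannedEmbedder` is a hypothetical stand-in that copies only the `embed()` half of `MockLLMClient`, and the model name is a placeholder:

```python
import asyncio
from typing import Optional


class CannedEmbedder:
    """Hypothetical stand-in mirroring MockLLMClient.embed()."""

    def __init__(self, canned_embeddings: Optional[list] = None):
        self._canned_embeddings = list(canned_embeddings or [])

    async def embed(self, text: str, *, model: str) -> list:
        # list.pop(0) on an empty list raises IndexError.
        return self._canned_embeddings.pop(0)


async def demo():
    client = CannedEmbedder(canned_embeddings=[[0.1, 0.2, 0.3]])
    first = await client.embed("hello", model="test-model")
    try:
        await client.embed("again", model="test-model")
    except IndexError:
        exhausted = True
    else:
        exhausted = False
    return first, exhausted


result = asyncio.run(demo())
print(result)
```

Note that `generate_embedding` catches `Exception`, so an exhausted queue reached through that wrapper would surface as the zero-vector fallback rather than as an `IndexError` in the test.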
@@ -26,13 +26,28 @@ def search_all_memories(
    """Search FTS5 across all owners and chats.
    Returns rows with ``{memory_id, owner_id, chat_id, scene_id,
-    pov_summary, significance, ts, fts_rank}``, sorted by FTS5 BM25
+    event_id, pov_summary, snippet, significance, ts, fts_rank}``,
-    rank ascending (lower rank = stronger match, surfaced first).
+    sorted by FTS5 BM25 rank ascending (lower rank = stronger match,
    surfaced first).
    ``event_id`` (T111.2 / T109) is the id of the ``event_log`` row that
    drove the projecting ``memory_written`` event. May be ``None`` for
    memory rows projected before the 0014 schema migration ran (the
    column is nullable on purpose; T109 did not backfill historical
    rows). The search-results UI uses it to deep-link to the originating
    turn anchor (Phase 3.5 T86 stamps ``id="turn-{event_id}"`` on each
    turn DOM node) and falls back to a chat-level link when ``None``.
    The ``memories`` table has no ``ts`` column; we expose ``created_at``
    (the projector-side row insertion timestamp) under that key so the
    UI does not have to know the storage name.
    ``snippet`` (T111.1) is the FTS5 ``snippet()`` output for the
    matched ``pov_summary`` column: a windowed excerpt with each match
    token wrapped in ``<mark>...</mark>`` for the search-results UI to
    render verbatim. The full ``pov_summary`` is also returned so
    non-highlighted callers (or fallbacks) keep the original string.
    An empty / whitespace-only ``query`` short-circuits to ``[]`` to
    avoid an FTS5 ``MATCH ''`` syntax error and to keep the top-bar
    "no input yet" state from triggering a full-table scan.
@@ -45,9 +60,20 @@ def search_all_memories(
    # from the content table because the FTS index only stores
    # ``pov_summary``. ORDER BY rank ASC because BM25 in FTS5 returns
    # negative scores where lower is better.
    #
    # ``snippet(memories_fts, 0, ...)`` (T111.1) targets column 0 of the
    # FTS virtual table, which is ``pov_summary`` (the only column
    # indexed by ``CREATE VIRTUAL TABLE memories_fts USING fts5(
    # pov_summary, ...)`` in migration 0006). SQLite passes the raw
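    # column text through verbatim aside from inserting the configured
    # before/after match markers, so the output is the ``<mark>`` we
    # injected plus whatever ``pov_summary`` already contains. Rendering
    # it with ``|safe`` is therefore only safe while stored summaries
    # are free of markup; escape them first if that cannot be guaranteed.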
    rows = conn.execute(
-        "SELECT m.id, m.owner_id, m.chat_id, m.scene_id, "
+        "SELECT m.id, m.owner_id, m.chat_id, m.scene_id, m.event_id, "
-        "       m.pov_summary, m.significance, m.created_at, "
+        "       m.pov_summary, "
        "       snippet(memories_fts, 0, '<mark>', '</mark>', '…', 32) "
        "       AS snippet, "
        "       m.significance, m.created_at, "
        "       memories_fts.rank "
        "FROM memories_fts "
        "JOIN memories m ON m.id = memories_fts.rowid "
@@ -63,10 +89,12 @@ def search_all_memories(
            "owner_id": r[1],
            "chat_id": r[2],
            "scene_id": r[3],
-            "pov_summary": r[4],
+            "event_id": r[4],
-            "significance": r[5],
+            "pov_summary": r[5],
-            "ts": r[6],
+            "snippet": r[6],
-            "fts_rank": r[7],
+            "significance": r[7],
            "ts": r[8],
            "fts_rank": r[9],
        }
        for r in rows
    ]
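
Because `snippet()` passes column text through verbatim, one way to keep highlighting while still escaping the summary is to have SQLite emit sentinel characters and convert them to `<mark>` tags after escaping. This is a sketch of that alternative, not what the diff does: the sentinel values and the `render_snippet` helper are assumptions, and the SQL call would need to pass the sentinels in place of the literal `'<mark>'` / `'</mark>'` arguments.

```python
import html

# Hypothetical sentinels: control characters that should not occur in
# stored summaries. They would replace the '<mark>' / '</mark>'
# arguments in the snippet() call.
MARK_OPEN = "\x02"
MARK_CLOSE = "\x03"


def render_snippet(raw: str) -> str:
    """Escape the snippet text, then turn sentinels into <mark> tags."""
    escaped = html.escape(raw)
    return escaped.replace(MARK_OPEN, "<mark>").replace(MARK_CLOSE, "</mark>")


example = f"{MARK_OPEN}tea{MARK_CLOSE} & <b>biscuits</b>"
rendered = render_snippet(example)
print(rendered)
```

The result can then be marked safe for the template, since the only unescaped markup left is the pair of tags added after escaping.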
@@ -10,6 +10,7 @@ EmbeddingResult shape stays the same, only the generator changes.
 from __future__ import annotations
 import hashlib
 import logging
 import math
 import struct
@@ -18,6 +19,8 @@ from pydantic import BaseModel
 from chat.llm.client import LLMClient
 _log = logging.getLogger(__name__)
 DEFAULT_EMBEDDING_DIM = 384
 DEFAULT_EMBEDDING_MODEL = "pseudo-sha256-384"
 FALLBACK_EMBEDDING_MODEL = "fallback"
@@ -92,11 +95,27 @@ async def generate_embedding(
        # Pure-local pseudo path — no LLMClient call.
        return EmbeddingResult(vector=_pseudo_embed(text, dim), model=model, dim=dim)
-    # Future: real embedding via client.embed(...). Phase 4.5 work.
+    # T112 (Phase 4.5): non-default model — route through the client's
-    # For Phase 4, any non-default model falls through to fallback.
+    # ``embed()`` method. On any failure (including ``NotImplementedError``
-    return EmbeddingResult(
+    # from providers that don't expose embeddings, e.g. Featherless today),
-        vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
+    # fall back to the zero vector and re-fire the T107 warning so
-    )
+    # misconfigured callers see the issue in logs rather than silently
    # producing useless cosine results.
    try:
        vector = await client.embed(text, model=model)
        return EmbeddingResult(vector=list(vector), model=model, dim=len(vector))
    except Exception as exc:  # noqa: BLE001 — any failure must degrade gracefully
        _log.warning(
            "generate_embedding: non-default model %r returned fallback "
            "(client.embed() raised %s: %s); "
            "downstream search will degrade silently. Configure a supported model.",
            model,
            type(exc).__name__,
            exc,
        )
        return EmbeddingResult(
            vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
        )
 __all__ = [
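
The degrade path in this hunk can be exercised end to end with a client whose `embed()` raises, as `FeatherlessClient.embed()` does. A self-contained sketch: `Result`, `NoEmbeddingsClient`, and `embed_or_fallback` are hypothetical stand-ins for `EmbeddingResult`, the provider client, and the non-default branch of `generate_embedding`.

```python
import asyncio
import logging
from dataclasses import dataclass

_log = logging.getLogger("embedding_sketch")


@dataclass
class Result:  # hypothetical stand-in for EmbeddingResult
    vector: list
    model: str
    dim: int


class NoEmbeddingsClient:  # hypothetical; mimics a provider without embeddings
    async def embed(self, text: str, *, model: str) -> list:
        raise NotImplementedError("provider does not expose embeddings")


async def embed_or_fallback(client, text: str, *, model: str, dim: int = 384) -> Result:
    try:
        vector = await client.embed(text, model=model)
        return Result(vector=list(vector), model=model, dim=len(vector))
    except Exception as exc:  # any failure degrades to the zero vector
        _log.warning("embed() failed (%s); using zero vector", type(exc).__name__)
        return Result(vector=[0.0] * dim, model="fallback", dim=dim)


result = asyncio.run(
    embed_or_fallback(NoEmbeddingsClient(), "hello", model="bge-small-en-v1.5", dim=4)
)
print(result)
```

A zero vector has zero norm, so callers that compute cosine similarity may want to skip rows whose model is `"fallback"` rather than divide by it.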
@@ -95,6 +95,27 @@ from chat.web.render import render_turn_html
 _log = logging.getLogger(__name__)
 # T114.3: map a lifecycle-transition event kind to the events-table
 # status it implicitly transitioned *from*. Regenerate uses this to pick
 # the ``prior_status`` value for the ``event_status_reverted`` rollback
 # event so the projector sets the row back to where it was before the
 # superseded turn fired the transition.
 #
 # - ``event_started`` was emitted when the row was 'planned' → revert to
 #   'planned'.
 # - ``event_completed`` was emitted when the row was 'active' → revert
 #   to 'active'.
 # - ``event_cancelled`` could have fired from either 'planned' or
 #   'active'. Best-effort default: 'active'. The forward transitions
 #   below only fire detect_event_transitions for currently-active rows,
 #   so 'active' is the realistic prior in practice.
 _PRIOR_STATUS_MAP: dict[str, str] = {
    "event_started": "planned",
    "event_completed": "active",
    "event_cancelled": "active",
 }
 async def regenerate_assistant_turn(
    conn: Connection,
    client,
@@ -115,17 +136,18 @@ async def regenerate_assistant_turn(
    cannot be found — the FastAPI route translates this to 404.
    .. note::
-       **Lifecycle-rollback limitation (T83.4, Phase 4 follow-up).**
+       **Lifecycle rollback (T114, Phase 4.5).**
       When the superseded turn already produced lifecycle transitions
       (``event_started`` / ``event_completed`` / ``event_cancelled``),
-       this function does NOT roll those rows back before re-running
+       this function emits an ``event_status_reverted`` event for each
-       ``detect_event_transitions`` against the regenerated text. A
+       so the events row's status returns to its prior value before the
-       regenerate-after-completion can therefore double-emit promotion
+       regenerated narrative is reclassified. Backward compatibility:
-       artifacts if the new text re-completes the same event. Phase 3.5
+       lifecycle events authored before T114.1 lack the
-       only documents the gap and emits a WARNING log naming the
+       ``triggered_by_assistant_turn_id`` payload field; rollback skips
-       affected event_log ids; the actual undo pass is invasive
+       those (logged at DEBUG) so historic rows are not retroactively
-       (re-projection / inverse-handler dispatch) and is deferred to
+       reverted. A WARNING about un-rolled-back transitions is still
-       Phase 4. See the ``# T83.4`` block below for the warning emit.
+       emitted when stragglers are found — the rollback handles the
       common case while older logs continue to need manual review.
    """
    chat = get_chat(conn, chat_id)
    if chat is None:
@@ -158,20 +180,21 @@ async def regenerate_assistant_turn(
    original_assistant_payload = json.loads(row[0])
    original_user_turn_id = original_assistant_payload.get("user_turn_id")
-    # T83.4: scan for downstream lifecycle transitions emitted by the
+    # T114.3: roll back lifecycle transitions emitted by the superseded
-    # superseded turn — they're not being rolled back (see method
+    # turn. The scan uses the same id-greater-than-superseded-turn
-    # docstring). Heuristic: any ``event_started`` / ``event_completed``
+    # heuristic as the legacy T83.4 warning, joined to ``events`` for
-    # / ``event_cancelled`` event_log row with id strictly greater than
+    # chat scoping (lifecycle events don't carry chat_id in their
-    # the original assistant_turn's id was emitted as part of (or after)
+    # payload — they reference an ``event_id`` FK to the ``events``
-    # that turn's processing. Lifecycle events don't carry ``chat_id``
+    # table, which holds chat_id). For each row whose payload carries
-    # in their payload (their payload references an ``event_id`` FK to
+    # ``triggered_by_assistant_turn_id == original_assistant_event_id``
-    # the ``events`` table, which holds chat_id), so we join through
+    # (T114.1 back-reference), emit an ``event_status_reverted`` event
-    # ``events`` to scope to this chat.
+    # so the events-row status returns to the pre-transition value.
-    #
+    # Lifecycle rows authored before T114.1 lack the back-reference;
-    # A WARNING log surfaces the affected event ids so operators can
+    # those are skipped (DEBUG log) and a WARNING tracks their count so
-    # spot double-emit cases until the Phase 4 rollback pass lands.
+    # operators still see legacy stragglers — preserves the T83.4
    # observability contract for un-rolled-back transitions.
    unrolled_lifecycle = conn.execute(
-        "SELECT el.id, el.kind FROM event_log AS el "
+        "SELECT el.id, el.kind, el.payload_json FROM event_log AS el "
        "JOIN events AS ev "
        "  ON ev.event_id = json_extract(el.payload_json, '$.event_id') "
        "WHERE el.kind IN ("
@@ -182,18 +205,73 @@ async def regenerate_assistant_turn(
        "ORDER BY el.id ASC",
        (chat_id, original_assistant_event_id),
    ).fetchall()
-    if unrolled_lifecycle:
+    rolled_back_ids: list[int] = []
-        # T90.2: phrased as "at-or-after turn <id>" rather than "from
+    skipped_no_backref: list[int] = []
-        # superseded turn" because regenerating an OLDER turn lists
+    for el_id, el_kind, el_payload_json in unrolled_lifecycle:
-        # intervening-turn transitions that legitimately stand on their
+        try:
-        # own — those weren't authored by the superseded turn itself.
+            lifecycle_payload = json.loads(el_payload_json)
        except (TypeError, ValueError):
            skipped_no_backref.append(el_id)
            continue
        triggered_by = lifecycle_payload.get("triggered_by_assistant_turn_id")
        if triggered_by != original_assistant_event_id:
            # Either a legacy row (no field) or a transition triggered
            # by a *different* turn — leave it alone. DEBUG so the
            # message is available under verbose logging without
            # spamming the default WARNING channel.
            _log.debug(
                "regenerate_assistant_turn: skipping rollback for "
                "lifecycle event_log id=%d (kind=%s) — no back-reference "
                "or different turn (triggered_by=%r vs superseded=%d)",
                el_id,
                el_kind,
                triggered_by,
                original_assistant_event_id,
            )
            if triggered_by is None:
                skipped_no_backref.append(el_id)
            continue
        prior_status = _PRIOR_STATUS_MAP.get(el_kind)
        if prior_status is None:
            # Defensive: the SQL filter already restricts to the three
            # known kinds, but a future schema addition shouldn't crash
            # the rollback path.
            continue
        target_event_id = lifecycle_payload.get("event_id")
        if target_event_id is None:
            continue
        append_and_apply(
            conn,
            kind="event_status_reverted",
            payload={
                "event_id": target_event_id,
                "prior_status": prior_status,
            },
        )
        rolled_back_ids.append(el_id)
    if rolled_back_ids:
        _log.info(
            "regenerate_assistant_turn: rolled back %d lifecycle "
            "transition(s) triggered by superseded turn %s "
            "(event_log ids: %s)",
            len(rolled_back_ids),
            original_assistant_event_id,
            rolled_back_ids,
        )
    if skipped_no_backref:
        # T83.4 (legacy) compatibility: still warn about stragglers
        # without the back-reference so operators can spot pre-T114
        # double-emit risks. Phrased as "at-or-after turn <id>" per
        # T90.2 — older transitions may legitimately belong to other
        # turns.
        _log.warning(
            "regenerate_assistant_turn: %d lifecycle transition(s) "
-            "at-or-after turn %s are NOT being rolled back (Phase 4 "
+            "at-or-after turn %s are NOT being rolled back (no "
-            "follow-up). Affected event ids: %s",
+            "triggered_by_assistant_turn_id back-reference). "
-            len(unrolled_lifecycle),
+            "Affected event ids: %s",
            len(skipped_no_backref),
            original_assistant_event_id,
-            [r[0] for r in unrolled_lifecycle],
+            skipped_no_backref,
        )
    # 1a. Look up any sibling interjection beat in the same turn group
@@ -716,11 +794,13 @@ async def regenerate_assistant_turn(
    # runs inline after a completion so promotion artifacts land in the
    # same regenerate path.
    #
-    # T83.4 follow-up: when a regenerate replaces a turn that had
+    # T114.3: original-turn transitions emitted before this regenerate
-    # already produced event transitions, those original transitions
+    # ran were rolled back at the top of the function (see the
-    # are NOT undone here (Phase 4 work). A WARNING log earlier in this
+    # ``# T114.3`` block) by appending ``event_status_reverted`` for
-    # function names the affected event_log ids — see the T83.4 block
+    # each. The classify-and-emit pass below now operates against an
-    # near the function entry.
+    # ``events`` projection that has already been reverted, so it can
    # safely re-fire transitions for the regenerated narrative without
    # double-emitting promotion artifacts.
    new_active_events = list_active_events(conn, chat_id)
    if new_active_events:
        lifecycle_decision = await detect_event_transitions(
@@ -738,6 +818,12 @@ async def regenerate_assistant_turn(
                    payload={
                        "event_id": transition.event_id,
                        "started_at": chat.get("time"),
                        # T114.1: back-reference to the assistant_turn
                        # that triggered this transition (see turns.py
                        # for rationale).
                        "triggered_by_assistant_turn_id": (
                            new_assistant_event_id
                        ),
                    },
                )
            elif transition.new_status == "completed":
@@ -747,6 +833,10 @@ async def regenerate_assistant_turn(
                    payload={
                        "event_id": transition.event_id,
                        "completed_at": chat.get("time"),
                        # T114.1: back-reference (see above).
                        "triggered_by_assistant_turn_id": (
                            new_assistant_event_id
                        ),
                    },
                )
                promote_completed_event(
@@ -762,6 +852,10 @@ async def regenerate_assistant_turn(
                    payload={
                        "event_id": transition.event_id,
                        "completed_at": chat.get("time"),
                        # T114.1: back-reference (see above).
                        "triggered_by_assistant_turn_id": (
                            new_assistant_event_id
                        ),
                    },
                )
@@ -144,23 +144,36 @@ def _read_recent_dialogue(
    ``id >= since_event_id`` so callers needing a scene-scoped view (e.g.
    thread detection on close) don't pull turns that landed before the
    closing scene's ``scene_opened`` event.
    T113: also clamps by the active branch's ``[origin, head]`` event-id
    range so scene-summary inputs respect the user's current branch.
    Bootstrap-main and "no active branch" fall through to ``(0, BIG_INT)``
    so existing flows are unchanged.
    """
    from chat.state.branches import active_branch_event_ids
    origin, head = active_branch_event_ids(conn)
    if since_event_id is None:
        cur = conn.execute(
            "SELECT kind, payload_json FROM event_log "
            "WHERE kind IN ('user_turn', 'assistant_turn') "
            "  AND superseded_by IS NULL AND hidden = 0 "
            "  AND id BETWEEN ? AND ? "
            "ORDER BY id DESC LIMIT ?",
-            (limit,),
+            (origin, head, limit),
        )
    else:
        # Compose ``since_event_id`` with the branch lower bound — readers
        # want the tightest ``id >= max(since, origin)`` clamp without an
        # extra Python pass.
        lower = max(origin, since_event_id)
        cur = conn.execute(
            "SELECT kind, payload_json FROM event_log "
            "WHERE kind IN ('user_turn', 'assistant_turn') "
            "  AND superseded_by IS NULL AND hidden = 0 "
-            "  AND id >= ? "
+            "  AND id BETWEEN ? AND ? "
            "ORDER BY id DESC LIMIT ?",
-            (since_event_id, limit),
+            (lower, head, limit),
        )
    rows = list(reversed(cur.fetchall()))
    out: list[dict] = []
@@ -30,6 +30,7 @@ from __future__ import annotations
 import json
 from sqlite3 import Connection
 from chat.state.branches import active_branch_event_ids
 from chat.state.edges import get_edge
@@ -60,15 +61,22 @@ def read_recent_dialogue(
    previous implementation filtered chat_id post-fetch in Python, which
    let foreign-chat rows fill the LIMIT and yield fewer than N relevant
    rows in busy multi-chat databases.
    T113: clamp by the active branch's ``[origin, head]`` event-id range so
    switching branches actually changes what dialogue this read sees.
    Bootstrap-main and "no active branch" both fall through to ``(0,
    BIG_INT)`` — no functional change for the metadata-only Phase 4 era.
    """
    origin, head = active_branch_event_ids(conn)
    if exclude_event_id is None:
        cur = conn.execute(
            "SELECT id, kind, payload_json FROM event_log "
            "WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
            "  AND superseded_by IS NULL AND hidden = 0 "
            "  AND id BETWEEN ? AND ? "
            "  AND json_extract(payload_json, '$.chat_id') = ? "
            "ORDER BY id DESC LIMIT ?",
-            (chat_id, limit),
+            (origin, head, chat_id, limit),
        )
    else:
        cur = conn.execute(
@@ -76,9 +84,10 @@ def read_recent_dialogue(
            "WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
            "  AND id != ? "
            "  AND superseded_by IS NULL AND hidden = 0 "
            "  AND id BETWEEN ? AND ? "
            "  AND json_extract(payload_json, '$.chat_id') = ? "
            "ORDER BY id DESC LIMIT ?",
-            (exclude_event_id, chat_id, limit),
+            (exclude_event_id, origin, head, chat_id, limit),
        )
    rows = list(reversed(cur.fetchall()))
    out: list[dict] = []
@@ -9,11 +9,15 @@ existing event readers remain branch-agnostic.
 """
 from __future__ import annotations
 import logging
 from sqlite3 import Connection
 from chat.eventlog.projector import on
 from chat.eventlog.log import Event
 logger = logging.getLogger(__name__)
@on("branch_created")
 def _apply_branch_created(conn: Connection, e: Event) -> None:
@@ -37,9 +41,26 @@ def _apply_branch_switched(conn: Connection, e: Event) -> None:
    """Set is_active=1 on the named branch and is_active=0 on all others.
    Atomic via two UPDATEs ordered to avoid the unique-active-index race.
    If the named branch does not exist, a warning is emitted and the
    is_active flags are still cleared (preserving prior behavior — the
    second UPDATE simply matches no rows). Callers should validate the
    name upstream; this guard surfaces accidental mismatches in the log.
    """
    p = e.payload
    name = p["name"]
    # Warn (don't raise) if the target branch is missing. The existing
    # outcome — zero active branches — is preserved; this just makes the
    # condition observable instead of silent.
    exists = conn.execute(
        "SELECT 1 FROM branches WHERE name = ? LIMIT 1",
        (name,),
    ).fetchone()
    if exists is None:
        logger.warning(
            "branch_switched to unknown branch name %r; no branch will be active",
            name,
        )
    # Clear ALL is_active flags first (avoids the unique-index trip).
    conn.execute("UPDATE branches SET is_active = 0 WHERE is_active = 1")
    conn.execute(
@@ -79,6 +100,16 @@ def get_branch(conn: Connection, name: str) -> dict | None:
 def list_branches(conn: Connection, chat_id: str | None = None) -> list[dict]:
    """Return branch rows, optionally scoped to a chat.
    When ``chat_id`` is provided the filter is ``chat_id = ? OR chat_id IS NULL``,
    so global (null-chat) branches are returned in *every* per-chat scope. This
    is intentional: the bootstrapped ``"main"`` branch (and any future
    null-chat branches) are global by design — they belong to no single chat
    and should appear alongside per-chat branches in any chat-scoped listing.
    Callers that want only per-chat branches should filter the result on
    ``chat_id is not None``.
    """
    if chat_id is None:
        rows = conn.execute(
            "SELECT id, name, origin_event_id, head_event_id, chat_id, "
@@ -126,8 +157,58 @@ def active_branch(conn: Connection) -> dict | None:
    }
 # T113: sentinel "no upper bound" used by ``active_branch_event_ids`` when the
 # active branch's head is unset (the bootstrap "main" branch with origin=0 +
 # head=0). Readers compose ``id BETWEEN origin AND head`` so a value larger
 # than any possible row id behaves as "no clamp" without needing a separate
 # code path. ``2**63 - 1`` is SQLite's max signed-int — safe forever.
 _NO_HEAD_CLAMP = 2**63 - 1
 def active_branch_event_ids(conn: Connection) -> tuple[int, int]:
    """Return ``(origin_event_id, head_event_id)`` for the currently active
    branch, suitable as bounds for an ``event_log.id BETWEEN ? AND ?`` clamp
    on user-facing reads (T113).
    Defensive defaults:
    * **No active branch row** (``active_branch`` returns ``None``) — return
      ``(0, _NO_HEAD_CLAMP)`` so readers see all events. This preserves the
      Phase 4 "branches are metadata-only" contract for any code path that
      somehow runs without the migration-0013 bootstrap.
    * **Bootstrap "main"** — the canonical ``name="main", origin=0, head=0``
      row inserted by migration 0013. Production today never emits
      ``branch_head_updated`` for main, so head stays at 0 even as events
      accumulate. We treat this exact bootstrap state as "no clamp" and
      return ``(0, _NO_HEAD_CLAMP)`` so all events remain visible. This is
      what every existing test (which never configures branches) relies on.
    * **Any other branch** — return the literal ``(origin, head)`` from the
      branch row. A branch created at origin=N has head=N initially (per
      ``branch_from_event``), so ``BETWEEN N AND N`` returns just that one
      seed event until the head is bumped via ``branch_head_updated``.
    Note on the schema mismatch with the T113 spec: the spec describes
    ``head_event_id`` as nullable, but migration 0013 declared it
    ``NOT NULL DEFAULT 0``. We read head=0 on bootstrap main as the
    "unset" sentinel; non-main branches never reach head=0 in normal
    flow (creation sets head=origin, and origin=0 only for main).
    """
    branch = active_branch(conn)
    if branch is None:
        return (0, _NO_HEAD_CLAMP)
    origin = int(branch.get("origin_event_id") or 0)
    head = int(branch.get("head_event_id") or 0)
    # Bootstrap "main" sentinel — see docstring above. Detect by name +
    # both ids being 0 to avoid mis-firing on a hypothetical future
    # branch that legitimately starts at origin=0.
    if branch.get("name") == "main" and origin == 0 and head == 0:
        return (0, _NO_HEAD_CLAMP)
    return (origin, head)
 __all__ = [
    "get_branch",
    "list_branches",
    "active_branch",
    "active_branch_event_ids",
 ]
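Reviewer sketch (not part of the diff): the clamp described in the ``active_branch_event_ids`` docstring can be tried against an in-memory database. This is a minimal illustration under stated assumptions: ``branch_bounds`` and ``visible_ids`` are hypothetical stand-ins that take the branch fields directly instead of reading the ``branches`` table, and the ``event_log`` schema is reduced to two columns.

```python
import sqlite3

# Same sentinel value the diff uses: the largest signed 64-bit integer.
NO_HEAD_CLAMP = 2**63 - 1


def branch_bounds(name: str, origin: int, head: int) -> tuple[int, int]:
    """Hypothetical stand-in for active_branch_event_ids (takes row fields directly)."""
    # Bootstrap "main" with origin=0 and head=0 means "no clamp".
    if name == "main" and origin == 0 and head == 0:
        return (0, NO_HEAD_CLAMP)
    return (origin, head)


def visible_ids(conn: sqlite3.Connection, bounds: tuple[int, int]) -> list[int]:
    """Return the event ids that fall inside the inclusive [origin, head] range."""
    lo, hi = bounds
    rows = conn.execute(
        "SELECT id FROM event_log WHERE id BETWEEN ? AND ? ORDER BY id",
        (lo, hi),
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO event_log (kind) VALUES (?)", [("user_turn",)] * 5
)

# Bootstrap main sees every event; a fresh fork at event 3 sees only its seed.
main_view = visible_ids(conn, branch_bounds("main", 0, 0))
fork_view = visible_ids(conn, branch_bounds("what-if", 3, 3))
```

Because ``BETWEEN`` is inclusive on both ends, a branch whose head equals its origin exposes exactly one event until the head is advanced, which matches the "Any other branch" case in the docstring.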
@@ -67,6 +67,29 @@ def _apply_event_expired(conn: Connection, e: Event) -> None:
    )
@on("event_status_reverted")
 def _apply_event_status_reverted(conn: Connection, e: Event) -> None:
    """T114.2: Revert an event row's status to ``prior_status``.
    Emitted by ``regenerate_assistant_turn`` when a superseded turn had
    triggered a lifecycle transition (event_started / event_completed /
    event_cancelled). The rollback step needs an inverse projection that
    sets the row's status back to whatever it was *before* the now-
    superseded transition fired.
    Unlike the forward transitions (which guard against terminal-status
    overwrites) this handler is unconditional — the entire purpose is to
    reverse a transition, including reverting from a terminal status
    (completed/cancelled) back to a non-terminal one.
    """
    p = e.payload
    conn.execute(
        "UPDATE events SET status = ?, updated_at = datetime('now') "
        "WHERE event_id = ?",
        (p["prior_status"], p["event_id"]),
    )
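Reviewer sketch (not part of the diff): a minimal illustration of the unconditional revert, assuming a reduced ``events`` table with only the three columns the handler touches. ``apply_status_reverted`` is a hypothetical free function that mirrors the handler body without the ``@on`` projector decorator, and the ``"active"`` status name is an assumption made for the example.

```python
import sqlite3


def apply_status_reverted(conn: sqlite3.Connection, payload: dict) -> None:
    """Set the row's status back to payload['prior_status'], with no terminal-status guard."""
    conn.execute(
        "UPDATE events SET status = ?, updated_at = datetime('now') "
        "WHERE event_id = ?",
        (payload["prior_status"], payload["event_id"]),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, status TEXT, updated_at TEXT)"
)
# The superseded turn had already moved this event to a terminal status.
conn.execute("INSERT INTO events VALUES ('ev1', 'completed', datetime('now'))")

# 'active' is an assumed non-terminal status name for illustration only.
apply_status_reverted(conn, {"event_id": "ev1", "prior_status": "active"})
status = conn.execute(
    "SELECT status FROM events WHERE event_id = 'ev1'"
).fetchone()[0]
```

The point of the sketch is that the revert succeeds even though the row was in a terminal state, which is the behaviour the docstring calls out as deliberate.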
 def get_event(conn: Connection, event_id: str) -> dict | None:
    row = conn.execute(
        "SELECT event_id, chat_id, kind, status, props_json, planned_for, "
@@ -13,13 +13,18 @@ def _row_to_dict(conn: Connection, row: tuple) -> dict:
@on("memory_written")
 def _apply_memory_written(conn: Connection, e: Event) -> None:
    # T109 (schema 0014): persist the projecting event's id on the memory
    # row so cross-chat search results can deep-link back to the
    # originating turn (T111). Older memory rows projected before 0014
    # ran read NULL here — the column is nullable for that reason.
    p = e.payload
    conn.execute(
        "INSERT INTO memories ("
        "owner_id, chat_id, scene_id, pov_summary, "
        "witness_you, witness_host, witness_guest, "
-        "chat_clock_at, source, reliability, significance, pinned, auto_pinned"
-        ") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
+        "chat_clock_at, source, reliability, significance, pinned, auto_pinned, "
+        "event_id"
+        ") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (
            p["owner_id"],
            p["chat_id"],
@@ -34,6 +39,7 @@ def _apply_memory_written(conn: Connection, e: Event) -> None:
            int(p.get("significance", 1)),
            int(p.get("pinned", 0)),
            int(p.get("auto_pinned", 0)),
            e.id,
        ),
    )
@@ -112,6 +118,25 @@ SIGNIFICANCE_RANK_BIAS = 0.5
 RRF_CONST = 60
 def _max_event_id(conn: Connection, owner_id: str) -> int:
    """Return the largest ``memories.id`` for ``owner_id`` (1 if none exist).
    Used as the recency-boost denominator by both ``_composite_rerank`` and
    ``_rrf_fuse_and_rerank`` (T104). The row id is a monotonic recency proxy
    — newer memories have larger ids — so dividing by the per-owner max keeps
    the boost in [0, 1] regardless of how many memories the owner has.
    Returns 1 (not 0) when the owner has no rows so callers can divide by
    the result without a guard. The "no memories" case never actually hits
    this helper because the FTS query above would have returned no rows,
    but the safe default keeps the helper trivially reusable.
    """
    row = conn.execute(
        "SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
    ).fetchone()
    return row[0] if row and row[0] else 1
 def search_memories(
    conn: Connection,
    owner_id: str,
@@ -163,6 +188,14 @@ def search_memories(
    When ``query_vector`` is None: FTS-only behaviour unchanged — all
    Phase 1-3.5 callers see the same row shape and ordering as before.
    **Row-shape contract (T104):** every returned dict carries an
    ``fts_rank`` key. For FTS hits this is the BM25 score (a negative float,
    lower-is-better). For *vector-only* hits surfaced by the fused path —
    rows that matched the query embedding but did NOT match FTS — the
    ``fts_rank`` value is ``None``. Downstream consumers must accept
    ``None`` here; do not assume ``fts_rank`` is always numeric. The
    ``composite_score`` is always a float on every returned row.
    """
    if witness_role not in _VALID_WITNESS_ROLES:
        raise ValueError(
@@ -180,12 +213,20 @@ def search_memories(
    # channel) so memories that are weak in FTS but strong in vector — and
    # vice versa — make it into the merge pool.
    over_fetch = max(k * 2, 20) if query_vector is not None else max(k * 4, 20)
    # T113: branch-scope filter on ``m.event_id`` (T109's column). Memories
    # whose ``event_id`` is NULL — projected before the 0014 schema migration
    # ran — are *included* unconditionally so the branch filter never breaks
    # legacy retrieval. Newer rows respect the active branch's bounds.
    from chat.state.branches import active_branch_event_ids
    origin, head = active_branch_event_ids(conn)
    sql = (
        f"SELECT {select_list}, memories_fts.rank AS fts_rank "
        "FROM memories_fts "
        "JOIN memories m ON m.id = memories_fts.rowid "
        f"WHERE m.owner_id = ? AND m.{witness_col} = 1 "
        "AND memories_fts MATCH ? "
        "AND (m.event_id IS NULL OR m.event_id BETWEEN ? AND ?) "
        # T57: significance multiplier biases the FTS over-fetch order. BM25
        # ``rank`` is lower-is-better, so subtracting ``significance * BIAS``
        # surfaces higher-significance rows above lower-significance rows with
@@ -194,7 +235,10 @@ def search_memories(
        "ORDER BY (memories_fts.rank - m.significance * ?) ASC "
        "LIMIT ?"
    )
-    cur = conn.execute(sql, (owner_id, query, SIGNIFICANCE_RANK_BIAS, over_fetch))
+    cur = conn.execute(
+        sql,
+        (owner_id, query, origin, head, SIGNIFICANCE_RANK_BIAS, over_fetch),
+    )
    rows = cur.fetchall()
    # FTS-only path: preserve pre-T96 behaviour exactly.
@@ -227,10 +271,7 @@ def _composite_rerank(
    Extracted from ``search_memories`` so the no-vector path stays a single
    call and the fused path can re-use the same boost formulae after RRF.
    """
-    max_id_row = conn.execute(
-        "SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
-    ).fetchone()
-    max_id = max_id_row[0] if max_id_row and max_id_row[0] else 1
+    max_id = _max_event_id(conn, owner_id)
    result_cols = cols + ["fts_rank"]
    enriched: list[dict] = []
@@ -301,6 +342,28 @@ def _rrf_fuse_and_rerank(
        query_vector=query_vector,
        k=vec_over_fetch,
    )
    # T113: drop vector hits that fall outside the active branch's event-id
    # range. ``vector_search`` is a generic service used elsewhere; the
    # branch filter applied to the FTS leg also has to apply here so the
    # fused result respects the same scope. Memories with NULL event_id
    # (legacy rows projected before T109's 0014 schema migration) are
    # included unconditionally — same policy as the FTS leg.
    from chat.state.branches import _NO_HEAD_CLAMP, active_branch_event_ids
    vec_origin, vec_head = active_branch_event_ids(conn)
    if vec_hits and (vec_origin > 0 or vec_head < _NO_HEAD_CLAMP):
        vec_ids = [h["memory_id"] for h in vec_hits]
        placeholders_v = ",".join("?" * len(vec_ids))
        in_range = {
            row[0]
            for row in conn.execute(
                f"SELECT id FROM memories "
                f"WHERE id IN ({placeholders_v}) "
                f"  AND (event_id IS NULL OR event_id BETWEEN ? AND ?)",
                (*vec_ids, vec_origin, vec_head),
            ).fetchall()
        }
        vec_hits = [h for h in vec_hits if h["memory_id"] in in_range]
    vec_rank_by_id: dict[int, int] = {
        hit["memory_id"]: rank for rank, hit in enumerate(vec_hits)
    }
@@ -343,10 +406,7 @@ def _rrf_fuse_and_rerank(
    # Final composite re-rank: significance + recency boosts on top of the
    # negated fusion score so the sort direction matches the FTS-only path.
-    max_id_row = conn.execute(
-        "SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
-    ).fetchone()
-    max_id = max_id_row[0] if max_id_row and max_id_row[0] else 1
+    max_id = _max_event_id(conn, owner_id)
    result_cols = cols + ["fts_rank"]
    enriched: list[dict] = []
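Reviewer sketch (not part of the diff): the T113 vector-leg filter keeps a hit when its memory row has a NULL ``event_id`` or an ``event_id`` inside the branch range. The snippet below reproduces that policy against an in-memory table. ``filter_hits_to_range`` is a hypothetical helper name, and the ``memories`` schema is reduced to the two columns the filter reads.

```python
import sqlite3


def filter_hits_to_range(
    conn: sqlite3.Connection, hits: list[dict], origin: int, head: int
) -> list[dict]:
    """Drop hits whose memory row falls outside [origin, head]; keep NULL event_id rows."""
    if not hits:
        # An empty IN () list is a SQLite syntax error, so short-circuit.
        return []
    ids = [h["memory_id"] for h in hits]
    placeholders = ",".join("?" * len(ids))
    in_range = {
        row[0]
        for row in conn.execute(
            f"SELECT id FROM memories WHERE id IN ({placeholders}) "
            "AND (event_id IS NULL OR event_id BETWEEN ? AND ?)",
            (*ids, origin, head),
        )
    }
    # Preserve the incoming rank order of the surviving hits.
    return [h for h in hits if h["memory_id"] in in_range]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, event_id INTEGER)")
conn.executemany(
    "INSERT INTO memories (id, event_id) VALUES (?, ?)",
    [(1, None), (2, 10), (3, 25), (4, 40)],
)

hits = [{"memory_id": 4}, {"memory_id": 1}, {"memory_id": 3}, {"memory_id": 2}]
kept = filter_hits_to_range(conn, hits, origin=20, head=30)
```

Filtering in Python after the ``IN (...)`` lookup keeps the vector ranking intact, which matters because the rank position feeds the fusion step that follows in ``_rrf_fuse_and_rerank``.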
@@ -0,0 +1,34 @@
 {# T110.3: delete-impact modal partial.
 Rendered from :func:`chat.web.drawer.delete_preview` via a Jinja2
 TemplateResponse so HTML autoescape covers user-controllable fields
 (item.kind, item.description, notes) automatically — the prior
 f-string assembly required explicit html.escape() calls (T110.2)
 which become redundant under autoescape.
 Inputs:
  ``chat_id`` — the URL chat id (used to build the confirm form action).
  ``impact``  — an :class:`~chat.services.delete_impact.ImpactReport`.
 #}
 <div class="delete-impact-modal">
  <h3>Delete event {{ impact.target_event_id }}?</h3>
  <p>This will discard {{ impact.cascading|length }} events. Cascade:</p>
  <ul class="delete-impact-cascade">
    {% if impact.cascading %}
      {% for item in impact.cascading %}
        <li><strong>{{ item.kind }}</strong>: {{ item.description }}</li>
      {% endfor %}
    {% else %}
      <li>none</li>
    {% endif %}
  </ul>
  <ul class="delete-impact-notes">
    {% for note in impact.notes %}
      <li>{{ note }}</li>
    {% endfor %}
  </ul>
  <form hx-post="/chats/{{ chat_id }}/drawer/turn/delete/{{ impact.target_event_id }}"
        hx-target="#drawer" hx-swap="innerHTML">
    <button type="submit">Confirm delete</button>
  </form>
 </div>
@@ -547,6 +547,25 @@
        </ul>
      </details>
    {% endif %}
    {# T110.4: bulk significance re-rate. Move every memory in this chat
       at level_from to level_to with one manual_edit event per row, so
       the audit trail stays per-memory. #}
    <details class="bulk-significance">
      <summary>Bulk re-rate significance</summary>
      <form class="inline-edit"
            hx-post="/chats/{{ chat.id }}/drawer/memory/significance/bulk"
            hx-target="#drawer" hx-swap="innerHTML">
        <label>
          From:
          <input type="number" name="level_from" min="0" max="3" value="0" required>
        </label>
        <label>
          To:
          <input type="number" name="level_to" min="0" max="3" value="1" required>
        </label>
        <button type="submit">Re-rate all</button>
      </form>
    </details>
  </section>
  <section class="drawer-section">
@@ -21,14 +21,29 @@
  <ul class="search-results">
    {% for r in results %}
    <li class="search-result">
-      <a class="search-result-link" href="/chats/{{ r.chat_id }}">
+      {# T111.2: deep-link to the originating turn via the
         ``id="turn-{event_id}"`` anchor stamped by Phase 3.5 T86.
         ``event_id`` may be NULL for memory rows projected before the
         0014 migration ran (T109 did not backfill historical rows); in
         that case fall back to a chat-level link with no anchor so we
         never emit ``#turn-None``. #}
      <a class="search-result-link"
         href="/chats/{{ r.chat_id }}{% if r.event_id %}#turn-{{ r.event_id }}{% endif %}">
        <div class="search-result-meta muted">
          <strong>{{ r.owner_name }}</strong>
          <span>&middot; {{ r.chat_id }}</span>
          {% if r.chat_name %}<span>&middot; {{ r.chat_name }}</span>{% endif %}
          {% if r.scene_label %}<span>&middot; scene {{ r.scene_label }}</span>{% endif %}
        </div>
-        <div class="search-result-summary">{{ r.pov_summary }}</div>
+        {# T111.1: ``r.snippet`` is the FTS5 ``snippet()`` excerpt with
           each match wrapped in ``<mark>...</mark>``. ``|safe`` is
           required so the marker tags survive Jinja's auto-escape; the
           snippet is built by SQLite from indexed text, so the only
           HTML in the string is the ``<mark>`` we configured (any
           special chars from the source content are passed through as
           literal text, NOT as HTML). This is the only ``|safe`` filter
           on the page — chat_id, owner_name, etc. remain auto-escaped. #}
        <div class="search-result-summary">{{ r.snippet|safe }}</div>
      </a>
    </li>
    {% endfor %}
@@ -411,6 +411,64 @@ async def edit_memory_significance(
    return await drawer(chat_id, request, conn)
@router.post(
    "/chats/{chat_id}/drawer/memory/significance/bulk",
    response_class=HTMLResponse,
 )
 async def bulk_re_rate_significance(
    chat_id: str,
    request: Request,
    level_from: int = Form(...),
    level_to: int = Form(...),
    conn=Depends(get_conn),
 ):
    """T110.4: bulk re-rate every memory in this chat at ``level_from``
    to ``level_to``.
    Fans out into one ``manual_edit`` event per matching memory rather
    than a single bulk event so the §6.4 audit trail stays per-row —
    each affected memory carries its own ``prior_value -> new_value``
    snapshot, so an inverse edit can restore an individual row without
    needing to inspect a bulk payload's member list. The drawer's
    significance-distribution panel surfaces the new buckets on the
    refreshed partial.
    Both levels are clamped to 0..3 (matching ``edit_memory_significance``)
    and a no-op (``level_from == level_to``) is rejected with 400 so a
    misclick can't pad the event log with empty edits.
    """
    chat = get_chat(conn, chat_id)
    if chat is None:
        raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
    lf = max(0, min(3, int(level_from)))
    lt = max(0, min(3, int(level_to)))
    if lf == lt:
        raise HTTPException(
            status_code=400,
            detail=f"level_from and level_to must differ (both = {lf})",
        )
    rows = conn.execute(
        "SELECT id FROM memories WHERE chat_id = ? AND significance = ? "
        "ORDER BY id ASC",
        (chat_id, lf),
    ).fetchall()
    for row in rows:
        memory_id = int(row[0])
        append_and_apply(
            conn,
            kind="manual_edit",
            payload={
                "target_kind": "memory_significance",
                "target_id": memory_id,
                "prior_value": lf,
                "new_value": lt,
            },
        )
    return await drawer(chat_id, request, conn)
@router.post(
    "/chats/{chat_id}/drawer/memory/{memory_id}/pin",
    response_class=HTMLResponse,
@@ -1234,28 +1292,18 @@ async def delete_preview(
    report = compute_delete_impact(conn, target_event_id=int(event_id))
-    # Build the modal HTML directly — the impact report is small and
-    # reusing the drawer template would require a fragment include just
-    # for this surface. Mirrors the rewind-preview style in
-    # :func:`chat.web.turns.rewind_preview`.
-    items_html = "".join(
-        f"<li><strong>{item.kind}</strong>: {item.description}</li>"
-        for item in report.cascading
+    # T110.3: render via the ``_delete_impact_modal.html`` Jinja partial
+    # so HTML autoescape covers user-controllable fields (item.kind,
+    # item.description, notes) automatically. The prior implementation
+    # built the modal HTML via raw f-string concatenation and required
+    # explicit ``html.escape()`` calls (T110.2) on each interpolated
+    # field; under autoescape those calls become redundant. Mirrors the
+    # rewind-preview style in :func:`chat.web.turns.rewind_preview`.
+    return TEMPLATES.TemplateResponse(
+        request,
+        "_delete_impact_modal.html",
+        {"chat_id": chat_id, "impact": report},
    )
-    notes_html = "".join(f"<li>{note}</li>" for note in report.notes)
-    body = (
-        "<div class='delete-impact-modal'>"
-        f"<h3>Delete event {report.target_event_id}?</h3>"
-        f"<p>This will discard {len(report.cascading)} events. Cascade:</p>"
-        f"<ul class='delete-impact-cascade'>{items_html or '<li>none</li>'}</ul>"
-        f"<ul class='delete-impact-notes'>{notes_html}</ul>"
-        f"<form hx-post='/chats/{chat_id}/drawer/turn/delete/{report.target_event_id}' "
-        "hx-target='#drawer' hx-swap='innerHTML'>"
-        "<button type='submit'>Confirm delete</button>"
-        "</form>"
-        "</div>"
-    )
-    return HTMLResponse(body)
@router.post(
@@ -1278,7 +1326,19 @@ async def delete_turn(
    A snapshot is taken before truncation (inside ``execute_rewind``)
    so the user can recover via the snapshot index.
    T110.1 guards ``event_id <= 0``: a stale tab or hand-crafted request
    posting ``event_id=0`` would otherwise compute ``after_event_id=-1``
    and silently truncate the entire log. ``id`` is auto-assigned by
    SQLite starting at 1 so any caller's "real" id is always >= 1; a
    zero or negative value can only mean a client bug, surfaced as 400.
    """
    if int(event_id) <= 0:
        raise HTTPException(
            status_code=400,
            detail=f"event_id must be a positive integer, got {event_id}",
        )
    chat = get_chat(conn, chat_id)
    if chat is None:
        raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
@@ -71,18 +71,27 @@ def _read_recent_meanwhile_dialogue(
    that already match — avoids an unbounded scan as ``event_log``
    grows. The user-side rows match on chat_id only since they aren't
    tagged with a scene id (they ride the chat-wide log).
    T113: clamp by the active branch's ``[origin, head]`` event-id range
    so meanwhile prompt context respects the user's current branch.
    Bootstrap-main and "no active branch" both fall through to ``(0,
    BIG_INT)`` — no functional change for the metadata-only Phase 4 era.
    """
    from chat.state.branches import active_branch_event_ids
    origin, head = active_branch_event_ids(conn)
    cur = conn.execute(
        "SELECT id, kind, payload_json FROM event_log "
        "WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
        "  AND superseded_by IS NULL AND hidden = 0 "
        "  AND id BETWEEN ? AND ? "
        "  AND json_extract(payload_json, '$.chat_id') = ? "
        "  AND ("
        "    kind IN ('user_turn', 'user_turn_edit') "
        "    OR json_extract(payload_json, '$.meanwhile_scene_id') = ?"
        "  ) "
        "ORDER BY id DESC LIMIT ?",
-        (chat_id, scene_id, limit),
+        (origin, head, chat_id, scene_id, limit),
    )
    rows = cur.fetchall()
    rows.reverse()
@@ -14,6 +14,12 @@ For each match we hydrate just enough metadata to render a row:
 * the originating scene title when one exists,
 * and the ``pov_summary`` itself.
 T106 (Phase 4.5): hydration is batched. Pre-T106 the route called
 ``get_bot``/``get_chat``/``get_scene`` once per result row — N+1 with
 ``DEFAULT_SEARCH_K=50`` meaning up to 150 individual SELECTs per page
 load. We now collect distinct ids first and fan-in via three
 ``WHERE id IN (...)`` queries, then map back per row.
 We deliberately keep this module synchronous and template-only — no
 HTMX swaps, no JSON API — because the search box is a "leave the
 current chat to look something up" surface, not an inline drawer.
@@ -21,7 +27,9 @@ current chat to look something up" surface, not an inline drawer.
 from __future__ import annotations
 import json
 from pathlib import Path
 from sqlite3 import Connection
 from fastapi import APIRouter, Depends, Request
 from fastapi.responses import HTMLResponse
@@ -36,29 +44,145 @@ TEMPLATES = Jinja2Templates(
    directory=str(Path(__file__).resolve().parent.parent / "templates")
 )
 #: Maximum cross-chat FTS matches surfaced per ``/search`` page load.
 #: Extracted as a module-level constant (T106) so the cap is tunable
 #: without touching the route body. ``search_all_memories`` itself
 #: defaults to a smaller ``k=20``; we override here because the
 #: top-bar search is a "scan everything I've seen" surface, not an
 #: inline drawer.
 DEFAULT_SEARCH_K = 50
 router = APIRouter()
 def _fetch_bots_by_ids(conn: Connection, ids: set[str]) -> dict[str, dict]:
    """Batched sibling of :func:`chat.state.entities.get_bot`.
    Inlined here (not exported from ``state.entities``) to keep T106's
    scope confined to ``search.py`` per the Phase 4.5 plan. Returns
    ``{bot_id: bot_dict}`` for every id present in ``ids``; ids with
    no matching row are simply absent from the map (the caller falls
    back to the raw id string the same way it did pre-T106).
    Empty ``ids`` short-circuits to ``{}`` because SQLite rejects
    ``WHERE id IN ()`` as a syntax error.
    """
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    cols = [c[1] for c in conn.execute("PRAGMA table_info(bots)").fetchall()]
    rows = conn.execute(
        f"SELECT * FROM bots WHERE id IN ({placeholders})",
        tuple(ids),
    ).fetchall()
    out: dict[str, dict] = {}
    for row in rows:
        d = dict(zip(cols, row))
        d["voice_samples"] = json.loads(d.pop("voice_samples_json"))
        d["traits"] = json.loads(d.pop("traits_json"))
        out[d["id"]] = d
    return out
 def _fetch_chats_by_ids(conn: Connection, ids: set[str]) -> dict[str, dict]:
    """Batched sibling of :func:`chat.state.world.get_chat`.
    Mirrors that helper's ``chats``/``chat_state`` JOIN so the returned
    dicts have the same shape (``narrative_anchor``, ``time``,
    ``weather``, ``active_scene_id``, etc.). Empty ``ids`` returns
    ``{}`` to dodge the ``IN ()`` syntax error.
    """
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        "SELECT c.id, c.host_bot_id, c.guest_bot_id, c.created_at, "
        "       s.time, s.weather, s.active_scene_id, s.narrative_anchor "
        f"FROM chats c JOIN chat_state s ON s.chat_id = c.id "
        f"WHERE c.id IN ({placeholders})",
        tuple(ids),
    ).fetchall()
    return {
        row[0]: {
            "id": row[0],
            "host_bot_id": row[1],
            "guest_bot_id": row[2],
            "created_at": row[3],
            "time": row[4],
            "weather": row[5],
            "active_scene_id": row[6],
            "narrative_anchor": row[7],
        }
        for row in rows
    }
 def _fetch_scenes_by_ids(conn: Connection, ids: set[int]) -> dict[int, dict]:
    """Batched sibling of :func:`chat.state.world.get_scene`.
    Returns ``{scene_id: scene_dict}`` with ``participants`` already
    JSON-decoded so callers see the same shape as the per-row helper.
    Empty ``ids`` returns ``{}``.
    """
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    cols = [c[1] for c in conn.execute("PRAGMA table_info(scenes)").fetchall()]
    rows = conn.execute(
        f"SELECT * FROM scenes WHERE id IN ({placeholders})",
        tuple(ids),
    ).fetchall()
    out: dict[int, dict] = {}
    for row in rows:
        d = dict(zip(cols, row))
        d["participants"] = json.loads(d.pop("participants_json"))
        out[d["id"]] = d
    return out
@router.get("/search", response_class=HTMLResponse)
 async def search(request: Request, q: str = "", conn=Depends(get_conn)):
-    """Render ``search.html`` with up to 50 cross-chat FTS matches.
+    """Render ``search.html`` with up to :data:`DEFAULT_SEARCH_K` matches.
    ``q`` is intentionally allowed to be empty — that path renders the
    page's "enter a query" placeholder rather than a 400, because the
    top-bar form submits to this URL even with an empty input. T93's
    service short-circuits whitespace-only queries to ``[]`` so there
    is no FTS5 ``MATCH ''`` syntax error to guard against here.
-    """
-    raw_results = search_all_memories(conn, query=q, k=50) if q else []
-    # Hydrate display fields per row. We do this in the route (not the
-    # service) so the service stays a pure FTS shim that other UIs
-    # can reuse.
+    Hydration (T106) is batched: rather than calling ``get_bot`` /
+    ``get_chat`` / ``get_scene`` per row (worst case 3 * k individual
+    SELECTs), we collect distinct ids and issue one ``IN (...)`` query
+    per entity kind, then map back during the row build. ``get_bot``
+    et al. remain imported for test-time monkeypatching but are no
+    longer invoked on the hot path.
+    """
+    raw_results = (
+        search_all_memories(conn, query=q, k=DEFAULT_SEARCH_K) if q else []
+    )
    # Collect distinct ids up front so the IN-list queries dedupe (a
    # popular bot or scene shows up many times across the result set).
    bot_ids: set[str] = {r["owner_id"] for r in raw_results if r["owner_id"]}
    chat_ids: set[str] = {r["chat_id"] for r in raw_results if r["chat_id"]}
    scene_ids: set[int] = {
        r["scene_id"] for r in raw_results if r["scene_id"]
    }
    bots_by_id = _fetch_bots_by_ids(conn, bot_ids)
    chats_by_id = _fetch_chats_by_ids(conn, chat_ids)
    scenes_by_id = _fetch_scenes_by_ids(conn, scene_ids)
    # Hydrate display fields per row from the batched maps. We do this
    # in the route (not the service) so the service stays a pure FTS
    # shim that other UIs can reuse.
    results = []
    for row in raw_results:
-        bot = get_bot(conn, row["owner_id"])
+        bot = bots_by_id.get(row["owner_id"])
-        chat = get_chat(conn, row["chat_id"])
+        chat = chats_by_id.get(row["chat_id"])
-        scene = get_scene(conn, row["scene_id"]) if row["scene_id"] else None
+        scene = (
            scenes_by_id.get(row["scene_id"]) if row["scene_id"] else None
        )
        results.append(
            {
                "memory_id": row["memory_id"],
@@ -69,6 +193,13 @@ async def search(request: Request, q: str = "", conn=Depends(get_conn)):
                    chat.get("narrative_anchor") if chat else None
                ),
                "scene_id": row["scene_id"],
                # T111.2: event_id deep-links to the originating turn
                # via the ``id="turn-{event_id}"`` anchor that Phase 3.5
                # T86 stamps on each turn DOM node. May be ``None`` for
                # memory rows projected before the 0014 migration ran
                # (T109 did not backfill historical rows); the template
                # falls back to a chat-level link in that case.
                "event_id": row["event_id"],
                # Scenes have no ``title`` column today; surface the
                # ``started_at`` timestamp as a human-friendly label
                # when a scene is set, otherwise leave it blank.
@@ -76,6 +207,14 @@ async def search(request: Request, q: str = "", conn=Depends(get_conn)):
                    scene.get("started_at") if scene else None
                ),
                "pov_summary": row["pov_summary"],
                # T111.1: ``snippet`` is the FTS5 windowed excerpt with
                # ``<mark>`` tags around each match. Falls back to the
                # full ``pov_summary`` if the row lacks a snippet (which
                # shouldn't happen on this code path because every
                # ``raw_results`` row came from a MATCH query, but we
                # guard defensively so the template never renders
                # ``None``).
                "snippet": row.get("snippet") or row["pov_summary"],
                "significance": row["significance"],
                "ts": row["ts"],
            }
@@ -8,20 +8,27 @@ Routes:
 * ``GET  /snapshots``                    list all snapshots (both kinds)
 * ``POST /snapshots/take``               take a periodic snapshot now
-* ``POST /snapshots/restore/{id}``       restore (requires matching ``confirm_id``)
+* ``POST /snapshots/restore/{id}``       restore (requires matching ``confirm_id`` and ``kind``)
 * ``GET  /snapshots/{id}/preview``       show metadata + delta vs current
 The ``snapshot_id`` is the filename stem (the UTC timestamp written by
 :func:`chat.services.snapshot.take_snapshot`) — there's no separate UUID,
 and the timestamp filename is already unique per snapshot kind. Both
 periodic and rewind snapshots share the same id space lookup-wise, so
-the restore + preview routes accept ``kind`` as a form/query param to
+the restore + preview routes require ``kind`` as a form/query param to
-disambiguate.
+disambiguate (a missing/empty ``kind`` is a 400, not a silent default).
 Note on ``created_at`` mtime drift: the listing's ``created_at`` comes
 from the file's mtime, not the encoded filename timestamp. ``cp -p``
 preserves mtime, but plain ``cp`` resets it to "now" — so a copied
 snapshot can show a misleading ``created_at`` while its filename still
 reflects the original UTC capture time.
 """
 from __future__ import annotations
 import json
 from datetime import datetime, timezone
 from pathlib import Path
 from fastapi import APIRouter, Depends, Form, HTTPException, Request
@@ -52,8 +59,6 @@ def _list_all_snapshots(data_dir: Path) -> list[dict]:
    ``last_event_id`` (parsed from the JSON body — small enough that
    listing isn't a performance concern for the handful of files we keep).
    """
-    from datetime import datetime, timezone
    rows: list[dict] = []
    for kind in SNAPSHOT_KINDS:
        snap_dir = data_dir / "snapshots" / kind
@@ -85,12 +90,26 @@ def _list_all_snapshots(data_dir: Path) -> list[dict]:
    return rows
 def _require_kind(kind: str) -> str:
    """Reject missing/empty/unknown ``kind`` with 400.
    Defaulting silently to ``"periodic"`` made rewind-snapshot lookups
    appear as 404s, which is confusing — make the client always state
    the kind explicitly.
    """
    if not kind or kind not in SNAPSHOT_KINDS:
        raise HTTPException(
            status_code=400,
            detail=f"kind must be one of {SNAPSHOT_KINDS}",
        )
    return kind
 def _resolve_snapshot_path(
    data_dir: Path, snapshot_id: str, kind: str
 ) -> Path:
    """Map an ``(id, kind)`` pair to the on-disk file, or 404."""
-    if kind not in SNAPSHOT_KINDS:
-        raise HTTPException(status_code=400, detail=f"unknown kind: {kind}")
+    _require_kind(kind)
    path = data_dir / "snapshots" / kind / f"{snapshot_id}.json"
    if not path.exists():
        raise HTTPException(status_code=404, detail="snapshot not found")
@@ -127,7 +146,7 @@ async def snapshots_restore(
    snapshot_id: str,
    request: Request,
    confirm_id: str = Form(""),
-    kind: str = Form("periodic"),
+    kind: str = Form(""),
    conn=Depends(get_conn),
 ):
    """Hard-confirm restore: ``confirm_id`` must equal the path id.
@@ -135,7 +154,11 @@ async def snapshots_restore(
    Mismatched confirm → 400 (without touching the DB). On match, the
    existing :func:`restore_from_snapshot` clears projected tables and
    re-loads them from the dump.
    ``kind`` is required (must be ``"periodic"`` or ``"rewind"``) — a
    missing or empty value 400s rather than silently defaulting.
    """
    _require_kind(kind)
    if confirm_id != snapshot_id:
        raise HTTPException(
            status_code=400,
@@ -151,7 +174,7 @@ async def snapshots_restore(
 async def snapshots_preview(
    snapshot_id: str,
    request: Request,
-    kind: str = "periodic",
+    kind: str = "",
    conn=Depends(get_conn),
 ):
    """Show snapshot metadata + a basic delta against the current event log.
@@ -159,7 +182,10 @@ async def snapshots_preview(
    Phase 4 keeps this simple: the snapshot's ``last_event_id`` plus the
    current ``MAX(event_log.id)`` is enough to tell the user how far the
    log has moved on. A richer per-table diff is a Phase 4.5+ concern.
    ``kind`` is required — see :func:`snapshots_restore`.
    """
    _require_kind(kind)
    settings = request.app.state.settings
    path = _resolve_snapshot_path(settings.data_dir, snapshot_id, kind)
    dump = json.loads(path.read_text())
@@ -812,6 +812,14 @@ async def post_turn(
                    payload={
                        "event_id": transition.event_id,
                        "started_at": chat.get("time"),
                        # T114.1: back-reference to the assistant_turn that
                        # triggered this transition. Regenerate uses this
                        # to roll back lifecycle transitions when the turn
                        # is superseded. Forward-only — older events
                        # without this field are skipped by rollback.
                        "triggered_by_assistant_turn_id": (
                            primary_assistant_event_id
                        ),
                    },
                )
            elif transition.new_status == "completed":
@@ -821,6 +829,10 @@ async def post_turn(
                    payload={
                        "event_id": transition.event_id,
                        "completed_at": chat.get("time"),
                        # T114.1: back-reference (see above).
                        "triggered_by_assistant_turn_id": (
                            primary_assistant_event_id
                        ),
                    },
                )
                # Run promotion inline so the artifact-emitting events
@@ -842,6 +854,10 @@ async def post_turn(
                    payload={
                        "event_id": transition.event_id,
                        "completed_at": chat.get("time"),
                        # T114.1: back-reference (see above).
                        "triggered_by_assistant_turn_id": (
                            primary_assistant_event_id
                        ),
                    },
                )
            # Any other ``new_status`` value falls through silently —
@@ -873,6 +889,20 @@ async def post_turn(
    # mid-stream still meant to close the scene — the cancelled bot
    # beat doesn't invalidate that intent. Pinned by
    # test_cancelled_turn_still_closes_scene_when_user_prose_signals_close.
    #
    # T108 NOTE — the in-memory append order is correct, but the cancel
    # path re-raises ``CancelledError`` at the end of ``post_turn``
    # (see step 11 below). The ``open_db`` dependency teardown skips
    # ``conn.commit()`` when the consumer raises, which means in
    # production a genuine cancel currently rolls back ALL post-cancel
    # writes — including this scene_closed event, the truncated
    # assistant_turn record, edge updates, and per-POV summaries. The
    # T74.3 regression test passes only because of a missing
    # ``import asyncio`` in the test module: the inline mock raises
    # ``NameError`` instead of ``CancelledError``, which is caught by
    # the ``except Exception:`` branch and leaves ``cancelled=False``,
    # so the function returns 204 normally and the commit fires. This
    # is a transactional bug deferred for triage (T108 report).
    if scene is not None and prose.strip():
        container = None
        if scene.get("container_id") is not None:
@@ -522,6 +522,8 @@ Written per witness when a scene closes. Different details, different interpreta
 **Status: shipped 2026-04-27** (T88–T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
 **Phase 4.5 cleanup: shipped 2026-04-27** (T103–T118, 13 of 14 planned tasks; T115 sqlite-vec swap deferred to Phase 5 due to host Python lacking `enable_load_extension`; +~44 tests; schema baseline now 14). See "Phase 4.5 status" in CLAUDE.md for the per-task breakdown — notable shipped: real embedding model swap path (`LLMClient.embed()` + `--re-embed-all`), branching read-side filter (`active_branch_event_ids`), regenerate lifecycle rollback (`event_status_reverted`), FTS snippet highlighting + deep-link to turn (`memories.event_id`), bulk significance re-rate.
 - Vector retrieval (sqlite-vss or sqlite-vec).
 - Branching UI.
 - Drawer-edit on every field.
@@ -0,0 +1,724 @@
 # Roleplay Engine — Phase 4.5 Cleanup Plan
 > **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task. Use the parallel-dispatch pattern documented under "Parallel-Execution Strategy" for parallel waves.
 **Goal:** Burn down all 24 items in `CLAUDE.md` §"Phase 4.5 / 5 backlog". Mix of small defensive cleanups (most), three big features (real embedding model swap, branching read-side filter, lifecycle rollback in regenerate), one environment-dependent feature (sqlite-vec swap), and the long-deferred carry-overs (scene-close-on-cancel revisit, structured test-fixture builder).
 **Architecture:** No new architecture. Two new schema migrations (0014 schema polish, 0015 sqlite-vec virtual tables). One optional new external dependency (`apsw`, needed only if rebuilding Python isn't possible). All other changes are polish / refactor / observability.
 **Tech Stack:**
 - Existing — same as Phase 4.
 - **OPTIONAL:** rebuild Python with `--enable-loadable-sqlite-extensions` OR install `apsw` to enable T115 sqlite-vec swap. T115 is the only task that requires this; the other 13 tasks land without it. If neither is available, T115 is deferred to Phase 5.
 **Source-of-truth references:**
 - Backlog: [`CLAUDE.md`](../../CLAUDE.md) §"Phase 4.5 / 5 backlog" (24 items grouped by review source + deferred).
 - Phase 3.5 / Phase 2.5 cleanup plans (pattern reference): [2026-04-26-v3.5-phase3.5-cleanup.md](2026-04-26-v3.5-phase3.5-cleanup.md), [2026-04-26-v2.5-phase2.5-cleanup.md](2026-04-26-v2.5-phase2.5-cleanup.md).
 - Conventions: [`CLAUDE.md`](../../CLAUDE.md) §"Behavioral defaults" + §"Phase 4 status".
 ---
 ## Pre-flight
 **Branch:** create `phase-4.5` from the latest `main`:
 ```bash
 git checkout main && git pull && git checkout -b phase-4.5
 ```
 **Schema baseline:** Phase 4 leaves the DB at version 13. Phase 4.5 adds two migrations: `0014_phase45_schema.sql` (T109) and `0015_vec0_virtual_tables.sql` (T115 — only lands if T115 ships). Final schema version: 14 or 15.
 **Optional pre-flight for T115 (sqlite-vec swap):**
 The host Python build needs `enable_load_extension`. Two options:
 1. **Rebuild Python** via pyenv with `PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" pyenv install 3.12.0 --force` and recreate the venv.
 2. **Add `apsw`** as a dependency and migrate `chat/db/connection.py` to use `apsw.Connection` (significant refactor — the entire codebase uses stdlib `sqlite3`).
 If neither is acceptable, **defer T115** to Phase 5 and ship Phase 4.5 with 13 tasks instead of 14. The other tasks are unaffected.
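
Whether option 1 is already satisfied can be checked before planning T115. The following is a minimal sketch, assuming only the stdlib `sqlite3` module; the function name `loadable_extensions_supported` is hypothetical and not part of the codebase.

```python
import sqlite3


def loadable_extensions_supported() -> bool:
    """Return True if this Python's sqlite3 module can load extensions.

    On builds compiled without loadable-extension support, the
    ``enable_load_extension`` method is missing from ``Connection``.
    """
    if not hasattr(sqlite3.Connection, "enable_load_extension"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        # Toggle the flag on and straight back off; nothing is loaded.
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()
    return True


if __name__ == "__main__":
    print("T115 viable on this interpreter:", loadable_extensions_supported())
```

A `False` result means T115 needs one of the two options above, or a deferral.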
 **Pinned non-negotiables (carried forward):**
 - State changes go through the event log. Use `append_and_apply` for the live path.
 - Witness-filter every memory read at the SQL level.
 - TDD: every task starts with a failing test (or a regression test pinning existing contract before refactor).
 - One commit per task minimum. Bundled tasks split internally.
 **Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion` — run the test command, paste actual output.
 ---
 ## Backlog item → task mapping
 24 items consolidated into 14 tasks by **file ownership**:
 | # | Item | Source | Task |
 |---|------|--------|------|
 | 1 | `embeddings` FK lacks `ON DELETE CASCADE` | T88 | **T109** (schema migration) |
 | 2 | `list_branches(chat_id=...)` global-branch leak — document | T89 | **T103** |
 | 3 | Branch-switch silently leaves zero active — log warning | T89 | **T103** |
 | 4 | Real embedding model swap | T91 / deferred | **T112** |
 | 5 | `timeout_s` fallback-path logging | T91 | **T107** |
 | 6 | Duplicate `MAX(id)` lookup in retrieval ranking | T96 | **T104** |
 | 7 | `fts_rank=None` for vector-only rows — document | T96 | **T104** |
 | 8 | `event_id <= 0` guard in `delete_turn` | T98 | **T110** |
 | 9 | `html.escape()` on delete-impact modal output | T98 | **T110** |
 | 10 | Extract delete-impact modal to Jinja partial | T98 | **T110** |
 | 11 | Hoist `datetime`/`timezone` imports in `snapshots.py` | T99 | **T105** |
 | 12 | Strict `kind` validation in snapshot routes | T99 | **T105** |
 | 13 | `created_at` from file mtime — document drift risk | T99 | **T105** |
 | 14 | Hardcoded `k=50` → module constant | T100 | **T106** |
 | 15 | N+1 lookups in search results | T100 | **T106** |
 | 16 | FTS highlighting via `snippet()` | T100 | **T111** |
 | 17 | Result links chat-level only — add deep-link via memories.event_id | T100 | **T109** + **T111** |
 | 18 | sqlite-vec swap when host Python supports loadable extensions | deferred | **T115** |
 | 19 | Branching read-side filter (consult `is_active`) | deferred | **T113** |
 | 20 | Bulk significance re-rate in drawer | deferred | **T110** |
 | 21 | Vector index optimization (HNSW) | deferred | **T115** (post-ship note) |
 | 22 | Scene-close-on-cancel UX revisit | Phase 2.5 carry-over | **T108** |
 | 23 | Cross-feature canned-queue brittleness fixture builder | Phase 3 carry-over | **T116** |
 | 24 | Full lifecycle-rollback in regenerate | Phase 3.5 carry-over | **T114** |
 ---
 ## Parallel-Execution Strategy
 Same pattern as Phase 3.5 / Phase 2.5 / Phase 4. Nine waves: parallel within each wave (file-disjoint), serial across waves.
 ### How to dispatch a wave in parallel
 Use the **Agent tool with `isolation: "worktree"`**. (If the controlling session's working directory is **not** the chat repo, create worktrees manually with `git worktree add .worktrees/<wave>-<task> -b <wave>/<task> phase-4.5`.)
 ### After a wave completes
 1. Each subagent returns its worktree path and commit SHA(s).
 2. **Run a spec + code-quality reviewer subagent on each completed task.** Combined review acceptable for trivial tasks (T103–T108); separate spec + quality reviewers for big tasks (T112, T113, T114, T115).
 3. **Merge the wave into `phase-4.5`** in any order (file-disjointness guarantees no conflict). Use `--no-ff`.
 4. **Run the full test suite** on the merged `phase-4.5`.
 5. **Push `phase-4.5`** to gitea.
 6. Optionally clean up worktrees.
 ### Conflict prevention checklist
 For each parallel wave, verify the **Files** sections of all tasks have **no overlapping paths**. Hot files in this plan (each owned by exactly one task): `chat/state/memory.py`, `chat/web/drawer.py`, `chat/web/search.py`, `chat/services/regenerate.py`, `chat/services/turn_common.py`, `chat/services/embeddings.py`, `chat/db/migrations/`.
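
The checklist can be mechanized. This is a rough sketch, assuming each task's **Files** section has been transcribed by hand into a set of paths; `overlapping_paths` is a hypothetical helper, and it compares exact paths only, so directory-level ownership such as `chat/db/migrations/` still needs a manual look.

```python
from itertools import combinations


def overlapping_paths(
    wave: dict[str, set[str]],
) -> dict[tuple[str, str], set[str]]:
    """Return {(task_a, task_b): shared_paths} for each overlapping pair."""
    conflicts: dict[tuple[str, str], set[str]] = {}
    for (a, files_a), (b, files_b) in combinations(sorted(wave.items()), 2):
        shared = files_a & files_b
        if shared:
            conflicts[(a, b)] = shared
    return conflicts


# Partial Wave 1 file lists, copied from the task specs in this plan.
wave_1 = {
    "T103": {"chat/state/branches.py", "tests/test_branches_state.py"},
    "T104": {"chat/state/memory.py", "tests/test_memory_search.py"},
    "T105": {"chat/web/snapshots.py", "tests/test_snapshot_ux.py"},
    "T106": {"chat/web/search.py", "tests/test_search_ux.py"},
}

if __name__ == "__main__":
    print(overlapping_paths(wave_1) or "wave is file-disjoint")
```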
 ### Why each wave is parallel-safe
 | Wave | Tasks | Hot files | Disjoint? |
 |------|-------|-----------|-----------|
 | 1 | T103, T104, T105, T106, T107, T108 | 6 different files; no overlap | ✅ |
 | 2 | T109 | new migration + minor projector update | (single task) |
 | 3 | T110 | `chat/web/drawer.py` (bundle) | (single task) |
 | 4 | T111 | `chat/services/cross_chat_search.py` + `chat/web/search.py` + template | (single task; depends on T109) |
 | 5 | T112 | `chat/services/embeddings.py` + `chat/llm/*.py` (Protocol + Featherless + Mock) | (single task) |
 | 6 | T113 | `chat/services/turn_common.py` + multiple readers (cross-cutting) | (single task) |
 | 7 | T114 | `chat/services/regenerate.py` + projector handler | (single task) |
 | 8 | T115 | new migration + `chat/services/vector_search.py` + `chat/db/connection.py` | (single task; environmental) |
 | 9 | T116, T117, T118 | new test fixture file (T116); new test file (T117); CLAUDE.md (T118) | ✅ |
 ---
 ## Task overview
 ```
 Wave 1 ─┬─ T103: branches polish (global-branch doc + branch-switch warning)
        ├─ T104: state/memory.py polish (DRY MAX(id) + fts_rank doc)
        ├─ T105: snapshots.py polish (datetime hoist + kind validation + mtime doc)
        ├─ T106: search.py polish (k constant + N+1 batched lookups)
        ├─ T107: embeddings.py timeout_s fallback-path logging
        └─ T108: scene-close-on-cancel UX revisit (pin behavior with regression test)
 Wave 2 ─── T109: 0014 schema migration (FK CASCADE + memories.event_id column)
 Wave 3 ─── T110: drawer Phase 4.5 bundle (event_id guard + html.escape + modal partial + bulk sig re-rate)
 Wave 4 ─── T111: search UX enhancements (FTS snippet() highlighting + deep-link via memories.event_id)
 Wave 5 ─── T112: real embedding model swap (LLMClient.embed protocol + Featherless impl + generate_embedding routing + backfill)
 Wave 6 ─── T113: branching read-side filter (event readers consult is_active branch range)
 Wave 7 ─── T114: regenerate lifecycle rollback (back-reference field + compensating events on supersede)
 Wave 8 ─── T115: sqlite-vec swap (vec0 virtual tables + MATCH-based vector_search) [ENVIRONMENTAL — see pre-flight]
 Wave 9 ─┬─ T116: structured test-fixture builder (canned-queue brittleness)
        ├─ T117: Phase 4.5 cross-feature integration tests
        └─ T118: docs sweep — Phase 4.5 status, prune backlog, capture Phase 5 residuals
 ```
 Critical path: 9 sequential merge points. Backlog tasks: 14 (T103–T116), or 13 if T115 is deferred; T117 (integration tests) and T118 (docs sweep) come on top of those. Parallelism: Waves 1 (6-way) and 9 (3-way) dispatch concurrently. Waves 2–8 are single-task by hot-file constraint.
 ---
 ## Wave 1 — Independent small fixes (parallel, 6 tasks)
 All trivial and file-disjoint. Each is roughly a one-line change plus one test.
 ### Task 103: branches polish
 **Files:**
 - Modify: `chat/state/branches.py`
 - Modify: `tests/test_branches_state.py`
 **Spec (2 sub-fixes, single commit):**
 1. **Document global-branch leak**: `list_branches(chat_id=...)` filter `chat_id = ? OR chat_id IS NULL` returns global/null-chat branches (like "main") in every chat scope. Add a docstring note explaining this is intentional ("main" is global by design; per-chat branches are scoped).
 2. **Warn on branch-switch to nonexistent name**: in `_apply_branch_switched`, before the SQL UPDATE, check if a branch with the given name exists. If not, emit `logging.getLogger(__name__).warning(...)` rather than silently leaving zero active branches.
 **Test:** `test_branch_switched_unknown_name_warns` — capture log via `caplog`, append `branch_switched` for nonexistent name, assert warning message + no active branch (existing behavior preserved, just observable).
 **Commit:** `chore: branches polish — global-leak docs + unknown-name warning (T103)`.
 ---
 ### Task 104: state/memory.py polish
 **Files:**
 - Modify: `chat/state/memory.py`
 - Modify: `tests/test_memory_search.py` (no new tests; just add docstring assertions if needed)
 **Spec (2 sub-fixes):**
 1. **DRY `MAX(id)` lookup**: `_composite_rerank` (Phase 3.5 T57) and `_rrf_fuse_and_rerank` (Phase 4 T96) both query `SELECT MAX(id) FROM event_log` for the recency boost. Extract a `_max_event_id(conn)` helper.
 2. **`fts_rank=None` documentation**: search_memories docstring should note that vector-only rows have `fts_rank=None`. Downstream consumers must accept None (they currently do, but contract is implicit).
 **Test:** existing tests cover both via the public API; no new test needed unless docstring assertion is desired.
 **Commit:** `chore: memory.py DRY MAX(id) helper + document fts_rank=None contract (T104)`.
 ---
 ### Task 105: snapshots.py polish
 **Files:**
 - Modify: `chat/web/snapshots.py`
 - Modify: `tests/test_snapshot_ux.py` (1 new test)
 **Spec (3 sub-fixes):**
 1. **Hoist `datetime`/`timezone` imports** to module level (currently inside `_list_all_snapshots`).
 2. **Strict `kind` validation in restore/preview routes**: currently `kind` defaults to `"periodic"`. If a rewind snapshot is requested without explicit `kind`, the lookup silently 404s. Reject missing `kind` with a 400 instead of silently defaulting.
 3. **Document `created_at` mtime drift risk** in module docstring: snapshot timestamps come from file mtime, not the encoded filename timestamp. Files copied via `cp -p` preserve mtime; `cp` without `-p` resets it. Add a one-line note.
 **Test:** `test_restore_without_kind_returns_400` — POST `/snapshots/restore/<id>` without `kind`; assert 400.
 **Commit:** `chore: snapshots.py polish — hoisted imports + strict kind + mtime doc (T105)`.
 ---
 ### Task 106: search.py polish
 **Files:**
 - Modify: `chat/web/search.py`
 - Modify: `tests/test_search_ux.py` (1 new test)
 **Spec (2 sub-fixes):**
 1. **Hardcoded `k=50` → module constant**: extract `DEFAULT_SEARCH_K = 50` at module level. Tunable without code change at the call site.
 2. **N+1 lookup batching**: GET `/search?q=...` currently calls `get_bot(conn, owner_id)`, `get_chat(conn, chat_id)`, `get_scene(conn, scene_id)` per result row (worst case 50×3 = 150 individual queries). Batch via `WHERE id IN (...)` queries: collect distinct ids first, fetch in 3 batched queries, then map back per row.
 **Test:** `test_search_results_use_batched_lookups`: mock `get_bot`/`get_chat`/`get_scene` and assert each is called at most once (never once per row). Alternatively, time the search with 50 results and assert it doesn't degrade linearly with `k`.
 **Commit:** `perf: search.py N+1 batching + k constant extraction (T106)`.
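
One deterministic way to pin the batching is to count executed SQL statements rather than mock helpers or measure time. The sketch below uses the stdlib `sqlite3.Connection.set_trace_callback`; the table layout and the `fetch_names_*` helpers are stand-ins for illustration, not the real `search.py` code.

```python
import sqlite3
from typing import Callable


def count_statements(conn: sqlite3.Connection, fn: Callable[[], object]) -> int:
    """Run ``fn`` and return how many SQL statements the connection executed."""
    seen: list[str] = []
    conn.set_trace_callback(seen.append)
    try:
        fn()
    finally:
        conn.set_trace_callback(None)
    return len(seen)


def fetch_names_per_row(conn: sqlite3.Connection, ids: set[str]) -> dict[str, str]:
    """N+1 shape: one SELECT per id."""
    return {
        i: conn.execute("SELECT name FROM bots WHERE id = ?", (i,)).fetchone()[0]
        for i in ids
    }


def fetch_names_batched(conn: sqlite3.Connection, ids: set[str]) -> dict[str, str]:
    """Batched shape: a single ``IN (...)`` query for all ids."""
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT id, name FROM bots WHERE id IN ({placeholders})", tuple(ids)
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bots (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO bots VALUES (?, ?)", [("a", "Ann"), ("b", "Bo"), ("c", "Cy")]
)
conn.commit()
ids = {"a", "b", "c"}
per_row = count_statements(conn, lambda: fetch_names_per_row(conn, ids))
batched = count_statements(conn, lambda: fetch_names_batched(conn, ids))
print("per-row statements:", per_row, "| batched statements:", batched)
```

A test built this way fails as soon as a per-row query creeps back in, without depending on wall-clock timing.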
 ---
 ### Task 107: embeddings.py timeout_s fallback-path logging
 **Files:**
 - Modify: `chat/services/embeddings.py`
 - Modify: `tests/test_embeddings.py` (1 new test)
 **Spec:**
 When `model != DEFAULT_EMBEDDING_MODEL` and the call falls through to the fallback (zero-vector with model="fallback"), log a `warning` so misconfigured callers (e.g., a Phase 4.5+ caller pointing at a real model that doesn't exist) don't silently degrade.
 ```python
 if model != DEFAULT_EMBEDDING_MODEL:
    _log.warning(
        "generate_embedding: non-default model %r returned fallback "
        "(model client.embed() not yet implemented in Phase 4.5+); "
        "downstream search will degrade silently. Configure a supported model.",
        model,
    )
    return EmbeddingResult(...)  # fallback
 ```
 The Phase 4 default path (`model == DEFAULT_EMBEDDING_MODEL` → pseudo-embedding) is silent; only non-default models trigger the warning.
 **Test:** `test_generate_embedding_non_default_model_logs_warning` — call with `model="real-model"`; capture log via `caplog`; assert the warning message appears.
 **Commit:** `chore: embeddings.py warns on fallback for non-default models (T107)`.
 ---
 ### Task 108: scene-close-on-cancel UX revisit
 **Files:**
 - Modify: `tests/test_turn_flow.py` (extend the existing pin test added in Phase 2.5 T74.3 OR add a new one)
 - Optionally modify: `chat/web/turns.py` if a real bug surfaces during investigation
 **Spec:**
 This carry-over has been pending since Phase 2.5 T74.3. The pinned behavior: scene close fires even when the primary turn is cancelled mid-stream, because `detect_scene_close` consults user prose (fully present at cancel time), not bot output.
 **Action:**
 1. **Re-investigate** by reading the post_turn cancellation path. Confirm the rationale still holds (it should — nothing about the close-detection logic changed in Phase 3 or 4).
 2. **Strengthen the regression test** in `tests/test_turn_flow.py` (the existing `test_cancelled_turn_still_closes_scene_when_user_prose_signals_close`). Add an assertion that the user prose IS present at the moment scene_close_decision fires (even though the bot output isn't).
 3. If investigation surfaces an actual UX issue (e.g., the close fires too eagerly on prose like "fade out... actually wait"), this becomes a real fix — but default action is documentation-only.
 **Default outcome:** add a docstring comment to the post_turn close-detection branch explaining the rationale. No behavioral change.
 **Test (extend existing):** assert ordering — `scene_closed` event lands AFTER the user_turn event but BEFORE any potential assistant_turn (which is cancelled). Pin the contract.
 **Commit:** `chore: scene-close-on-cancel — strengthen regression test + document rationale (T108)`.
 ---
 ## Wave 2 — Schema migration (single)
 ### Task 109: 0014 schema migration
 **Files:**
 - Create: `chat/db/migrations/0014_phase45_schema.sql`
 - Modify: `chat/state/memory.py` or `chat/services/memory_write.py` (populate the new `event_id` column on memory_written)
 - Modify: `tests/test_world.py` (bump schema_version assertion to 14)
 - Modify: `tests/test_memory_write.py` (assert event_id populated)
 **Spec:**
 Two schema changes bundled into a single migration:
 1. **`embeddings.memory_id` FK gets `ON DELETE CASCADE`** (T88 review nit). SQLite's `ALTER TABLE` cannot change an existing column's constraints, so the standard pattern is a table rebuild: create the new table, copy the data, drop the old table, rename the new one into place, then recreate indices. Alternatively, since this is a new-ish table (Phase 4 added it) and the change is purely defensive, document as "WONTFIX in 4.5; deindex events remain the only deletion path; ON DELETE CASCADE remains a Phase 5 candidate when we do a broader migration cleanup". Choose pragmatically.
 2. **Add `memories.event_id INTEGER` column** (NULL allowed for backward compat) referencing `event_log.id`. This is the foundation for T111's deep-linking from cross-chat search results to specific turns. Migration adds the column; the projector for `memory_written` populates it from the event id when projecting.
 **Production code change:** in the `memory_written` projector handler (in `chat/state/memory.py` or wherever it lives), populate the new `event_id` column with the projecting event's `id`. The `Event` object has `id` available in the projector context.
 **Tests:**
 1. `test_schema_version_after_migration_is_14` (rename + bump from 13).
 2. `test_memory_written_populates_event_id` — append memory_written; project; query memories table; assert `event_id` is the projecting event's id.
 3. (Backward compat) older memories from existing seed data have NULL `event_id` — the column is nullable.
 **Commit:** `feat: 0014 schema — embeddings FK CASCADE (deferred or applied) + memories.event_id column (T109)`.
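
If the CASCADE change is applied rather than deferred, the migration could follow the rebuild order SQLite's `ALTER TABLE` documentation recommends (create new, copy, drop old, rename new). The sketch below is illustrative only: the `embeddings` column list (`vector_json`) and the stand-in tables are assumptions, not the real schema.

```python
import sqlite3

# Hypothetical body for 0014_phase45_schema.sql; column names are assumed.
MIGRATION_0014 = """
CREATE TABLE embeddings_new (
    memory_id INTEGER PRIMARY KEY
        REFERENCES memories(id) ON DELETE CASCADE,
    vector_json TEXT NOT NULL
);
INSERT INTO embeddings_new (memory_id, vector_json)
    SELECT memory_id, vector_json FROM embeddings;
DROP TABLE embeddings;
ALTER TABLE embeddings_new RENAME TO embeddings;
ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id);
"""


def migrate(conn: sqlite3.Connection) -> None:
    """Apply the rebuild with FK enforcement off, then switch it back on."""
    conn.commit()  # PRAGMA foreign_keys is a no-op inside a transaction
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executescript(MIGRATION_0014)
    conn.execute("PRAGMA foreign_keys = ON")


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE event_log (id INTEGER PRIMARY KEY);
    CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT);
    CREATE TABLE embeddings (
        memory_id INTEGER PRIMARY KEY REFERENCES memories(id),
        vector_json TEXT NOT NULL
    );
    INSERT INTO memories VALUES (1, 'first'), (2, 'second');
    INSERT INTO embeddings VALUES (1, '[0.1]'), (2, '[0.2]');
    """
)
migrate(conn)
conn.execute("DELETE FROM memories WHERE id = 1")  # should cascade
remaining = [row[0] for row in conn.execute("SELECT memory_id FROM embeddings")]
print("embedding rows left:", remaining)
```

Existing `memories` rows keep a NULL `event_id` after the `ADD COLUMN`, which matches the backward-compat note in the spec.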
 ---
 ## Wave 3 — Drawer Phase 4.5 bundle (single)
 ### Task 110: drawer polish + bulk significance re-rate
 **Files:**
 - Modify: `chat/web/drawer.py`
 - Modify: `chat/templates/_drawer.html`
 - Create: `chat/templates/_delete_impact_modal.html` (extracted partial)
 - Modify: `chat/state/manual_edit.py` (potentially — if bulk re-rate emits a new manual_edit kind)
 - Modify: `tests/test_drawer_phase4.py` (extend with 4-5 new tests)
 **Spec (4 sub-fixes, 4 commits):**
 1. **`event_id <= 0` guard in `delete_turn`** (T98 nit): currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: raise HTTPException(400, "...")`.
 2. **`html.escape()` on delete-impact modal** (T98 nit): the HTML returned by `compute_delete_impact` is built via raw f-strings from model-controlled strings. Wrap user-controllable fields with `html.escape()`. This is defense-in-depth: the output is currently safe, but if event payload fields ever appear in descriptions, escaping them prevents XSS.
 3. **Extract delete-impact modal HTML to a Jinja partial**: create `chat/templates/_delete_impact_modal.html`; render via `templates.TemplateResponse(...)` instead of f-string concatenation. Inherits Jinja2 autoescape automatically. Tests use the existing TestClient pattern.
 4. **Bulk significance re-rate** (T98.2 deferral): drawer panel showing memory significance distribution per chat. New POST route `/chats/{chat_id}/drawer/memory/significance/bulk` accepting `{level_from, level_to}` form fields. Updates ALL memories in the chat at `level_from` to `level_to` via a sequence of `manual_edit` events (one per memory — preserves the audit trail).
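The bulk re-rate in sub-fix 4 could look roughly like the sketch below. The `memories` columns, the `manual_edit` payload shape, and the `emit` callback are all assumptions for illustration. In the real code the projector would apply the significance change when the event is applied, rather than the inline `UPDATE` used here.

```python
import sqlite3


def bulk_rerate(conn, chat_id, level_from, level_to, emit):
    """Re-rate every memory in one chat; `emit` gets one payload per memory."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM memories "
            "WHERE chat_id = ? AND significance = ? ORDER BY id",
            (chat_id, level_from),
        )
    ]
    for memory_id in ids:
        # One event per memory keeps the audit trail granular.
        emit({
            "kind": "manual_edit",
            "target": "memory_significance",  # hypothetical edit kind
            "memory_id": memory_id,
            "from": level_from,
            "to": level_to,
        })
        conn.execute(
            "UPDATE memories SET significance = ? WHERE id = ?",
            (level_to, memory_id),
        )
    return len(ids)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories "
    "(id INTEGER PRIMARY KEY, chat_id TEXT, significance INTEGER)"
)
conn.executemany(
    "INSERT INTO memories (chat_id, significance) VALUES (?, ?)",
    [("chat-1", 0)] * 5 + [("chat-1", 1), ("chat-2", 0)],
)
events = []
changed = bulk_rerate(conn, "chat-1", 0, 2, events.append)
untouched = conn.execute(
    "SELECT COUNT(*) FROM memories WHERE chat_id = 'chat-2' AND significance = 0"
).fetchone()[0]
```

Collecting the ids before emitting avoids iterating a cursor while the same table is being updated.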
 **Tests:**
 1. `test_delete_turn_with_event_id_zero_returns_400`.
 2. `test_delete_impact_modal_uses_jinja_partial` (assert response renders the partial template; verify with `assert b"<div class=\"delete-impact-modal\">" in response.content` or similar).
 3. `test_delete_impact_modal_escapes_user_controllable_strings` — seed an event with a payload containing `<script>` in a description-bound field; render preview; assert it appears HTML-escaped.
 4. `test_bulk_significance_re_rate_emits_manual_edit_per_memory` — seed 5 memories at significance 0; bulk re-rate to 2; assert 5 `manual_edit` events landed.
 **Commits (4):**
 - `fix: drawer delete_turn guards event_id <= 0 (T110.1)`
 - `fix: drawer delete-impact modal HTML escapes user-controllable fields (T110.2)`
 - `refactor: drawer delete-impact modal extracted to Jinja partial (T110.3)`
 - `feat: drawer bulk significance re-rate per chat (T110.4)`
 ---
 ## Wave 4 — Search UX enhancements (single)
 ### Task 111: FTS highlighting + deep-link to turn
 **Files:**
 - Modify: `chat/services/cross_chat_search.py`
 - Modify: `chat/web/search.py`
 - Modify: `chat/templates/search.html`
 - Modify: `tests/test_search_ux.py`
 **Spec (2 sub-fixes, 2 commits):**
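 1. **FTS highlighting via `snippet()`** (T100 nit): replace the `pov_summary` column in `search_all_memories`'s SELECT with `snippet(memories_fts, 0, '<mark>', '</mark>', '…', 32)` to return a highlighted snippet around the match. Note that FTS5's `snippet()` copies the indexed text verbatim and does NOT escape HTML special characters, so rendering its output directly with `|safe` would expose any markup stored in a memory. HTML-escape the snippet server-side first and only then introduce the `<mark>` tags (for example, pass non-HTML sentinel markers to `snippet()` and swap them for `<mark>`/`</mark>` after escaping); the escaped result can then be rendered via `|safe`.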
 2. **Deep-link to turn via memories.event_id** (T100 nit + T109 dependency): now that `memories.event_id` exists (from T109), each search result row knows the originating event id. The chat page uses turn-id stamping (Phase 3.5 T86 added `id="turn-{event_id}"`). Build result links as `/chats/{chat_id}#turn-{event_id}`. The chat page DOM scrolls to the anchor on load (browser default).
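FTS5's `snippet()` returns the indexed text verbatim, so one cautious way to render highlights is to escape first and add the `<mark>` tags afterwards. The sketch below assumes sentinel control characters are passed to `snippet()` as bound parameters. `render_snippet` and `turn_link` are hypothetical helper names, and linking to the bare chat when `event_id` is NULL is an assumption, not something the plan specifies.

```python
import html

# Control characters that cannot be confused with markup; they are passed
# to snippet() as bound parameters in place of '<mark>' and '</mark>'.
MARK_OPEN, MARK_CLOSE = "\x02", "\x03"

# Assumed query shape; table name and column index follow the spec above.
SNIPPET_SQL = (
    "SELECT rowid, snippet(memories_fts, 0, ?, ?, '…', 32) "
    "FROM memories_fts WHERE memories_fts MATCH ?"
)


def render_snippet(raw: str) -> str:
    """Escape the whole snippet, then turn the sentinels into <mark> tags."""
    escaped = html.escape(raw)
    return escaped.replace(MARK_OPEN, "<mark>").replace(MARK_CLOSE, "</mark>")


def turn_link(chat_id, event_id):
    """Deep link to a turn; memories without an event_id link to the chat."""
    base = f"/chats/{chat_id}"
    return base if event_id is None else f"{base}#turn-{event_id}"


raw = f"a {MARK_OPEN}rabbit{MARK_CLOSE} said <script>alert(1)</script>"
safe_html = render_snippet(raw)
link = turn_link("c1", 42)
legacy_link = turn_link("c1", None)
```

Only the output of `render_snippet` is safe to pass through `|safe`; the raw `snippet()` value is not.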
 **Tests:**
 1. `test_search_results_include_fts_snippet_with_highlight` — seed memory with text containing "rabbit"; search for "rabbit"; assert response body contains `<mark>rabbit</mark>` (or whatever marker the snippet uses).
 2. `test_search_result_link_includes_turn_anchor` — seed memory with known event_id; search; assert link href contains `#turn-{event_id}`.
 **Commits (2):**
 - `feat: cross-chat search FTS snippet highlighting (T111.1)`
 - `feat: cross-chat search deep-links to turn via memories.event_id (T111.2)`
 ---
 ## Wave 5 — Real embedding model (single)
 ### Task 112: Real embedding model swap
 **Files:**
 - Modify: `chat/llm/client.py` (Protocol — add `embed(text, model) -> list[float]` method)
 - Modify: `chat/llm/featherless.py` (FeatherlessClient — implement `embed` against Featherless `/v1/embeddings` endpoint OR equivalent)
 - Modify: `chat/llm/mock.py` (MockLLMClient — accept canned embedding vectors)
 - Modify: `chat/services/embeddings.py` (route non-default model through `client.embed()`)
 - Modify: `chat/config.py` (add `embedding_model: str` setting; default to current pseudo)
 - Modify: `scripts/backfill_embeddings.py` (re-embed-all option for model swaps)
 - Modify: `tests/test_embeddings.py` + `tests/test_llm_mock.py` + `tests/test_featherless.py` (if exists)
 **Spec:**
 Phase 4 ships a SHA-256 pseudo-embedding that is deterministic but semantically meaningless. T112 wires the path for a real embedding model.
 **Steps:**
 1. **Extend `LLMClient` Protocol** with `async def embed(self, text: str, *, model: str) -> list[float]`.
 2. **Implement on FeatherlessClient**: call the Featherless OpenAI-compatible `/v1/embeddings` endpoint:
   ```python
   response = await self._http.post(
       "/v1/embeddings",
       json={"model": model, "input": text},
       headers={"Authorization": f"Bearer {self._api_key}"},
   )
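   # Assumes an httpx-style response: surface HTTP errors (e.g. 429) here
   # rather than as a KeyError on the missing "data" field below.
   response.raise_for_status()
   data = response.json()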
   return data["data"][0]["embedding"]
   ```
   Handle rate limits (existing 2-conn semaphore covers this).
 3. **Implement on MockLLMClient**: `embed` pops a canned vector from a new `canned_embeddings` queue. Tests configure this queue.
 4. **Update `generate_embedding`**: when `model != DEFAULT_EMBEDDING_MODEL`, call `client.embed(text, model=model)` instead of falling through to fallback. Wrap in try/except — failures fall back to zero vector (existing fallback path).
 5. **Settings**: add `embedding_model: str = "pseudo-sha256-384"` to `Settings`. App reads this at startup; the embedding worker (`chat/services/embedding_worker.py`) passes it through.
 6. **Backfill script**: add `--re-embed-all` flag that walks ALL memories (regardless of existing `embeddings_meta` rows) and re-embeds with the configured model. Useful for swapping models.
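Steps 3 and 4 can be sketched as below. The constants mirror names used in this plan, but `EmbeddingResult`, `pseudo_embedding`, and `MockClient` are illustrative stand-ins, and the SHA-256 stretching scheme is an assumption rather than the repository's actual pseudo-embedding.

```python
import asyncio
import hashlib
from dataclasses import dataclass

DEFAULT_EMBEDDING_MODEL = "pseudo-sha256-384"
FALLBACK_EMBEDDING_MODEL = "fallback"
DIMS = 384


@dataclass
class EmbeddingResult:
    vector: list
    model: str


def pseudo_embedding(text):
    # Illustrative stand-in: stretch SHA-256 digests into DIMS floats in [0, 1].
    out, counter = [], 0
    while len(out) < DIMS:
        digest = hashlib.sha256(f"{counter}:{text}".encode()).digest()
        out.extend(b / 255 for b in digest)
        counter += 1
    return out[:DIMS]


async def generate_embedding(client, text, model=DEFAULT_EMBEDDING_MODEL):
    if not text:
        return EmbeddingResult([0.0] * DIMS, FALLBACK_EMBEDDING_MODEL)
    if model == DEFAULT_EMBEDDING_MODEL:
        return EmbeddingResult(pseudo_embedding(text), model)
    try:
        return EmbeddingResult(await client.embed(text, model=model), model)
    except Exception:
        # Any client failure degrades to the zero-vector fallback.
        return EmbeddingResult([0.0] * DIMS, FALLBACK_EMBEDDING_MODEL)


class MockClient:
    def __init__(self, canned_embeddings):
        self.canned_embeddings = list(canned_embeddings)

    async def embed(self, text, *, model):
        # Raises IndexError when the canned queue is empty.
        return self.canned_embeddings.pop(0)


MODEL = "bge-small-en-v1.5"
ok = asyncio.run(generate_embedding(MockClient([[0.1, 0.2]]), "hi", MODEL))
failed = asyncio.run(generate_embedding(MockClient([]), "hi", MODEL))
pseudo = asyncio.run(generate_embedding(None, "hi"))
```

Because the fallback result is tagged with its own model name, callers such as the backfill script can detect and skip it instead of indexing a zero vector.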
 **Tests:**
 1. `test_embed_routes_to_client_when_non_default_model` — mock client with canned vector; call `generate_embedding(model="bge-small-en-v1.5")`; assert vector matches the canned response.
 2. `test_embed_falls_back_on_client_failure` — mock client to raise; assert returns zero vector with model="fallback".
 3. `test_mock_llm_client_embed_pops_canned`.
 4. `test_featherless_embed_calls_correct_endpoint` (if there's an existing featherless test pattern; otherwise mock the HTTP layer).
 **Commits:**
 - `feat: LLMClient Protocol gains embed() method (T112.1)`
 - `feat: FeatherlessClient.embed() against /v1/embeddings (T112.2)`
 - `feat: generate_embedding routes non-default models through client.embed (T112.3)`
 - `feat: backfill_embeddings --re-embed-all flag for model swaps (T112.4)`
 ---
 ## Wave 6 — Branching read-side filter (single, BIG)
 ### Task 113: Branching read-side filter
 **Files (cross-cutting):**
 - Modify: `chat/services/turn_common.py::read_recent_dialogue` — filter events to active branch's range
 - Modify: `chat/services/scene_summarize.py::_read_recent_dialogue` (similar)
 - Modify: `chat/state/memory.py::search_memories` — memories should be filtered to active branch (memories.event_id from T109 enables this)
 - Modify: `chat/state/branches.py` — add helper `active_branch_event_ids(conn) -> tuple[int, int]` returning (origin, head)
 - Add tests across multiple files
 - Modify: `tests/test_branching.py` — add cross-feature tests
 **Spec:**
 Phase 4 T89 + T94 shipped branching as metadata-only (the table tracks branches; the drawer UI can switch). But event readers DON'T consult `is_active` — they read the entire event_log. So switching branches has no functional effect.
 T113 wires the filter:
 1. **Helper** `active_branch_event_ids(conn) -> tuple[int, int]`: returns `(origin_event_id, head_event_id)` for the currently active branch. For "main" with origin=0 + head=N, returns `(0, N)` meaning "all events visible".
 2. **Apply filter** in every event reader that returns historical state:
   - `read_recent_dialogue`: WHERE clause adds `id BETWEEN ? AND ?` (the active branch's range).
   - `search_memories`: WHERE clause adds `m.event_id BETWEEN ? AND ?` (uses T109's column).
   - `scene_summarize._read_recent_dialogue`: same as turn_common.
   - Other readers TBD — grep for `event_log` SELECT patterns and audit each one.
 3. **Branches that diverge**: when branch B is created from event 10 and then accumulates events 11-15 (which only exist on B's timeline), but main also accumulates 11-12, the events overlap by id range. This is OK because event reads filter by `id <= active_branch.head_event_id`. The simpler model: branches share event_log ids globally, but each branch's "head" defines which ids are visible.
 4. **Events written under branch B** carry an implicit branch tag — but the event_log table has no `branch_id` column today. T113 punts on cross-branch event writes (they all land in the global log) and relies on the `head_event_id` filter to scope reads. This is a Phase 4.5+ first cut; full branch-isolated event_log is Phase 5+.
 **Edge cases:**
 - Active branch has `head_event_id = 0` (just created): readers return empty.
 - No active branch: readers fall through to "all events visible" (defensive).
 - Switching branches mid-flight: each `read_recent_dialogue` call re-queries `active_branch`, so it's always current. No caching.
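A minimal sketch of the helper and one filtered reader, covering the edge cases above. The `branches` and `event_log` schemas here are simplified assumptions. Unlike the signature proposed in the spec, this helper returns `None` when no branch is active, so the reader can fall through to the whole log.

```python
import sqlite3


def active_branch_event_ids(conn):
    """Return (origin_event_id, head_event_id) for the active branch, or None."""
    return conn.execute(
        "SELECT origin_event_id, head_event_id FROM branches WHERE is_active = 1"
    ).fetchone()


def read_recent_dialogue(conn, limit=20):
    # Re-query the active branch on every call; nothing is cached.
    bounds = active_branch_event_ids(conn)
    if bounds is None:
        # No active branch: defensive fall-through to the whole log.
        sql = "SELECT id, text FROM event_log ORDER BY id DESC LIMIT ?"
        params = (limit,)
    else:
        sql = (
            "SELECT id, text FROM event_log "
            "WHERE id BETWEEN ? AND ? ORDER BY id DESC LIMIT ?"
        )
        params = (*bounds, limit)
    # Newest-first query, returned oldest-first for prompt assembly.
    return conn.execute(sql, params).fetchall()[::-1]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_log (id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE branches (
    name TEXT, origin_event_id INTEGER, head_event_id INTEGER, is_active INTEGER
);
""")
conn.executemany(
    "INSERT INTO event_log (text) VALUES (?)",
    [(f"turn {i}",) for i in range(1, 11)],
)
conn.execute("INSERT INTO branches VALUES ('main', 0, 5, 1)")

visible = [row[0] for row in read_recent_dialogue(conn)]

conn.execute("UPDATE branches SET head_event_id = 0")
just_created = read_recent_dialogue(conn)

conn.execute("UPDATE branches SET is_active = 0")
everything = [row[0] for row in read_recent_dialogue(conn)]
```

The three reads correspond to tests 1, 4, and 5 in the list that follows.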
 **Tests:** 5+ minimum.
 1. `test_read_recent_dialogue_respects_active_branch_head` — seed 10 events; active branch head = 5; assert only first 5 returned.
 2. `test_search_memories_respects_active_branch_head` — same.
 3. `test_branch_switch_changes_visible_events` — switch branches; immediately read; assert different result sets.
 4. `test_main_branch_with_head_zero_returns_empty` — defensive.
 5. `test_no_active_branch_falls_through_to_all_events` — defensive.
 **Commit:** `feat: branching read-side filter — event readers consult active branch range (T113)`.
 **This is the largest task in Phase 4.5.** Estimate 200-400 lines across multiple files. Implementer should split commits if it helps clarity (one per affected reader).
 ---
 ## Wave 7 — Lifecycle rollback in regenerate (single)
 ### Task 114: Lifecycle rollback
 **Files:**
 - Modify: `chat/services/regenerate.py`
 - Modify: `chat/db/migrations/0014_phase45_schema.sql` (T109's migration) — add column? OR
 - Add new migration — see decision below
 - Modify: tests in `tests/test_regenerate.py`
 **Spec:**
 Phase 3.5 T83.4 shipped a warning log when regenerate detects un-rolled-back lifecycle transitions. T114 implements actual rollback.
 **Schema decision:**
 Option A: extend lifecycle event payloads with `triggered_by_assistant_turn_id` (a payload convention, so no schema change is needed). Production code (T61 turn flow) populates it when emitting `event_started`/`event_completed`/`event_cancelled`. Existing events lack the field; rollback skips them with a debug log.
 Option B: add a column to `event_log` for stronger invariants. Significant migration cost.
 **Recommended:** Option A. Safer, no migration, backward compatible (older events skip rollback). Document in commit body.
 **Rollback semantics:**
 When regenerate detects lifecycle events triggered by the superseded turn:
 - `event_started` → emit `event_cancelled` (or a NEW `event_started_undone` event kind that reverts status to "planned") with the same event_id.
 - `event_completed` → emit `event_uncompleted` (NEW event kind that reverts status from "completed" to "active").
 - `event_cancelled` → emit `event_uncancelled` (reverts to prior status — which we'd need to track; or simpler: emit `event_started` again to restore "active").
 **Simpler approach (recommended):** add ONE new event kind `event_status_reverted` with payload `{event_id, prior_status}`. The projector sets `events.status = prior_status` for the event_id. Rollback emits this event for each affected lifecycle transition, looking up the prior status from the row's history (via event_log scan) or accepting it as a payload field.
 **Production code change:** in `chat/web/turns.py::post_turn` (and `chat/services/regenerate.py`), when emitting `event_started`/`event_completed`/`event_cancelled`, populate `triggered_by_assistant_turn_id: <id>` in the payload. Forward-only — older code doesn't need updating.
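The recommended approach can be sketched over an in-memory event list. The `event_planned` kind, the payload shapes, and the choice to look up `prior_status` by replaying the log up to each transition are assumptions for illustration; the real projector works against the `events` table.

```python
STATUS_AFTER = {
    "event_started": "active",
    "event_completed": "completed",
    "event_cancelled": "cancelled",
}


def project(log):
    """Replay the log into {event_id: status}."""
    statuses = {}
    for event in log:
        kind, payload = event["kind"], event["payload"]
        if kind == "event_planned":  # hypothetical kind that creates the row
            statuses[payload["event_id"]] = "planned"
        elif kind in STATUS_AFTER:
            statuses[payload["event_id"]] = STATUS_AFTER[kind]
        elif kind == "event_status_reverted":
            statuses[payload["event_id"]] = payload["prior_status"]
    return statuses


def rollback_for_turn(log, superseded_turn_id):
    """Append one event_status_reverted per transition the turn caused."""
    reverts = []
    for index, event in enumerate(log):
        if event["kind"] not in STATUS_AFTER:
            continue
        back_ref = event["payload"].get("triggered_by_assistant_turn_id")
        if back_ref is None or back_ref != superseded_turn_id:
            continue  # older events without a back-reference are skipped
        event_id = event["payload"]["event_id"]
        prior = project(log[:index])[event_id]
        reverts.append({
            "kind": "event_status_reverted",
            "payload": {"event_id": event_id, "prior_status": prior},
        })
    # Undo the most recent transition first.
    log.extend(reversed(reverts))
    return reverts


log = [
    {"kind": "event_planned", "payload": {"event_id": "evt_1"}},
    {"kind": "event_planned", "payload": {"event_id": "evt_2"}},
    {"kind": "event_started", "payload": {"event_id": "evt_2"}},  # no back-ref
    {"kind": "event_started",
     "payload": {"event_id": "evt_1", "triggered_by_assistant_turn_id": 7}},
    {"kind": "event_completed",
     "payload": {"event_id": "evt_1", "triggered_by_assistant_turn_id": 7}},
]
reverts = rollback_for_turn(log, superseded_turn_id=7)
final = project(log)
```

Emitting the reverts newest-first means a turn that both started and completed an event unwinds it back through "active" to "planned".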
 **Tests:** 3 minimum.
 1. `test_regenerate_rolls_back_event_started_from_superseded_turn` — seed an event; play a turn that starts it; regenerate; assert `event_status_reverted` event landed with `prior_status="planned"` and the events row is back to "planned".
 2. `test_regenerate_rolls_back_event_completed_to_active` — same but completed → active rollback.
 3. `test_regenerate_skips_events_without_back_reference` — older events without `triggered_by_assistant_turn_id` are not rolled back (debug log). Pin the backward-compat behavior.
 **Commits:**
 - `feat: lifecycle events carry triggered_by_assistant_turn_id back-reference (T114.1)`
 - `feat: event_status_reverted event kind + projector handler (T114.2)`
 - `feat: regenerate rolls back lifecycle transitions on supersede (T114.3)`
 ---
 ## Wave 8 — sqlite-vec swap (single, ENVIRONMENTAL)
 ### Task 115: sqlite-vec swap (optional)
 **Files:**
 - Create: `chat/db/migrations/0015_vec0_virtual_tables.sql`
 - Modify: `chat/db/connection.py` (load extension on every connection)
 - Modify: `chat/services/vector_search.py` (rewrite to use vec0 MATCH instead of pure-Python cosine)
 - Modify: `chat/state/embeddings.py` (writer needs to populate vec0 table)
 - Modify: `pyproject.toml` (add `sqlite-vec` dependency)
 **Pre-flight:**
 This task REQUIRES one of:
 - Python rebuilt with `--enable-loadable-sqlite-extensions` (pyenv reinstall).
 - `apsw` migration of `chat/db/connection.py`.
 If neither is feasible at the time of execution: SKIP THIS TASK and document the deferral in T118 docs sweep. The remaining Phase 4.5 tasks ship without it.
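The first pre-flight condition can be probed from Python before any work starts. This is a minimal sketch, assuming the standard-library `sqlite3` module is the driver in use; `can_load_sqlite_extensions` is a hypothetical helper name.

```python
import sqlite3


def can_load_sqlite_extensions() -> bool:
    """Return True if this Python build allows loading SQLite extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # Builds without loadable-extension support omit the method entirely.
        if not hasattr(conn, "enable_load_extension"):
            return False
        try:
            conn.enable_load_extension(True)
            conn.enable_load_extension(False)
        except sqlite3.Error:
            return False
        return True
    finally:
        conn.close()


supported = can_load_sqlite_extensions()
```

A `False` result means the interpreter needs rebuilding (or the `apsw` route) before T115 can proceed.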
 **Spec:**
 1. **Migration** `0015_vec0_virtual_tables.sql`:
   ```sql
   CREATE VIRTUAL TABLE embeddings_vec USING vec0(
       memory_id INTEGER PRIMARY KEY,
       embedding FLOAT[384]
   );
   -- Backfill from existing JSON embeddings table.
   INSERT INTO embeddings_vec (memory_id, embedding)
   SELECT memory_id, vec_f32(vector_json) FROM embeddings;
   ```
 2. **`chat/db/connection.py`** loads `sqlite_vec` extension on every connection:
   ```python
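   import sqlite3
   import sqlite_vec
   def open_db(...):
       conn = sqlite3.connect(...)
       # Extension loading is off by default; enable it only for this call.
       conn.enable_load_extension(True)
       sqlite_vec.load(conn)
       # Re-disable so later SQL cannot load arbitrary extensions.
       conn.enable_load_extension(False)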
       ...
   ```
 3. **Rewrite `vector_search.py`** to use `embeddings_vec MATCH ?` syntax with `k=?` clause:
   ```sql
   SELECT m.id, m.pov_summary, m.significance, e.distance
   FROM embeddings_vec e
   JOIN memories m ON m.id = e.memory_id
   WHERE e.embedding MATCH ? AND k = ?
     AND m.owner_id = ?
     AND m.witness_<role> = 1
   ORDER BY e.distance ASC
   LIMIT ?
   ```
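 4. **HNSW note**: at the time of writing, sqlite-vec's vec0 performs brute-force (flat) KNN search and does not provide an HNSW index. T115 ships flat (sufficient for < few thousand memories). Document an approximate-index upgrade path (HNSW or similar, possibly via a different extension) in CLAUDE.md if memory counts ever grow past flat-search feasibility.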
 5. **Old `embeddings` JSON table**: keep alongside `embeddings_vec` (data redundancy is fine; the JSON table is the source of truth and `embeddings_vec` is the index). Backfill on migration. Keep the `embedding_indexed` projector populating both.
 **Tests:** update the setup in `tests/test_vector_search.py` for the vec0-backed implementation. The observable contract is unchanged and only the implementation differs, so all 5 existing tests should still pass post-swap.
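For reference, the contract the swap must preserve is "top-k memories by similarity to the query vector". The sketch below is a generic pure-Python cosine baseline, not the repository's `vector_search.py`; the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector (e.g. the fallback embedding) is similar to nothing.
    return dot / norm if norm else 0.0


def top_k(query, candidates, k):
    """candidates: {memory_id: vector}. Returns [(memory_id, similarity)]."""
    scored = [
        (memory_id, cosine_similarity(query, vector))
        for memory_id, vector in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


hits = top_k(
    [1.0, 0.0],
    {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.7, 0.7]},
    k=2,
)
```

Note that vec0 reports a distance (smaller is closer) rather than a similarity, so it is worth checking that the distance metric configured on the vec0 column ranks results the same way as the cosine baseline before relying on the existing assertions.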
 **Commit:** `feat: sqlite-vec swap (vec0 virtual tables + MATCH-based search) (T115)`.
 ---
 ## Wave 9 — Polish (parallel, 3 tasks)
 ### Task 116: Structured test-fixture builder
 **Files:**
 - Create: `tests/fixtures.py` (or extend `tests/conftest.py`)
 - Modify: existing test files that use brittle canned-queue arrays (selectively)
 **Spec:**
 Phase 3 carry-over. Tests across `test_turn_flow.py`, `test_meanwhile_turn_flow.py`, `test_phase3_integration.py`, `test_phase4_integration.py` use positional canned-response arrays for `MockLLMClient`. Adding a new classifier call to a code path requires updating canned arrays in many tests.
 **Solution:** structured fixture builder that lets tests declare their classifier expectations by name, not position:
 ```python
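 # tests/fixtures.py
 import json
 class CannedQueue:
    # Each method below appends one item to the queue and returns self,
    # so calls can be chained.
    def __init__(self):
        self._queue = []
    def parse_turn(self, **fields): ...
    def narrative(self, text: str): ...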
    def state_update(self, **fields): ...
    def detect_scene_close(self, should_close: bool): ...
    def detect_event_transitions(self, transitions: list[dict]): ...
    def summarize_scene(self, summary: str, **fields): ...
    def detect_threads(self, candidates: list[dict]): ...
    # ... one method per classifier service
    def build(self) -> list[str]:
        return [json.dumps(item) for item in self._queue]
 ```
 Usage:
 ```python
 def test_post_turn_with_event_transition(...):
    canned = (
        CannedQueue()
            .parse_turn(intent="narrative")
            .narrative("BotA speaks.")  # narrative is a stream, but for simplicity treat it like a canned response
            .state_update(affinity_delta=0, trust_delta=0)
            .state_update(affinity_delta=0, trust_delta=0)
            .detect_event_transitions([{"event_id": "evt_1", "new_status": "completed"}])
            .detect_scene_close(should_close=False)
            .build()
    )
    mock = MockLLMClient(canned=canned)
    # ...
 ```
 **Migration scope:** don't migrate ALL existing tests at once — that's a separate massive refactor. Instead, ship the fixture builder + migrate 2-3 representative tests as proof of concept. Document the migration path in the fixture's docstring.
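A trimmed, runnable version of the builder might look like the sketch below. It covers only four slots, and two details are assumptions: narrative slots are passed through as plain text rather than JSON-encoded, and `state_update` defaults both deltas to zero.

```python
import json


class CannedQueue:
    """Fluent builder: each call appends one canned response, returns self."""

    def __init__(self):
        self._queue = []

    def _push(self, item):
        self._queue.append(item)
        return self

    def parse_turn(self, **fields):
        return self._push(fields)

    def narrative(self, text):
        # Assumption: narrative slots are plain text, not JSON.
        return self._push(text)

    def state_update(self, affinity_delta=0, trust_delta=0):
        return self._push(
            {"affinity_delta": affinity_delta, "trust_delta": trust_delta}
        )

    def detect_scene_close(self, should_close):
        return self._push({"should_close": should_close})

    def build(self):
        # Encode structured items; leave plain-text items untouched.
        return [
            item if isinstance(item, str) else json.dumps(item)
            for item in self._queue
        ]


canned = (
    CannedQueue()
    .parse_turn(intent="narrative")
    .narrative("BotA speaks.")
    .state_update()
    .detect_scene_close(should_close=False)
    .build()
)
```

Adding a new classifier call to a code path then means inserting one named method call at the right place in the chain, rather than recounting list indices.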
 **Tests:** the fixture builder itself doesn't need extensive testing — it's just a builder. Add 1-2 sanity tests that the JSON output matches expected shapes.
 **Commit:** `test: structured CannedQueue fixture builder for classifier mocks (T116)`.
 ---
 ### Task 117: Phase 4.5 cross-feature integration tests
 **Files:**
 - Create: `tests/test_phase45_integration.py`
 **Spec:**
 End-to-end multi-feature flows specific to Phase 4.5 changes. 5 tests minimum.
 1. **Real embedding swap + retrieval** — configure `embedding_model="bge-small-en-v1.5"` (mocked); write a memory; backfill or wait for worker; assert vector search returns the memory via `client.embed`-derived vector (not pseudo).
 2. **Branching read-side filter end-to-end** — create a branch from turn 5; switch; play 3 turns on the branch; switch back to main; assert main's recent dialogue is missing the branch turns (read filter respects active branch's head).
 3. **Lifecycle rollback** — start an event via a turn; regenerate that turn; assert lifecycle reverted (event back to "planned").
 4. **Search deep-link** — write memories; search; click a result; verify the chat page renders with the right turn anchored (assert via TestClient response — either the browser anchor OR a server-side scroll-to-anchor mechanism).
 5. **Bulk significance re-rate end-to-end** — seed 5 memories at significance 0; bulk re-rate via drawer; verify significance histogram updates.
 **Commit:** `test: phase 4.5 cross-feature integration coverage (T117)`.
 ---
 ### Task 118: Phase 4.5 documentation update
 **Files:**
 - Modify: `CLAUDE.md`
 - Modify: `docs/plans/2026-04-26-v1-requirements-design.md` (annotate §13 Phase 4 entries — though they're already shipped per Phase 4 T102)
 **Spec:**
 Mirror the Phase 3.5 / 2.5 status sections. Document:
 - All shipped items per task (T103–T117).
 - Empty out the Phase 4.5 / 5 backlog (replace with single "All items shipped" line).
 - Add new "Phase 5 backlog" section if any Phase 4.5 reviews surfaced new follow-ups.
 **Phase 5 backlog candidates** (default, if no new follow-ups discovered):
 - Vector index optimization (HNSW) when memory counts grow past flat-index feasibility.
 - Branch-isolated event_log (each branch has its own physical event_log range vs the current shared id space + head filter).
 - Embedding model swap migration tooling — when changing models, need to re-embed everything; T112 added `--re-embed-all` but a more orchestrated swap (drain old worker, re-seed all memories, swap config) is Phase 5+.
 - Real-time collaborative branching (multi-user) — out of scope for v1.
 - Avatars / portraits (multimodality) — deferred indefinitely per design §14.
 **Commit:** `docs: phase 4.5 status, prune backlog, capture phase 5 candidates (T118)`.
 ---
 ## Wrap-up
 After Wave 9 lands:
 1. **Run full suite** on `phase-4.5`: should be ~430+ tests passing (413 from Phase 4 + ~20 new across Phase 4.5).
 2. **Manual smoke** (recommended before opening the PR):
   - Configure `embedding_model="bge-small-en-v1.5"` (or whatever real model is chosen); restart server; play a turn; verify `embedding_indexed` events use the real model and search returns semantically-relevant memories.
   - Create a branch, switch, play turns, switch back — verify main's history is unaffected.
   - Plan an event, complete it via a turn, regenerate that turn — verify event reverts to "planned".
   - Use cross-chat search; click a result; verify it lands on the right turn in the chat page.
   - Bulk re-rate a chat's significance distribution.
 3. **Push `phase-4.5`** to gitea.
 4. **Open PR** `phase-4.5 → main`.
 ---
 ## Notes for the controller running this plan
 - **T115 (sqlite-vec swap)** is environmental. If pre-flight fails (no rebuilt Python, no apsw), defer to Phase 5 and ship Phase 4.5 without it. T118 docs sweep should note the deferral.
 - **T112 (real embedding swap)** assumes Featherless or similar exposes an `/v1/embeddings` endpoint. If not available, document the gap and ship the Protocol + Mock impl only (Featherless impl deferred). The pseudo path remains the default in that case — same as Phase 4.
 - **T113 (branching read-side filter)** is the riskiest task. Cross-cutting. Land it on a quiet branch, test thoroughly. If integration tests break in unexpected ways, bisect the affected reader and add coverage.
 - **After each parallel wave**, run a code-review subagent. Combined spec+quality acceptable for trivial tasks (T103–T108); separate spec + quality reviewers for big tasks (T112, T113, T114, T115).
 - **Token-spend rough estimate**: Phase 4.5 should be ~50% the size of Phase 4 (similar number of tasks, mostly smaller). Big tasks (T112, T113, T114) bring the per-task spend up but parallelism in Wave 1 + Wave 9 brings the wall-clock down.
 - **DO NOT break existing v1/v2/v3/v3.5/v4 surface contracts.** Every test file that was green at the start of Phase 4.5 must stay green at the end. The cross-feature integration tests (`tests/test_phase4_integration.py`, `tests/test_phase3_integration.py`) are particularly load-bearing.
@@ -0,0 +1,23 @@
 {
  "planPath": "docs/plans/2026-04-27-v4.5-phase4.5-cleanup.md",
  "tasks": [
    {"id": 103, "subject": "T103: branches polish (global-leak doc + branch-switch warning)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 104, "subject": "T104: state/memory.py polish (DRY MAX(id) + fts_rank doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 105, "subject": "T105: snapshots.py polish (datetime hoist + kind validation + mtime doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 106, "subject": "T106: search.py polish (k constant + N+1 batched lookups)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 107, "subject": "T107: embeddings.py timeout_s fallback-path logging", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 108, "subject": "T108: scene-close-on-cancel UX revisit (regression test pin + rationale doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 109, "subject": "T109: 0014 schema migration (FK CASCADE + memories.event_id column)", "status": "pending", "wave": 2, "parallelGroup": null},
    {"id": 110, "subject": "T110: drawer Phase 4.5 bundle (event_id guard + html.escape + modal partial + bulk sig re-rate)", "status": "pending", "wave": 3, "parallelGroup": null, "blockedBy": [109]},
    {"id": 111, "subject": "T111: search UX (FTS snippet highlighting + deep-link via memories.event_id)", "status": "pending", "wave": 4, "parallelGroup": null, "blockedBy": [109]},
    {"id": 112, "subject": "T112: real embedding model swap (LLMClient.embed protocol + Featherless impl + routing)", "status": "pending", "wave": 5, "parallelGroup": null},
    {"id": 113, "subject": "T113: branching read-side filter (event readers consult is_active branch range)", "status": "pending", "wave": 6, "parallelGroup": null, "blockedBy": [109]},
    {"id": 114, "subject": "T114: regenerate lifecycle rollback (back-reference + event_status_reverted)", "status": "pending", "wave": 7, "parallelGroup": null},
    {"id": 115, "subject": "T115: sqlite-vec swap (vec0 virtual tables + MATCH search) [ENVIRONMENTAL — may defer]", "status": "pending", "wave": 8, "parallelGroup": null},
    {"id": 116, "subject": "T116: structured CannedQueue test fixture builder", "status": "pending", "wave": 9, "parallelGroup": "wave-9"},
    {"id": 117, "subject": "T117: phase 4.5 cross-feature integration tests", "status": "pending", "wave": 9, "parallelGroup": "wave-9", "blockedBy": [110, 111, 112, 113, 114]},
    {"id": 118, "subject": "T118: phase 4.5 docs sweep — prune backlog, capture phase 5 candidates", "status": "pending", "wave": 9, "parallelGroup": "wave-9", "blockedBy": [110, 111, 112, 113, 114]}
  ],
  "lastUpdated": "2026-04-27T00:00:00Z",
  "notes": "16 tasks across 9 waves consolidating all 24 items in CLAUDE.md Phase 4.5/5 backlog. Wave 1 (6-way parallel) and Wave 9 (3-way parallel) maximize parallelism. Waves 2-8 are single-task by hot-file constraint. T115 (sqlite-vec swap) requires Python rebuild OR apsw migration — environmental; may defer to Phase 5. Schema baseline 13 -> 14 (T109's 0014) -> optionally 15 (T115's 0015). Big tasks: T112 (real embedding swap), T113 (branching read-side filter — riskiest), T114 (lifecycle rollback). Uses task ids T103-T118."
 }
@@ -8,8 +8,21 @@ Phase 4 ships the deterministic local pseudo-embedding so this script
 runs synchronously without a network round-trip — the LLMClient argument
 is not needed on the pseudo path. Phase 4.5+ will need a real client.
 T112 (Phase 4.5) adds two flags:
 * ``--re-embed-all`` walks **every** memory regardless of whether it
  already has an ``embeddings`` row. Useful when swapping embedding
  models — the projector is INSERT OR REPLACE, so re-emitting an event
  for an existing memory replaces the prior vector. Without this flag,
  the script keeps the Phase 4 behavior of only filling in gaps.
 * ``--model M`` overrides ``Settings.embedding_model`` for this run.
  Defaults to the configured model (which itself defaults to
  ``"pseudo-sha256-384"``).
 Run from the repo root:
    .venv/bin/python scripts/backfill_embeddings.py [--limit N] [--dry-run]
    .venv/bin/python scripts/backfill_embeddings.py --re-embed-all
    .venv/bin/python scripts/backfill_embeddings.py --re-embed-all --model bge-small-en-v1.5
 """
 from __future__ import annotations
@@ -17,11 +30,12 @@ from __future__ import annotations
 import argparse
 import asyncio
-from chat.config import load_settings
+from chat.config import Settings, load_settings
 from chat.db.connection import open_db
 from chat.db.migrate import apply_migrations
 from chat.eventlog.log import append_and_apply
 from chat.services.embeddings import (
    DEFAULT_EMBEDDING_MODEL,
    FALLBACK_EMBEDDING_MODEL,
    generate_embedding,
 )
@@ -34,6 +48,24 @@ import chat.state.memory  # noqa: F401
 import chat.state.world  # noqa: F401
 def _build_client(settings: Settings):
    """Construct an LLMClient for the backfill run.
    Default-model runs (the pseudo path) don't need a client, so we
    return ``None`` and ``generate_embedding`` skips the call. Non-default
    models route through the real client; injectable via monkeypatch in
    tests.
    """
    if settings.embedding_model == DEFAULT_EMBEDDING_MODEL:
        return None
    from chat.llm.featherless import FeatherlessClient
    return FeatherlessClient(
        api_key=settings.featherless_api_key,
        base_url=settings.featherless_base_url,
    )
 async def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
@@ -47,23 +79,51 @@ async def main() -> None:
        action="store_true",
        help="Print the count of memories needing embeddings, then exit.",
    )
    parser.add_argument(
        "--re-embed-all",
        action="store_true",
        help=(
            "Walk every memory (not just those without an embeddings row) "
            "and re-emit embedding_indexed events. Use this when swapping "
            "embedding models so the existing rows get replaced."
        ),
    )
    parser.add_argument(
        "--model",
        type=str,
        default=None,
        help=(
            "Embedding model identifier. Overrides Settings.embedding_model "
            "for this run; default uses the configured model."
        ),
    )
    args = parser.parse_args()
    settings = load_settings()
    settings.db_path.parent.mkdir(parents=True, exist_ok=True)
    apply_migrations(settings.db_path)
    model = args.model or settings.embedding_model
    # Override the settings instance so ``_build_client`` sees the
    # effective model when deciding whether to construct a real client.
    settings = settings.model_copy(update={"embedding_model": model})
    client = _build_client(settings)
    with open_db(settings.db_path) as conn:
-        sql = (
+        if args.re_embed_all:
-            "SELECT m.id, m.pov_summary FROM memories m "
+            sql = "SELECT m.id, m.pov_summary FROM memories m ORDER BY m.id"
-            "LEFT JOIN embeddings e ON e.memory_id = m.id "
+        else:
-            "WHERE e.memory_id IS NULL "
+            sql = (
-            "ORDER BY m.id"
+                "SELECT m.id, m.pov_summary FROM memories m "
-        )
+                "LEFT JOIN embeddings e ON e.memory_id = m.id "
                "WHERE e.memory_id IS NULL "
                "ORDER BY m.id"
            )
        if args.limit is not None:
            sql += f" LIMIT {int(args.limit)}"
        rows = conn.execute(sql).fetchall()
-        print(f"Found {len(rows)} memories needing embeddings.")
+        mode = "re-embedding" if args.re_embed_all else "needing embeddings"
        print(f"Found {len(rows)} memories {mode} (model={model}).")
        if args.dry_run:
            return
@@ -71,11 +131,12 @@ async def main() -> None:
        skipped = 0
        for memory_id, text in rows:
            result = await generate_embedding(
-                client=None,  # pseudo path: no client needed
+                client=client,
                text=text or "",
                model=model,
            )
            if result.model == FALLBACK_EMBEDDING_MODEL:
-                print(f"  Skipping memory_id={memory_id} (empty text)")
+                print(f"  Skipping memory_id={memory_id} (empty text or fallback)")
                skipped += 1
                continue
            append_and_apply(
@@ -0,0 +1,383 @@
 """Structured test-fixture builder for ``MockLLMClient`` canned queues.
 Phase 4.5 (T116) carry-over from Phase 3. The turn-flow tests in
 ``test_turn_flow.py``, ``test_meanwhile_turn_flow.py``,
 ``test_phase3_integration.py``, and ``test_phase4_integration.py`` used
 to construct ``MockLLMClient`` canned-response queues as raw positional
 lists of pre-encoded JSON strings. That worked, but every time a new
 classifier call landed in a code path the tests had to be patched in
 many places at the right index — easy to mis-position, hard to read.
 This module ships :class:`CannedQueue`, a fluent builder that lets a
 test declare its classifier expectations by **name** and **order** of
 call, not by index into a brittle list. Each method appends one item
 to the queue and returns ``self`` for chaining; ``build()`` JSON-encodes
 the items and produces the flat ``list[str]`` that
 ``MockLLMClient(canned=...)`` expects.
 Usage
 -----
 >>> from tests.fixtures import CannedQueue
 >>> from chat.llm.mock import MockLLMClient
 >>> canned = (
 ...     CannedQueue()
 ...         .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
 ...         .narrative("Hi there.")
 ...         .state_update()
 ...         .state_update()
 ...         .build()
 ... )
 >>> mock = MockLLMClient(canned=canned)
 Each method maps to a single classifier (or stream) call that the turn
 flow makes, in the order the production code makes them. Picking the
 right method for the slot you need keeps the test readable and lets the
 builder pin sensible defaults for the fields tests don't care about.
 Migration template
 ------------------
 To migrate a positional canned-array test:
 1. Identify each slot in the existing array and what classifier it
   feeds. Comments above the array often spell this out — start there.
 2. Replace each slot with the matching :class:`CannedQueue` method:
   - ``json.dumps({"segments": [...]})`` → ``.parse_turn(segments=...)``
   - bare narrative string → ``.narrative("...")``
   - zero-state JSON  → ``.state_update()`` (defaults are zeros)
   - ``json.dumps({"addressee_id": ...})`` → ``.detect_addressee(...)``
   - ``json.dumps({"should_interject": ...})`` → ``.detect_interjection(...)``
   - ``json.dumps({"should_close": ...})`` → ``.detect_scene_close(...)``
   - ``json.dumps({"transitions": [...]})`` → ``.detect_event_transitions(...)``
   - per-POV summary JSON → ``.summarize_scene_pov(summary=...)``
 3. End with ``.build()`` and pass that to
   ``MockLLMClient(canned=...)``. The mock's contract is unchanged.
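As a rough illustration of the template above, the sketch below writes the same four-slot queue twice: once as a positional list and once through a builder. ``MiniQueue`` is a hypothetical, cut-down stand-in defined only for this example so that it runs without the project installed; it copies the payload defaults documented in this module and is not the real ``CannedQueue``.

```python
import json
from typing import Any


class MiniQueue:
    # Hypothetical stand-in for CannedQueue, reduced to three slot types.
    def __init__(self) -> None:
        self._queue: list[Any] = []

    def parse_turn(self, *, segments: list[dict]) -> "MiniQueue":
        # Mirrors the documented defaults for intent / landing_state_hint.
        self._queue.append(
            {"segments": segments, "intent": "narrative", "landing_state_hint": ""}
        )
        return self

    def narrative(self, text: str) -> "MiniQueue":
        self._queue.append(text)  # bare string, never JSON-encoded
        return self

    def state_update(self) -> "MiniQueue":
        self._queue.append(
            {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
        )
        return self

    def build(self) -> list[str]:
        return [
            item if isinstance(item, str) else json.dumps(item)
            for item in self._queue
        ]


segments = [{"kind": "dialogue", "text": "hello"}]
zero_state = {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}

# Before: a positional list; each slot's meaning lives only in comments.
positional = [
    json.dumps({"segments": segments}),  # slot 0: parse_turn
    "Hi there.",                         # slot 1: narrative stream
    json.dumps(zero_state),              # slot 2: state update, first edge
    json.dumps(zero_state),              # slot 3: state update, second edge
]

# After: the same four slots, declared by name.
built = (
    MiniQueue()
    .parse_turn(segments=segments)
    .narrative("Hi there.")
    .state_update()
    .state_update()
    .build()
)

# The builder adds defaulted keys to slot 0; the other slots match exactly.
same_segments = (
    json.loads(built[0])["segments"] == json.loads(positional[0])["segments"]
)
same_tail = built[1:] == positional[1:]
```

Note that a migrated ``parse_turn`` slot carries the extra ``intent`` and ``landing_state_hint`` keys, so it is a superset of a bare ``{"segments": [...]}`` payload rather than a byte-for-byte copy.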
 Notes on streams
 ----------------
 ``MockLLMClient.stream`` and ``MockLLMClient.generate`` share one queue
 — each pop is one entry, regardless of whether the production code
 streams the response or generates it whole. The narrative service
 streams; classifier services generate. The builder treats both the same:
 ``narrative()`` appends a raw string, the classifier methods append
 JSON-encoded dicts. Both end up in the same flat ``list[str]`` that the
 mock pops from in order.
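The shared-queue behavior described above can be sketched with a toy client. ``ToyClient`` is a hypothetical stand-in, not the real ``MockLLMClient`` (whose method signatures are not shown here); it assumes only what this section states, namely that ``generate`` and ``stream`` pop from one list in order.

```python
import asyncio
import json
from collections.abc import AsyncIterator


class ToyClient:
    # Hypothetical stand-in: one FIFO queue feeds both call styles.
    def __init__(self, canned: list[str]) -> None:
        self._canned = list(canned)

    async def generate(self) -> str:
        # Classifier-style call: return the next entry whole.
        return self._canned.pop(0)

    async def stream(self) -> AsyncIterator[str]:
        # Narrative-style call: pop one entry, then yield it in chunks.
        text = self._canned.pop(0)
        for start in range(0, len(text), 4):
            yield text[start:start + 4]


async def run_turn(client: ToyClient) -> tuple[dict, str, dict]:
    parsed = json.loads(await client.generate())   # slot 0: classifier
    chunks = [chunk async for chunk in client.stream()]  # slot 1: narrative
    state = json.loads(await client.generate())    # slot 2: classifier
    return parsed, "".join(chunks), state


canned = [
    json.dumps({"segments": []}),
    "Hi there.",
    json.dumps({"affinity_delta": 0}),
]
parsed, beat, state = asyncio.run(run_turn(ToyClient(canned)))
```

Because every call consumes exactly one entry, a slot in the wrong position surfaces as a JSON decode error or as JSON text appearing in the narrative, which is the failure mode the named builder methods are meant to make less likely.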
 About 30 tests across the four files listed above still use positional
 arrays; migrating them is Phase 5 work. New tests should prefer this
 builder.
 """
 from __future__ import annotations
 import json
 from typing import Any
 class CannedQueue:
    """Fluent builder for ``MockLLMClient`` canned-response queues.
    Each method appends one item to an internal queue and returns
    ``self`` for chaining. ``build()`` returns the flat ``list[str]``
    suitable for ``MockLLMClient(canned=...)``.
    The queue holds either ``dict`` (JSON-encoded at ``build()`` time)
    or ``str`` (passed through verbatim — used for narrative streams).
    """
    def __init__(self) -> None:
        self._queue: list[Any] = []
    # ------------------------------------------------------------------
    # Narrative stream — bare string, no JSON wrapping.
    # ------------------------------------------------------------------
    def narrative(self, text: str) -> "CannedQueue":
        """Append one streaming narrative response.
        ``MockLLMClient.stream`` pops the next entry from the same queue
        as ``generate`` — a bare string is what the streaming bot beat
        consumes. Use one ``narrative()`` per assistant beat (primary,
        and optionally an interjection / second beat).
        """
        self._queue.append(text)
        return self
    def raw(self, value: str) -> "CannedQueue":
        """Append a raw string (escape hatch for non-classifier calls).
        Most tests should reach for the named helpers — this is here
        for one-offs the builder doesn't model yet.
        """
        self._queue.append(value)
        return self
    # ------------------------------------------------------------------
    # Turn parser — splits user prose into segments.
    # ------------------------------------------------------------------
    def parse_turn(
        self,
        *,
        segments: list[dict] | None = None,
        intent: str = "narrative",
        landing_state_hint: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one ``parse_turn`` classifier response.
        ``intent`` defaults to ``"narrative"``; pass ``"skip_elision"``
        or ``"skip_jump"`` to exercise the natural-language skip paths.
        ``landing_state_hint`` carries the residual descriptor for
        elision skips and is otherwise ignored.
        """
        payload: dict[str, Any] = {
            "segments": segments if segments is not None else [],
            "intent": intent,
            "landing_state_hint": landing_state_hint,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Multi-entity addressee classifier (T74.1).
    # ------------------------------------------------------------------
    def detect_addressee(
        self,
        *,
        addressee_id: str,
        confidence: str = "medium",
        reason: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one ``detect_addressee`` classifier response."""
        payload: dict[str, Any] = {
            "addressee_id": addressee_id,
            "confidence": confidence,
            "reason": reason,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # State-update — one per directed edge per turn.
    # ------------------------------------------------------------------
    def state_update(
        self,
        *,
        affinity_delta: int = 0,
        trust_delta: int = 0,
        knowledge_facts: list | None = None,
        **rest: Any,
    ) -> "CannedQueue":
        """Append one ``apply_state_update`` classifier response.
        Defaults to a benign zero-delta payload — tests that don't care
        about state mutations can call this without arguments. One call
        is required per directed edge that fires after the assistant
        beat (e.g. single-bot non-guest turn = 2 calls; multi-bot guest
        turn = 6 calls).
        """
        payload: dict[str, Any] = {
            "affinity_delta": affinity_delta,
            "trust_delta": trust_delta,
            "knowledge_facts": (
                knowledge_facts if knowledge_facts is not None else []
            ),
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    def zero_state(self) -> "CannedQueue":
        """Alias for ``state_update()`` with all defaults — matches the
        ``_zero_state()`` helper in existing tests.
        """
        return self.state_update()
    # ------------------------------------------------------------------
    # Interjection (T74.2) — silent witness chimes in.
    # ------------------------------------------------------------------
    def detect_interjection(
        self,
        *,
        should_interject: bool,
        reason: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one ``detect_interjection`` classifier response."""
        payload: dict[str, Any] = {
            "should_interject": should_interject,
            "reason": reason,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    def detect_interjection_targeted(
        self,
        *,
        targeted: bool,
        target_id: str | None = None,
        reason: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one targeted-interjection classifier response."""
        payload: dict[str, Any] = {
            "targeted": targeted,
            "target_id": target_id,
            "reason": reason,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Scene-close detector (T26).
    # ------------------------------------------------------------------
    def detect_scene_close(
        self,
        *,
        should_close: bool,
        reason: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one ``detect_scene_close`` classifier response."""
        payload: dict[str, Any] = {
            "should_close": should_close,
            "reason": reason,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Event lifecycle (T52, T61) — per-turn transitions.
    # ------------------------------------------------------------------
    def detect_event_transitions(
        self,
        transitions: list[dict] | None = None,
    ) -> "CannedQueue":
        """Append one ``detect_event_transitions`` classifier response.
        ``transitions`` is a list of ``{"event_id": ..., "new_status":
        "active"|"completed"|"cancelled", "reason": ...}`` dicts. Omit
        the argument, or pass ``None`` or an empty list, to queue a
        response that reports no transitions.
        Note: when no events are seeded, ``detect_event_transitions``
        short-circuits without an LLM call — in that case do NOT append
        this slot.
        """
        payload = {"transitions": transitions if transitions is not None else []}
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Per-POV scene summary (used after scene close).
    # ------------------------------------------------------------------
    def summarize_scene_pov(
        self,
        *,
        summary: str,
        knowledge_facts: list | None = None,
        relationship_summary: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one per-POV scene-summary response.
        Used by ``apply_scene_close_summary`` — one call per witness
        once a scene closes.
        """
        payload: dict[str, Any] = {
            "summary": summary,
            "knowledge_facts": (
                knowledge_facts if knowledge_facts is not None else []
            ),
            "relationship_summary": relationship_summary,
        }
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Thread detection (Phase 3 §3.3).
    # ------------------------------------------------------------------
    def detect_threads(
        self,
        candidates: list[dict] | None = None,
    ) -> "CannedQueue":
        """Append one ``detect_threads`` classifier response.
        ``candidates`` is a list of ``{"action": "open"|"update",
        "title": ..., "summary": ..., "existing_thread_id": ...}`` dicts.
        """
        payload = {"candidates": candidates if candidates is not None else []}
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Meanwhile digest — narrative summary of what happened off-screen.
    # ------------------------------------------------------------------
    def meanwhile_digest(self, summary: str) -> "CannedQueue":
        """Append one meanwhile-digest narrative response.
        The digest service streams the digest as plain text (not JSON)
        so this is a thin wrapper over ``narrative``/``raw`` for
        readability at the call site.
        """
        return self.narrative(summary)
    # ------------------------------------------------------------------
    # Significance scorer (background worker; rarely hit in unit tests
    # but available for completeness).
    # ------------------------------------------------------------------
    def score_significance(
        self,
        *,
        score: float = 0.0,
        reason: str = "",
        **rest: Any,
    ) -> "CannedQueue":
        """Append one significance-scoring classifier response."""
        payload: dict[str, Any] = {"score": score, "reason": reason}
        payload.update(rest)
        self._queue.append(payload)
        return self
    # ------------------------------------------------------------------
    # Build / introspection.
    # ------------------------------------------------------------------
    def build(self) -> list[str]:
        """Return the flat ``list[str]`` queue for ``MockLLMClient``.
        Dict items are JSON-encoded; string items are passed through
        verbatim (so streaming responses retain their raw form).
        """
        out: list[str] = []
        for item in self._queue:
            if isinstance(item, str):
                out.append(item)
            else:
                out.append(json.dumps(item))
        return out
    def __len__(self) -> int:
        return len(self._queue)
@@ -0,0 +1,231 @@
 """Tests for the backfill_embeddings script (T112, Phase 4.5).
 Phase 4 shipped a backfill that walked memories *without* an embedding
 row and produced a vector for each (deterministic pseudo path). T112
 adds a ``--re-embed-all`` flag that walks **every** memory regardless
 of whether it already has an embeddings row, so operators can swap
 embedding models and have the existing rows replaced (the
 ``embedding_indexed`` projector is INSERT OR REPLACE).
 These tests exercise the script's ``main()`` directly via asyncio —
 shell-out via subprocess would also work but importing keeps the
 fixture surface small and the failure mode clearer.
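The two selection modes and the replace-in-place behavior can be illustrated with plain ``sqlite3``. This is a sketch under stated assumptions: the tables are a hypothetical, cut-down schema, ``memory_id`` is assumed to be the primary key of ``embeddings`` (which is what lets ``INSERT OR REPLACE`` overwrite a stale row), and the "every memory" query is an assumed shape rather than the script's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: three memories, one stale embedding.
conn.executescript(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT);"
    "CREATE TABLE embeddings (memory_id INTEGER PRIMARY KEY, model TEXT NOT NULL);"
    "INSERT INTO memories (id, pov_summary) VALUES (1, 'a'), (2, 'b'), (3, 'c');"
    "INSERT INTO embeddings (memory_id, model) VALUES (1, 'stale-model');"
)

# Default mode: only memories without an embeddings row.
MISSING_ONLY = (
    "SELECT m.id FROM memories m "
    "LEFT JOIN embeddings e ON e.memory_id = m.id "
    "WHERE e.memory_id IS NULL "
    "ORDER BY m.id"
)
# Re-embed mode: every memory (assumed query shape for illustration).
EVERY_MEMORY = "SELECT id FROM memories ORDER BY id"

default_ids = [row[0] for row in conn.execute(MISSING_ONLY)]
all_ids = [row[0] for row in conn.execute(EVERY_MEMORY)]

# Re-embedding writes one row per memory; the stale row is replaced.
for memory_id in all_ids:
    conn.execute(
        "INSERT OR REPLACE INTO embeddings (memory_id, model) VALUES (?, ?)",
        (memory_id, "new-model"),
    )

models = conn.execute(
    "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
).fetchall()
```

In the default mode memory 1 would be skipped and keep its stale tag, which is the distinction the tests below assert on.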
 """
 from __future__ import annotations
 from pathlib import Path
 from unittest.mock import patch
 import pytest
 from chat.db.connection import open_db
 from chat.db.migrate import apply_migrations
 from chat.eventlog.log import append_and_apply, append_event
 from chat.eventlog.projector import project
 from chat.services.embeddings import DEFAULT_EMBEDDING_MODEL
 # Trigger handler registration for projection.
 import chat.state.embeddings  # noqa: F401
 import chat.state.entities  # noqa: F401
 import chat.state.memory  # noqa: F401
 import chat.state.world  # noqa: F401
 import scripts.backfill_embeddings as backfill
 def _seed(db_path: Path, count: int) -> list[int]:
    """Seed ``count`` memory rows for ``bot_a``; return their ids."""
    with open_db(db_path) as conn:
        append_event(
            conn,
            kind="bot_authored",
            payload={
                "id": "bot_a",
                "name": "BotA",
                "persona": "...",
                "voice_samples": [],
                "traits": [],
                "backstory": "",
                "initial_relationship_to_you": "",
                "kickoff_prose": "",
            },
        )
        append_event(
            conn,
            kind="chat_created",
            payload={
                "id": "chat_bot_a",
                "host_bot_id": "bot_a",
                "initial_time": "2026-04-26T20:00:00+00:00",
                "narrative_anchor": "Day 1",
                "weather": "",
            },
        )
        for i in range(count):
            append_event(
                conn,
                kind="memory_written",
                payload={
                    "owner_id": "bot_a",
                    "chat_id": "chat_bot_a",
                    "pov_summary": f"memory text {i}",
                    "witness_you": 1,
                    "witness_host": 1,
                    "witness_guest": 0,
                    "source": "direct",
                    "reliability": 1.0,
                    "significance": 1,
                    "pinned": 0,
                    "auto_pinned": 0,
                },
            )
        project(conn)
        return [
            r[0]
            for r in conn.execute(
                "SELECT id FROM memories WHERE owner_id = 'bot_a' ORDER BY id"
            ).fetchall()
        ]
 def _seed_embedding(db_path: Path, memory_id: int, model: str = "stale-model") -> None:
    """Insert a stale ``embedding_indexed`` event so the row already
    exists in ``embeddings`` (and the default backfill would skip it)."""
    with open_db(db_path) as conn:
        append_and_apply(
            conn,
            kind="embedding_indexed",
            payload={
                "memory_id": memory_id,
                "model": model,
                "dim": 3,
                "vector": [0.0, 0.0, 0.0],
            },
        )
@pytest.mark.asyncio
 async def test_re_embed_all_walks_every_memory(tmp_path, monkeypatch):
    """``--re-embed-all`` re-embeds memories that already have rows in
    ``embeddings`` (default mode skips them). After the run, every
    memory should have an updated embedding tagged with the configured
    model (the projector replaces stale rows in place)."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    memory_ids = _seed(db, count=3)
    # Pre-seed stale embeddings on two of the three memories so the
    # default path would skip them and only ``--re-embed-all`` covers
    # everything.
    _seed_embedding(db, memory_ids[0])
    _seed_embedding(db, memory_ids[1])
    cfg = tmp_path / "config.toml"
    cfg.write_text(
        'featherless_api_key = "x"\n'
        f'db_path = "{db}"\n'
        f'data_dir = "{tmp_path}"\n'
    )
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    with patch("sys.argv", ["backfill_embeddings.py", "--re-embed-all"]):
        await backfill.main()
    # All three memories now have a fresh embedding tagged with the
    # default pseudo model (replacing the stale rows).
    with open_db(db) as conn:
        rows = conn.execute(
            "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
        ).fetchall()
        assert len(rows) == 3
        for mid, model in rows:
            assert mid in memory_ids
            assert model == DEFAULT_EMBEDDING_MODEL
@pytest.mark.asyncio
 async def test_default_backfill_only_walks_missing(tmp_path, monkeypatch):
    """Without ``--re-embed-all``, the script keeps the Phase 4
    behavior — memories with an existing embedding row are left
    alone (their stale-model tag survives)."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    memory_ids = _seed(db, count=2)
    _seed_embedding(db, memory_ids[0], model="stale-model")
    # memory_ids[1] has no embedding yet.
    cfg = tmp_path / "config.toml"
    cfg.write_text(
        'featherless_api_key = "x"\n'
        f'db_path = "{db}"\n'
        f'data_dir = "{tmp_path}"\n'
    )
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    with patch("sys.argv", ["backfill_embeddings.py"]):
        await backfill.main()
    with open_db(db) as conn:
        rows = dict(
            conn.execute(
                "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
            ).fetchall()
        )
        # Stale row preserved; only the missing one was filled.
        assert rows[memory_ids[0]] == "stale-model"
        assert rows[memory_ids[1]] == DEFAULT_EMBEDDING_MODEL
@pytest.mark.asyncio
 async def test_re_embed_all_respects_model_arg(tmp_path, monkeypatch):
    """The ``--model`` flag overrides ``Settings.embedding_model``.
    With a non-default model and a client that returns canned vectors,
    every memory is re-embedded with the supplied model tag."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    memory_ids = _seed(db, count=2)
    _seed_embedding(db, memory_ids[0])
    cfg = tmp_path / "config.toml"
    cfg.write_text(
        'featherless_api_key = "x"\n'
        f'db_path = "{db}"\n'
        f'data_dir = "{tmp_path}"\n'
    )
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    # Patch the client factory the script uses to produce a Mock with
    # canned embeddings — one per memory.
    from chat.llm.mock import MockLLMClient
    canned_vec = [0.1] * 384
    def _factory(_settings):
        return MockLLMClient(
            canned=[],
            canned_embeddings=[list(canned_vec) for _ in memory_ids],
        )
    monkeypatch.setattr(backfill, "_build_client", _factory)
    with patch(
        "sys.argv",
        [
            "backfill_embeddings.py",
            "--re-embed-all",
            "--model",
            "bge-small-en-v1.5",
        ],
    ):
        await backfill.main()
    with open_db(db) as conn:
        rows = conn.execute(
            "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
        ).fetchall()
        assert len(rows) == 2
        for _, model in rows:
            assert model == "bge-small-en-v1.5"
@@ -1,11 +1,19 @@
 from __future__ import annotations
 import logging
 from chat.db.connection import open_db
 from chat.db.migrate import apply_migrations
 from chat.eventlog.log import append_event
 from chat.eventlog.projector import project
 import chat.state.branches  # registers handlers
-from chat.state.branches import active_branch, get_branch, list_branches
+from chat.state.branches import (
    _NO_HEAD_CLAMP,
    active_branch,
    active_branch_event_ids,
    get_branch,
    list_branches,
 )
 def test_main_branch_bootstrapped_by_migration(tmp_path):
@@ -139,3 +147,116 @@ def test_list_branches_returns_all(tmp_path):
        names = [b["name"] for b in list_branches(conn)]
        assert "main" in names
        assert "experiment" in names
 def test_branch_switched_unknown_name_warns(tmp_path, caplog):
    """Switching to a nonexistent branch logs a warning and leaves no branch active.
    The previous behavior silently cleared is_active flags and applied no UPDATE
    when the named branch did not exist. T103 makes that condition observable
    by emitting a warning while preserving the existing (zero-active) outcome.
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        with caplog.at_level(logging.WARNING, logger="chat.state.branches"):
            append_event(
                conn,
                kind="branch_switched",
                payload={"name": "does_not_exist"},
            )
            project(conn)
        # A warning was emitted naming the missing branch.
        warnings = [
            r for r in caplog.records
            if r.levelno == logging.WARNING and r.name == "chat.state.branches"
        ]
        assert warnings, "expected a warning for unknown branch name"
        assert any("does_not_exist" in r.getMessage() for r in warnings)
        # Existing behavior preserved: no branch is active after the switch.
        assert active_branch(conn) is None
        # The unknown name was not inserted as a side effect.
        assert get_branch(conn, "does_not_exist") is None
 def test_active_branch_event_ids_bootstrap_main_returns_no_clamp(tmp_path):
    """Bootstrap "main" (origin=0, head=0) reads as the no-clamp sentinel.
    Migration 0013 seeds main with both event-id columns at 0; production
    today never emits ``branch_head_updated`` for main, so head stays at 0
    even as events accumulate. The helper treats this exact bootstrap
    state as "all events visible" (lower bound 0, upper bound BIG_INT) so
    every existing reader stays branch-agnostic until a non-main branch
    becomes active.
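A minimal sketch of the rule this group of tests pins down, assuming a helper that receives the active branch row as a dict (or ``None``). ``event_id_window`` and ``NO_HEAD_CLAMP`` are hypothetical names used for illustration; the real ``active_branch_event_ids`` and ``_NO_HEAD_CLAMP`` live in ``chat.state.branches`` and may differ in detail.

```python
import sys
from typing import Optional

# Hypothetical stand-in for the BIG_INT upper bound.
NO_HEAD_CLAMP = sys.maxsize


def event_id_window(branch: Optional[dict]) -> tuple[int, int]:
    # No active branch at all: hand readers the full log.
    if branch is None:
        return (0, NO_HEAD_CLAMP)
    origin = branch["origin_event_id"]
    head = branch["head_event_id"]
    # Bootstrap main (origin=0, head=0) is the no-clamp sentinel.
    if branch["name"] == "main" and origin == 0 and head == 0:
        return (0, NO_HEAD_CLAMP)
    # Any other branch surfaces its literal window.
    return (origin, head)


bootstrap = event_id_window(
    {"name": "main", "origin_event_id": 0, "head_event_id": 0}
)
experiment = event_id_window(
    {"name": "experiment", "origin_event_id": 10, "head_event_id": 20}
)
stub = event_id_window(
    {"name": "stub", "origin_event_id": 0, "head_event_id": 0}
)
missing = event_id_window(None)
```

The sentinel check is deliberately keyed on the name as well as the two zeros, so a non-main branch parked at ``(0, 0)`` still clamps to an empty window.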
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        origin, head = active_branch_event_ids(conn)
        assert origin == 0
        assert head == _NO_HEAD_CLAMP
 def test_active_branch_event_ids_no_active_branch_falls_through(tmp_path):
    """No active branch row at all → defensive ``(0, BIG_INT)``.
    A switch to an unknown branch leaves zero rows with ``is_active=1``;
    ``active_branch`` returns None. The helper must still hand readers a
    workable range (the full log) so the read pipeline doesn't crash on
    an inconsistent metadata state.
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        # Switching to a nonexistent branch clears is_active flags
        # without setting any other branch active.
        append_event(
            conn,
            kind="branch_switched",
            payload={"name": "does_not_exist"},
        )
        project(conn)
        assert active_branch(conn) is None
        origin, head = active_branch_event_ids(conn)
        assert origin == 0
        assert head == _NO_HEAD_CLAMP
 def test_active_branch_event_ids_returns_actual_range_for_non_main(tmp_path):
    """Non-main branches return their literal ``(origin, head)`` window.
    A branch created at origin=10 + bumped to head=20 must surface as
    (10, 20) so readers' ``BETWEEN`` clamp scopes to that window.
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        append_event(
            conn,
            kind="branch_created",
            payload={
                "name": "experiment",
                "origin_event_id": 10,
                "head_event_id": 10,
                "chat_id": "c1",
            },
        )
        append_event(
            conn,
            kind="branch_head_updated",
            payload={"name": "experiment", "head_event_id": 20},
        )
        append_event(
            conn,
            kind="branch_switched",
            payload={"name": "experiment"},
        )
        project(conn)
        origin, head = active_branch_event_ids(conn)
        assert origin == 10
        assert head == 20
@@ -129,3 +129,279 @@ def test_list_branches_with_metadata_includes_event_count(tmp_path):
        assert rows["exp"]["origin_event_id"] == 10
        assert rows["exp"]["head_event_id"] == 15
        assert rows["exp"]["event_count"] == 6
 # ---------------------------------------------------------------------------
 # T113 read-side filter — cross-feature tests.
 # ---------------------------------------------------------------------------
 #
 # These exercise the active-branch event-id clamp through every reader
 # the spec called out: ``read_recent_dialogue`` (turn_common),
 # ``_read_recent_dialogue`` (scene_summarize), and ``search_memories``
 # (memory). They drive the readers via real event-log inserts + branch
 # switches so the integration is end-to-end.
 def _seed_user_turn(conn, chat_id: str, prose: str) -> int:
    return append_and_apply(
        conn,
        kind="user_turn",
        payload={"chat_id": chat_id, "prose": prose, "segments": []},
    )
 def test_read_recent_dialogue_respects_active_branch_head(tmp_path):
    """T113 spec test 1: dialogue reader clamps to active branch head.
    Seed 10 user turns; create a branch with origin=1 + head=5 and switch
    to it; assert ``read_recent_dialogue`` only returns the first 5
    turns. (The 5 events with id 6..10 fall outside ``[1, 5]``.)
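The clamp can be pictured with a bare ``sqlite3`` query. The ``event_log`` table here is a hypothetical, cut-down version, and the ``BETWEEN`` filter is an assumption about how the reader applies the ``(origin, head)`` window, based on how the docstrings in these tests describe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified event log: ids are assigned 1..10 on insert.
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, prose TEXT)")
conn.executemany(
    "INSERT INTO event_log (prose) VALUES (?)",
    [(f"turn {i}",) for i in range(10)],
)


def visible(origin: int, head: int) -> list[str]:
    # Inclusive window on event id, oldest first.
    rows = conn.execute(
        "SELECT prose FROM event_log WHERE id BETWEEN ? AND ? ORDER BY id",
        (origin, head),
    )
    return [row[0] for row in rows]


first_five = visible(1, 5)   # branch window [1, 5]
nothing = visible(0, 0)      # non-sentinel (0, 0) window matches no row
```

Since row ids start at 1, a literal ``(0, 0)`` window matches nothing, which is why the bootstrap-main case needs its own no-clamp handling.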
    """
    from chat.services.turn_common import read_recent_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(10)]
        # 5 events visible after the switch.
        branch_from_event(
            conn, name="halfway", origin_event_id=ids[0], chat_id="c1"
        )
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "halfway", "head_event_id": ids[4]},
        )
        switch_active_branch(conn, name="halfway")
        rows = read_recent_dialogue(conn, "c1")
        # The reader returns oldest-first, so the visible-set is the
        # first 5 turns.
        assert len(rows) == 5
        assert [r["text"] for r in rows] == [f"turn {i}" for i in range(5)]
 def test_search_memories_respects_active_branch_head(tmp_path):
    """T113 spec test 2: memory search clamps to active branch head via
    ``memories.event_id``. Memories whose projecting event lands outside
    the clamp drop out of FTS results."""
    from chat.eventlog.log import append_and_apply as _aa
    from chat.state.memory import search_memories
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        # Two memories projected from real events. The projector handler
        # stamps memories.event_id from the projecting event's id.
        ev_a = _aa(
            conn,
            kind="memory_written",
            payload={
                "owner_id": "host_bot",
                "chat_id": "c1",
                "scene_id": 1,
                "pov_summary": "alpha keyword present",
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 0,
            },
        )
        ev_b = _aa(
            conn,
            kind="memory_written",
            payload={
                "owner_id": "host_bot",
                "chat_id": "c1",
                "scene_id": 1,
                "pov_summary": "alpha keyword present too",
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 0,
            },
        )
        # Branch clamps to ev_a only (head = ev_a; ev_b sits past head).
        branch_from_event(
            conn, name="early", origin_event_id=ev_a, chat_id="c1"
        )
        switch_active_branch(conn, name="early")
        results = search_memories(conn, "host_bot", "host", "alpha")
        # Only the first memory should surface — the second's event_id
        # exceeds the active branch head.
        ids = [r["event_id"] for r in results]
        assert ev_a in ids
        assert ev_b not in ids
 def test_branch_switch_changes_visible_events(tmp_path):
    """T113 spec test 3: switching branches mid-flight changes the read
    immediately. ``read_recent_dialogue`` re-queries on every call."""
    from chat.services.turn_common import read_recent_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(6)]
        branch_from_event(
            conn, name="early", origin_event_id=ids[0], chat_id="c1"
        )
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "early", "head_event_id": ids[2]},
        )
        branch_from_event(
            conn, name="late", origin_event_id=ids[3], chat_id="c1"
        )
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "late", "head_event_id": ids[5]},
        )
        switch_active_branch(conn, name="early")
        early_rows = [r["text"] for r in read_recent_dialogue(conn, "c1")]
        assert early_rows == ["turn 0", "turn 1", "turn 2"]
        switch_active_branch(conn, name="late")
        late_rows = [r["text"] for r in read_recent_dialogue(conn, "c1")]
        assert late_rows == ["turn 3", "turn 4", "turn 5"]
 def test_non_main_branch_with_head_zero_returns_empty(tmp_path):
    """T113 spec test 4: a non-main branch with head=0 returns empty.
    The bootstrap-main sentinel only fires for ``name=="main", origin=0,
    head=0``. A different branch parked at ``origin=0, head=0`` is not a
    sentinel and the ``BETWEEN 0 AND 0`` clamp filters out every real
    event_log row (rowids start at 1)."""
    from chat.services.turn_common import read_recent_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        # Seed one real event_log row (id >= 1) so the ``BETWEEN 0 AND 0``
        # clamp has something to filter out; otherwise [] would be trivial.
        _seed_user_turn(conn, "c1", "turn 0")
        # Force-create a branch at origin=0, head=0 (NOT main). This is an
        # artificial state — production never produces it — but it's the
        # cleanest way to drive the documented edge case.
        append_and_apply(
            conn,
            kind="branch_created",
            payload={
                "name": "stub",
                "origin_event_id": 0,
                "head_event_id": 0,
                "chat_id": "c1",
            },
        )
        switch_active_branch(conn, name="stub")
        rows = read_recent_dialogue(conn, "c1")
        assert rows == []
 def test_no_active_branch_falls_through_to_all_events(tmp_path):
    """T113 spec test 5: with no active branch (e.g. a switch to an
    unknown name cleared all is_active flags), readers see the full log
    via the ``(0, BIG_INT)`` defensive default."""
    from chat.services.turn_common import read_recent_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        for i in range(3):
            _seed_user_turn(conn, "c1", f"turn {i}")
        # Switching to an unknown branch leaves zero rows with is_active=1.
        append_and_apply(
            conn,
            kind="branch_switched",
            payload={"name": "missing"},
        )
        from chat.state.branches import active_branch as _ab
        assert _ab(conn) is None
        rows = read_recent_dialogue(conn, "c1")
        assert [r["text"] for r in rows] == ["turn 0", "turn 1", "turn 2"]
 def test_scene_summarize_read_recent_dialogue_respects_branch(tmp_path):
    """T113: ``scene_summarize._read_recent_dialogue`` (the scene-close
    summary input) also clamps to the active branch range."""
    from chat.services.scene_summarize import _read_recent_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(6)]
        branch_from_event(
            conn, name="early", origin_event_id=ids[0], chat_id="c1"
        )
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "early", "head_event_id": ids[2]},
        )
        switch_active_branch(conn, name="early")
        rows = _read_recent_dialogue(conn, "c1")
        assert [r["text"] for r in rows] == ["turn 0", "turn 1", "turn 2"]
 def test_meanwhile_dialogue_reader_respects_branch(tmp_path):
    """T113: meanwhile prompt-context reader also clamps to the active
    branch. The meanwhile reader filters by ``meanwhile_scene_id``; the
    branch filter is composed on top of that filter."""
    from chat.web.meanwhile import _read_recent_meanwhile_dialogue
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        # Seed user turns + meanwhile assistant turns interleaved so the
        # branch's event-id clamp spans both kinds.
        u1 = _seed_user_turn(conn, "c1", "u1")
        a1 = append_and_apply(
            conn,
            kind="assistant_turn",
            payload={
                "chat_id": "c1",
                "speaker_id": "host",
                "text": "a1",
                "meanwhile_scene_id": 7,
            },
        )
        # Past-head turn should NOT appear once we switch to ``early``.
        a2 = append_and_apply(
            conn,
            kind="assistant_turn",
            payload={
                "chat_id": "c1",
                "speaker_id": "guest",
                "text": "a2",
                "meanwhile_scene_id": 7,
            },
        )
        branch_from_event(
            conn, name="early", origin_event_id=u1, chat_id="c1"
        )
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "early", "head_event_id": a1},
        )
        switch_active_branch(conn, name="early")
        rows = _read_recent_meanwhile_dialogue(conn, "c1", scene_id=7)
        texts = [r["text"] for r in rows]
        assert "a1" in texts
        assert "a2" not in texts
        # Suppress the "unused" linter warning while keeping the binding
        # readable for the test narrative.
        _ = a2
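Reviewer note: the branch tests above all exercise the same read-side rule, an inclusive event-id window taken from the active branch, with a wide-open default when no branch is active. A minimal sketch of that rule follows. The `turns` table, the `visible_range` and `read_texts` helpers, and the `BIG_INT` value are hypothetical stand-ins rather than the project's actual schema or API, and the bootstrap-main sentinel is deliberately left out.

```python
import sqlite3

# Hypothetical "no upper bound" value; the real constant is not shown in the diff.
BIG_INT = 2**63 - 1


def visible_range(active_branch):
    """Return the inclusive (low, high) event-id window for a branch.

    With no active branch, fall back to the defensive (0, BIG_INT) default
    so readers see the whole log.
    """
    if active_branch is None:
        return (0, BIG_INT)
    return (active_branch["origin"], active_branch["head"])


def read_texts(conn, active_branch):
    """Read turn texts clamped to the active branch's id window."""
    low, high = visible_range(active_branch)
    rows = conn.execute(
        "SELECT text FROM turns WHERE id BETWEEN ? AND ? ORDER BY id",
        (low, high),
    ).fetchall()
    return [r[0] for r in rows]


# Illustrative table: rowids start at 1, as in the event log described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE turns (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO turns (text) VALUES (?)",
    [(f"turn {i}",) for i in range(6)],
)

early = read_texts(conn, {"origin": 1, "head": 3})
stub = read_texts(conn, {"origin": 0, "head": 0})
everything = read_texts(conn, None)
```

Under these assumptions a branch parked at `origin=0, head=0` matches nothing because no row has id 0, which is the edge case the head-zero test pins.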
@@ -24,3 +24,25 @@ def test_chat_db_path_env_overrides_default(tmp_path, monkeypatch):
    (tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
    s = load_settings()
    assert s.db_path == tmp_path / "alt.db"
 def test_embedding_model_defaults_to_pseudo(tmp_path, monkeypatch):
    """T112: ``embedding_model`` defaults to the deterministic pseudo
    so existing zero-config installs keep the Phase 4 behavior."""
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(tmp_path / "config.toml"))
    (tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
    s = load_settings()
    assert s.embedding_model == "pseudo-sha256-384"
 def test_embedding_model_overridable_via_toml(tmp_path, monkeypatch):
    """T112: operators swap the embedding model by editing config.toml.
    The new value flows through to the embedding worker at startup."""
    cfg = tmp_path / "config.toml"
    cfg.write_text(
        'featherless_api_key = "x"\n'
        'embedding_model = "bge-small-en-v1.5"\n'
    )
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    s = load_settings()
    assert s.embedding_model == "bge-small-en-v1.5"
@@ -458,6 +458,183 @@ def test_t98_4_delete_invokes_rewind_and_drops_cascade(client, tmp_path):
            assert row is None, f"event {ev_id} should have been deleted"
 def test_delete_impact_modal_uses_jinja_partial(client, tmp_path):
    """T110.3: the modal HTML is rendered from a Jinja partial
    (`_delete_impact_modal.html`) rather than f-string concatenation in
    Python. Verify the partial-rendered shape: the wrapping
    ``delete-impact-modal`` div, the cascade list, and the confirm form.
    The partial inherits Jinja2 autoescape so HTML safety follows
    automatically — the explicit ``html.escape()`` calls from T110.2
    become redundant once this lands.
    """
    db = tmp_path / "test.db"
    _seed_chat(db)
    user_id, _bot_id = _seed_turns(db)
    response = client.get(
        f"/chats/chat_bot_a/drawer/turn/delete-preview/{user_id}"
    )
    assert response.status_code == 200
    body = response.text
    # Markup shape that the partial produces. Double-quoted attributes
    # signal Jinja rendering (the prior f-string used single quotes).
    assert '<div class="delete-impact-modal">' in body
    assert '<ul class="delete-impact-cascade">' in body
    # The confirm form still posts to the same delete route.
    assert f"/chats/chat_bot_a/drawer/turn/delete/{user_id}" in body
    assert "Confirm delete" in body
 def test_delete_impact_modal_escapes_user_controllable_strings(client, tmp_path):
    """T110.2: defense-in-depth — fields embedded in the modal HTML come
    from event payloads (turn prose, scene timestamps, etc.) which are
    ultimately user-controllable. Wrap them with ``html.escape`` so a
    payload like ``<script>alert(1)</script>`` renders as inert text and
    doesn't leak through into the rendered modal as actual markup.
    """
    db = tmp_path / "test.db"
    _seed_chat(db)
    # Seed a user_turn whose prose contains an HTML-script payload. The
    # modal renders ``description = "turn N (you: <prose excerpt>)"`` so
    # the prose flows verbatim into the cascade list <li>.
    with open_db(db) as conn:
        evil_id = append_and_apply(
            conn,
            kind="user_turn",
            payload={
                "chat_id": "chat_bot_a",
                "prose": "<script>alert('xss')</script>",
                "segments": [],
            },
        )
    response = client.get(
        f"/chats/chat_bot_a/drawer/turn/delete-preview/{evil_id}"
    )
    assert response.status_code == 200
    body = response.text
    # Raw <script> must NOT survive into the rendered HTML. The escaped
    # form (&lt;script&gt;) is what we want to see instead.
    assert "<script>alert" not in body
    assert "&lt;script&gt;alert" in body
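Reviewer note: the escaping behaviour asserted above can be reproduced with the standard library alone. The sketch below only illustrates what `html.escape` does to the seeded payload. The `<li>` template string is a hypothetical stand-in for the modal markup, not the project's partial.

```python
import html

# Same hostile payload the test seeds into the user_turn prose.
prose = "<script>alert('xss')</script>"

# Hypothetical cascade-list item; the real markup lives in the Jinja partial.
item = f"<li>turn 1 (you: {html.escape(prose)})</li>"
```

`html.escape` also escapes quote characters by default, so the single quotes in the payload are converted to character references along with the angle brackets.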
 def test_bulk_significance_re_rate_emits_manual_edit_per_memory(client, tmp_path):
    """T110.4: bulk significance re-rate fans out into one
    ``manual_edit`` event per matching memory — preserving the per-row
    audit trail (and reversibility) instead of collapsing everything
    into a single bulk event.
    Seed five memories at significance 0, bulk re-rate 0 -> 2, and
    verify five new ``memory_significance`` ``manual_edit`` rows landed
    AND every memory now sits at significance 2.
    """
    db = tmp_path / "test.db"
    _seed_chat(db)
    # Five memories at significance 0.
    with open_db(db) as conn:
        for i in range(5):
            append_and_apply(
                conn,
                kind="memory_written",
                payload={
                    "owner_id": "bot_a",
                    "chat_id": "chat_bot_a",
                    "pov_summary": f"low-sig memory {i}",
                    "witness_you": 1,
                    "witness_host": 1,
                    "witness_guest": 0,
                    "significance": 0,
                },
            )
        # Plus one memory at significance 1 to verify the re-rate is
        # scoped to ``level_from`` and doesn't sweep the whole chat.
        append_and_apply(
            conn,
            kind="memory_written",
            payload={
                "owner_id": "bot_a",
                "chat_id": "chat_bot_a",
                "pov_summary": "already-rated memory",
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 0,
                "significance": 1,
            },
        )
        prior_manual_edits = conn.execute(
            "SELECT COUNT(*) FROM event_log WHERE kind = 'manual_edit'"
        ).fetchone()[0]
    response = client.post(
        "/chats/chat_bot_a/drawer/memory/significance/bulk",
        data={"level_from": "0", "level_to": "2"},
    )
    assert response.status_code == 200
    with open_db(db) as conn:
        # Five new manual_edit rows, one per matching memory.
        new_manual_edits = conn.execute(
            "SELECT COUNT(*) FROM event_log WHERE kind = 'manual_edit'"
        ).fetchone()[0]
        assert new_manual_edits - prior_manual_edits == 5
        # Every emitted edit is a memory_significance edit with prior=0
        # and new=2.
        import json as _json
        rows = conn.execute(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'manual_edit' "
            "ORDER BY id DESC LIMIT 5"
        ).fetchall()
        for r in rows:
            payload = _json.loads(r[0])
            assert payload["target_kind"] == "memory_significance"
            assert payload["prior_value"] == 0
            assert payload["new_value"] == 2
        # Projection caught up — five memories at sig=2, the untouched
        # one stays at sig=1, none remain at sig=0.
        dist = dict(
            conn.execute(
                "SELECT significance, COUNT(*) FROM memories "
                "WHERE chat_id = 'chat_bot_a' GROUP BY significance"
            ).fetchall()
        )
        assert dist.get(0, 0) == 0
        assert dist.get(1, 0) == 1
        assert dist.get(2, 0) == 5
 def test_delete_turn_with_event_id_zero_returns_400(client, tmp_path):
    """T110.1: ``event_id <= 0`` is an obvious client error and must NOT
    silently rewind the entire log via ``after_event_id = -1``. The route
    rejects it with 400 so the audit trail stays intact.
    """
    db = tmp_path / "test.db"
    _seed_chat(db)
    _seed_turns(db)
    # Sanity: events present before the bad request.
    with open_db(db) as conn:
        before = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
        assert before > 0
    response = client.post("/chats/chat_bot_a/drawer/turn/delete/0")
    assert response.status_code == 400
    # And the log was NOT truncated.
    with open_db(db) as conn:
        after = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
        assert after == before
 # ---------------------------------------------------------------------------
 # T98.5 — remaining v1 edits (chat narrative anchor + weather).
 # ---------------------------------------------------------------------------
@@ -20,6 +20,7 @@ The pseudo path doesn't touch the LLMClient, so we pass an empty
 from __future__ import annotations
 import logging
 import math
 import pytest
@@ -89,3 +90,81 @@ async def test_generate_embedding_unit_normalized():
    result = await generate_embedding(_client(), text="some non-empty text")
    norm_sq = sum(x * x for x in result.vector)
    assert math.isclose(norm_sq, 1.0, abs_tol=1e-6)
@pytest.mark.asyncio
 async def test_generate_embedding_non_default_model_logs_warning(caplog):
    """T107: non-default model falls through to fallback and must warn.
    A Phase 4.5+ caller pointing at a real model that isn't yet wired
    up would otherwise silently degrade (zero vector → useless cosine).
    The warning surfaces the misconfiguration in logs.
    """
    caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
    result = await generate_embedding(_client(), text="hello", model="real-model")
    # Behavior unchanged: still returns the fallback sentinel.
    assert result.model == FALLBACK_EMBEDDING_MODEL == "fallback"
    assert all(x == 0.0 for x in result.vector)
    # Warning fired and names the offending model.
    warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
    assert any("non-default model" in r.getMessage() for r in warnings)
    assert any("real-model" in r.getMessage() for r in warnings)
@pytest.mark.asyncio
 async def test_generate_embedding_default_model_does_not_warn(caplog):
    """T107: the silent default path must stay silent."""
    caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
    await generate_embedding(_client(), text="hello")
    warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
    assert warnings == []
@pytest.mark.asyncio
 async def test_embed_routes_to_client_when_non_default_model():
    """T112: when a non-default ``model`` is requested, generate_embedding
    routes through ``client.embed(text, model=...)`` and wraps the
    returned vector in an EmbeddingResult tagged with the requested
    model (NOT the fallback sentinel)."""
    canned = [0.1, 0.2, 0.3, 0.4]
    client = MockLLMClient(canned=[], canned_embeddings=[canned])
    result = await generate_embedding(
        client, text="hello world", model="bge-small-en-v1.5"
    )
    assert result.vector == canned
    assert result.model == "bge-small-en-v1.5"
    assert result.dim == len(canned)
@pytest.mark.asyncio
 async def test_embed_falls_back_on_client_failure(caplog):
    """T112: when ``client.embed`` raises (e.g. NotImplementedError on
    Featherless, or a transient network error), generate_embedding logs
    the existing T107 warning and returns the zero-vector fallback so
    callers detect the sentinel and skip indexing."""
    class _FailingClient:
        async def generate(self, messages, *, model, **params):  # pragma: no cover
            raise AssertionError("generate must not be called")
        def stream(self, messages, *, model, **params):  # pragma: no cover
            raise AssertionError("stream must not be called")
        async def embed(self, text, *, model):
            raise NotImplementedError("provider does not expose embeddings")
    caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
    result = await generate_embedding(
        _FailingClient(), text="hello", model="bge-small-en-v1.5"
    )
    assert result.model == FALLBACK_EMBEDDING_MODEL == "fallback"
    assert len(result.vector) == DEFAULT_EMBEDDING_DIM
    assert all(x == 0.0 for x in result.vector)
    # Existing T107 warning fires (re-used from the new exception branch).
    warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
    assert any("bge-small-en-v1.5" in r.getMessage() for r in warnings)
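Reviewer note: the T112 tests describe a "try the client, degrade to a zero vector on failure" pattern. A minimal sketch of that pattern is below, assuming a client with an async `embed(text, *, model)` method as in the tests. `Result`, `embed_or_fallback`, `DIM`, and both client classes are hypothetical names for illustration, not the project's `generate_embedding` implementation.

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger("embedding_sketch")

FALLBACK_MODEL = "fallback"  # sentinel name taken from the tests above
DIM = 4  # hypothetical dimension; the project uses its own default


@dataclass
class Result:
    vector: list
    model: str


async def embed_or_fallback(client, text, model):
    """Call client.embed; on any failure, warn and return a zero vector."""
    try:
        vector = await client.embed(text, model=model)
    except Exception as exc:
        logger.warning(
            "embedding with model %r failed (%s); using zero vector", model, exc
        )
        return Result(vector=[0.0] * DIM, model=FALLBACK_MODEL)
    return Result(vector=list(vector), model=model)


class FailingClient:
    """Stand-in for a provider without an embeddings endpoint."""

    async def embed(self, text, *, model):
        raise NotImplementedError("provider does not expose embeddings")


class CannedClient:
    """Stand-in for a provider that returns a fixed vector."""

    async def embed(self, text, *, model):
        return [0.1, 0.2, 0.3, 0.4]


failed = asyncio.run(embed_or_fallback(FailingClient(), "hello", "some-model"))
ok = asyncio.run(embed_or_fallback(CannedClient(), "hello", "some-model"))
```

Tagging the fallback result with a sentinel model name lets callers detect the degraded path and skip indexing, which is the behaviour the docstrings above describe.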
@@ -233,3 +233,91 @@ def test_list_active_events_filters_to_planned_and_active(tmp_path):
        cancelled = list_events_in_status(conn, "chat_bot_a", "cancelled")
        assert [e["event_id"] for e in cancelled] == ["evt_canx"]
 def test_event_status_reverted_returns_to_prior_status(tmp_path):
    """T114.2: ``event_status_reverted`` rolls a row back to ``prior_status``.
    Unlike the forward transitions, this projector handler is
    unconditional — its sole purpose is to undo a transition, including
    reverting from a terminal status (completed/cancelled) back to a
    non-terminal one.
    Three round-trips covered:
      - completed → active (rollback of an event_completed)
      - active → planned (rollback of an event_started)
      - cancelled → active (rollback of an event_cancelled)
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        _seed_chat(conn)
        append_event(
            conn,
            kind="event_planned",
            payload={
                "event_id": "evt_revert",
                "chat_id": "chat_bot_a",
                "kind": "date_at_park",
                "props": {},
                "planned_for": "2026-04-30T18:00:00+00:00",
            },
        )
        append_event(
            conn,
            kind="event_started",
            payload={
                "event_id": "evt_revert",
                "started_at": "2026-04-30T18:01:00+00:00",
            },
        )
        append_event(
            conn,
            kind="event_completed",
            payload={
                "event_id": "evt_revert",
                "completed_at": "2026-04-30T20:00:00+00:00",
            },
        )
        project(conn)
        ev = get_event(conn, "evt_revert")
        assert ev is not None
        assert ev["status"] == "completed"
        # Revert from completed → active.
        append_and_apply(
            conn,
            kind="event_status_reverted",
            payload={"event_id": "evt_revert", "prior_status": "active"},
        )
        ev = get_event(conn, "evt_revert")
        assert ev["status"] == "active"
        # Revert from active → planned.
        append_and_apply(
            conn,
            kind="event_status_reverted",
            payload={"event_id": "evt_revert", "prior_status": "planned"},
        )
        ev = get_event(conn, "evt_revert")
        assert ev["status"] == "planned"
        # Forward to cancelled, then revert from cancelled → active.
        append_and_apply(
            conn,
            kind="event_cancelled",
            payload={
                "event_id": "evt_revert",
                "completed_at": "2026-04-30T20:30:00+00:00",
            },
        )
        ev = get_event(conn, "evt_revert")
        assert ev["status"] == "cancelled"
        append_and_apply(
            conn,
            kind="event_status_reverted",
            payload={"event_id": "evt_revert", "prior_status": "active"},
        )
        ev = get_event(conn, "evt_revert")
        assert ev["status"] == "active"
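Reviewer note: the docstring above contrasts guarded forward transitions with an unconditional revert. The sketch below illustrates that contrast on a plain status string. The guard shown (terminal rows ignore forward transitions) is an assumption about how the forward handlers behave, and `apply_forward` / `apply_revert` are hypothetical names, not the projector's handlers.

```python
# Assumed terminal statuses, matching the ones named in the docstring.
TERMINAL = {"completed", "cancelled"}


def apply_forward(status, new_status):
    """Guarded forward transition: a terminal row is left unchanged (assumed)."""
    return status if status in TERMINAL else new_status


def apply_revert(status, prior_status):
    """Unconditional rollback: always restore prior_status."""
    return prior_status


status = "planned"
status = apply_forward(status, "active")
status = apply_forward(status, "completed")

blocked = apply_forward(status, "active")   # guard keeps the terminal status
reverted = apply_revert(status, "active")   # revert ignores the guard
```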
@@ -0,0 +1,32 @@
 """Tests for FeatherlessClient (Phase 4.5+).
 Phase 4.5 adds an ``embed()`` method to the LLMClient Protocol (T112).
 Featherless does not expose an OpenAI-compatible ``/v1/embeddings``
 endpoint, so its implementation deliberately raises
 ``NotImplementedError`` to surface the gap clearly. The
 ``generate_embedding`` wrapper catches this and degrades to the
 zero-vector fallback (the existing T107 warning path).
 If/when Featherless ships embeddings, swap the body for a real call to
 ``/v1/embeddings`` and update this test to mock the HTTP layer.
 """
 from __future__ import annotations
 import pytest
 from chat.llm.featherless import FeatherlessClient
@pytest.mark.asyncio
 async def test_featherless_embed_raises_not_implemented():
    """Featherless does not expose ``/v1/embeddings`` — embed() must
    raise ``NotImplementedError`` so callers (``generate_embedding``)
    can degrade to the fallback zero vector + warning rather than
    silently producing useless output."""
    client = FeatherlessClient(api_key="test-key")
    with pytest.raises(NotImplementedError) as excinfo:
        await client.embed("hello world", model="bge-small-en-v1.5")
    # Message should hint at the cause so operators see why their
    # real-model swap fell back.
    assert "embeddings" in str(excinfo.value).lower()
@@ -0,0 +1,140 @@
 """Sanity tests for :mod:`tests.fixtures` — the structured CannedQueue
 builder for ``MockLLMClient`` (T116).
 The builder is a thin shaping layer over JSON dicts; these tests pin
 the JSON shapes and the ``MockLLMClient`` round-trip so nothing
 silently regresses if a default field name or shape gets renamed.
 """
 from __future__ import annotations
 import json
 import pytest
 from chat.llm.mock import MockLLMClient
 from tests.fixtures import CannedQueue
 def test_canned_queue_build_emits_expected_shapes():
    """Each builder method emits the JSON shape its classifier consumer
    expects. The narrative slot is a bare string (stream).
    """
    canned = (
        CannedQueue()
            .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
            .detect_addressee(addressee_id="bot_a", reason="host")
            .narrative("Hi there.")
            .state_update()
            .state_update(affinity_delta=1, trust_delta=2)
            .detect_interjection(should_interject=False, reason="calm")
            .detect_event_transitions(
                [{"event_id": "evt_1", "new_status": "active", "reason": "they arrived"}]
            )
            .detect_scene_close(should_close=False, reason="no signal")
            .summarize_scene_pov(summary="BotA noticed the day winding down.")
            .detect_threads(
                [
                    {
                        "action": "open",
                        "title": "Maya's job hunt",
                        "summary": "Maya is looking for a new job",
                        "existing_thread_id": None,
                    }
                ]
            )
            .build()
    )
    # All slots are strings (the MockLLMClient pops strings).
    assert all(isinstance(slot, str) for slot in canned)
    assert len(canned) == 10
    # Slot 0: parse_turn — defaults intent="narrative".
    parse = json.loads(canned[0])
    assert parse["segments"] == [{"kind": "dialogue", "text": "hello"}]
    assert parse["intent"] == "narrative"
    assert parse["landing_state_hint"] == ""
    # Slot 1: detect_addressee.
    addr = json.loads(canned[1])
    assert addr["addressee_id"] == "bot_a"
    assert addr["confidence"] == "medium"
    assert addr["reason"] == "host"
    # Slot 2: narrative — bare string, NOT JSON.
    assert canned[2] == "Hi there."
    with pytest.raises(json.JSONDecodeError):
        json.loads(canned[2])
    # Slot 3: state_update with all defaults — zero deltas, no facts.
    su0 = json.loads(canned[3])
    assert su0 == {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
    # Slot 4: state_update with custom deltas.
    su1 = json.loads(canned[4])
    assert su1["affinity_delta"] == 1
    assert su1["trust_delta"] == 2
    assert su1["knowledge_facts"] == []
    # Slot 5: detect_interjection.
    interj = json.loads(canned[5])
    assert interj == {"should_interject": False, "reason": "calm"}
    # Slot 6: detect_event_transitions.
    transitions = json.loads(canned[6])
    assert transitions["transitions"][0]["event_id"] == "evt_1"
    assert transitions["transitions"][0]["new_status"] == "active"
    # Slot 7: detect_scene_close.
    close = json.loads(canned[7])
    assert close == {"should_close": False, "reason": "no signal"}
    # Slot 8: summarize_scene_pov.
    pov = json.loads(canned[8])
    assert pov["summary"] == "BotA noticed the day winding down."
    assert pov["knowledge_facts"] == []
    assert pov["relationship_summary"] == ""
    # Slot 9: detect_threads.
    threads = json.loads(canned[9])
    assert threads["candidates"][0]["action"] == "open"
    assert threads["candidates"][0]["title"] == "Maya's job hunt"
@pytest.mark.asyncio
 async def test_canned_queue_round_trips_through_mock_llm_client():
    """Building a queue and feeding it to ``MockLLMClient`` produces the
    same items back via ``generate`` (in order). This is the contract
    every migrated test relies on.
    """
    canned = (
        CannedQueue()
            .parse_turn(segments=[{"kind": "dialogue", "text": "hi"}])
            .narrative("Hello back.")
            .state_update()
            .build()
    )
    mock = MockLLMClient(canned=canned)
    # generate() pops from the front.
    parse_str = await mock.generate([], model="x")
    assert json.loads(parse_str)["segments"] == [
        {"kind": "dialogue", "text": "hi"}
    ]
    # The narrative slot is a raw string — generate returns it as-is.
    narr_str = await mock.generate([], model="x")
    assert narr_str == "Hello back."
    # The state_update slot has zero-delta defaults.
    su_str = await mock.generate([], model="x")
    assert json.loads(su_str) == {
        "affinity_delta": 0,
        "trust_delta": 0,
        "knowledge_facts": [],
    }
    # Queue fully drained.
    with pytest.raises(IndexError):
        await mock.generate([], model="x")
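Reviewer note: the fixture tests pin a fluent builder whose methods each append one string slot and return the builder. A minimal sketch of that shape is below, covering only two of the slot kinds. `QueueSketch` is a hypothetical stand-in and is not the `tests.fixtures.CannedQueue` implementation; only the default field names are taken from the assertions above.

```python
import json


class QueueSketch:
    """Hypothetical fluent builder: each method appends one string slot."""

    def __init__(self):
        self._slots = []

    def narrative(self, text):
        # Narrative slots are bare strings, not JSON.
        self._slots.append(text)
        return self

    def state_update(self, affinity_delta=0, trust_delta=0, knowledge_facts=None):
        # Classifier slots are JSON-encoded dicts with zero-delta defaults.
        self._slots.append(
            json.dumps(
                {
                    "affinity_delta": affinity_delta,
                    "trust_delta": trust_delta,
                    "knowledge_facts": list(knowledge_facts or []),
                }
            )
        )
        return self

    def build(self):
        # Return a copy so later builder calls don't mutate a built queue.
        return list(self._slots)


canned = QueueSketch().narrative("Hi there.").state_update(affinity_delta=1).build()
```

Returning `self` from each method is what allows the chained call style used in the tests, and keeping every slot a string matches the "MockLLMClient pops strings" contract they assert.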
@@ -19,3 +19,28 @@ async def test_mock_streams_tokens():
    async for chunk in client.stream(msgs, model="any"):
        chunks.append(chunk)
    assert "".join(chunks) == "abcd"
@pytest.mark.asyncio
 async def test_mock_llm_client_embed_pops_canned():
    """T112: MockLLMClient.embed() pops a canned vector from the front
    of ``canned_embeddings`` (mirrors the existing ``canned`` queue
    pattern for generate/stream)."""
    v1 = [0.1, 0.2, 0.3]
    v2 = [0.4, 0.5, 0.6]
    client = MockLLMClient(canned=[], canned_embeddings=[v1, v2])
    out1 = await client.embed("first", model="bge-small-en-v1.5")
    out2 = await client.embed("second", model="bge-small-en-v1.5")
    assert out1 == v1
    assert out2 == v2
@pytest.mark.asyncio
 async def test_mock_llm_client_embed_empty_queue_raises():
    """When the canned_embeddings queue is empty, ``embed`` must raise
    a clear failure (IndexError) so misconfigured tests don't silently
    return None or hang."""
    client = MockLLMClient(canned=[])
    with pytest.raises(IndexError):
        await client.embed("text", model="any")
@@ -586,3 +586,59 @@ def test_record_turn_memory_enqueues_embedding_job(tmp_path):
    assert {job.memory_id for job in captured} == expected_ids
    for job in captured:
        assert job.text == "Both bots witness this beat."
 # ---------------------------------------------------------------------------
 # T109: memories.event_id deep-link column populated by the projector.
 # ---------------------------------------------------------------------------
 def test_memory_written_populates_event_id(tmp_path):
    """Schema 0014 added ``memories.event_id`` referencing ``event_log.id``.
    The ``memory_written`` projector handler must populate the column with
    the projecting event's id so T111 can deep-link cross-chat search hits
    back to the originating turn.
    """
    db = tmp_path / "t.db"
    apply_migrations(db)
    _seed_minimal(db)
    with open_db(db) as conn:
        result = record_turn_memory_for_present(
            conn,
            chat_id="chat_bot_a",
            host_bot_id="bot_a",
            guest_bot_id=None,
            narrative_text="BotA shrugs.",
        )
        eid, mid = result["bot_a"]
        assert eid > 0 and mid is not None
        row = conn.execute(
            "SELECT event_id FROM memories WHERE id = ?", (mid,)
        ).fetchone()
        assert row is not None
        assert row[0] == eid
 def test_memory_event_id_column_is_nullable_for_backfill(tmp_path):
    """Backward compat: the ``event_id`` column is nullable so historical
    memory rows projected before 0014 ran (or rows synthesised by tests
    that bypass the projector) don't break the schema. A direct INSERT
    omitting the column must succeed and read back NULL."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    _seed_minimal(db)
    with open_db(db) as conn:
        conn.execute(
            "INSERT INTO memories ("
            "owner_id, chat_id, pov_summary, "
            "witness_you, witness_host, witness_guest"
            ") VALUES (?, ?, ?, ?, ?, ?)",
            ("bot_a", "chat_bot_a", "legacy row", 1, 1, 0),
        )
        row = conn.execute(
            "SELECT event_id FROM memories WHERE pov_summary = 'legacy row'"
        ).fetchone()
        assert row is not None
        assert row[0] is None
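Reviewer note: the two T109 tests rely on a nullable column added after rows already exist. The sketch below shows that behaviour in plain SQLite. The two-column tables are simplified, hypothetical stand-ins for the real schema, and the `ALTER TABLE` statement is an illustration rather than the text of migration 0014.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-ins for the real tables.
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT)")

# A row written before the column exists.
conn.execute("INSERT INTO memories (pov_summary) VALUES ('legacy row')")

# Adding a nullable column leaves existing rows with NULL in it.
conn.execute(
    "ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id)"
)

# A row written afterwards can carry the originating event's id.
cur = conn.execute("INSERT INTO event_log (kind) VALUES ('memory_written')")
event_id = cur.lastrowid
conn.execute(
    "INSERT INTO memories (pov_summary, event_id) VALUES (?, ?)",
    ("new row", event_id),
)

rows = dict(conn.execute("SELECT pov_summary, event_id FROM memories").fetchall())
```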
@@ -0,0 +1,767 @@
 """Phase 4.5 cross-feature integration tests (T117).
 End-to-end multi-feature flows specific to the Phase 4.5 changes
 (T103-T114). Mirrors :mod:`tests.test_phase4_integration` in shape:
 each test drives multiple Phase 4.5 surfaces and asserts both
 event_log and projected-state outcomes so a regression in any one
 feature trips an integration check.
 Test inventory:
 1. ``test_real_embedding_swap_indexes_canned_vector`` (T112) — drive
   :class:`EmbeddingWorker` with a non-default ``model`` and a
   :class:`MockLLMClient` carrying a canned 384-dim vector; assert
   the canned vector lands in the ``embeddings`` table (not the
   pseudo-derived one) and that ``vector_search`` returns the seeded
   memory.
 2. ``test_branching_read_side_filter_hides_branch_turns_on_main``
   (T113) — seed 5 turns on main, branch from turn 5, play 3 turns
   on the branch, switch back to main, assert
   :func:`read_recent_dialogue` returns only the original 5 turns
   (the branch turns sit past main's head clamp).
 3. ``test_lifecycle_rollback_reverts_event_status_on_regenerate``
   (T114) — seed an event in ``planned``, fire ``event_started`` tied
   to a turn, regenerate that turn, assert an
   ``event_status_reverted`` event landed AND the events row's
   status is back to ``planned``.
 4. ``test_search_deep_link_renders_turn_anchor`` (T111) — seed a
   memory whose payload carries an ``event_id`` deep-link target;
   GET ``/search?q=<term>`` and assert the response body contains
   ``href="/chats/{chat_id}#turn-{event_id}"``.
 5. ``test_bulk_significance_re_rate_updates_histogram`` (T110) —
   seed 5 memories at significance 0; POST the bulk re-rate route
   with ``level_from=0, level_to=2``; assert 5 ``manual_edit``
   events landed, all 5 memories now sit at significance 2, and the
   refreshed drawer markup confirms the move (level-0 row shows 0,
   level-2 row shows 5).
 """
 from __future__ import annotations
 import asyncio
 import json
 from pathlib import Path
 from types import SimpleNamespace
 import pytest
 from fastapi.testclient import TestClient
 from chat.app import app
 from chat.db.connection import open_db
 from chat.db.migrate import apply_migrations
 from chat.eventlog.log import append_and_apply, append_event
 from chat.eventlog.projector import project
 from chat.llm.mock import MockLLMClient
 # Trigger projector handler registration. Some tests below open a fresh
 # DB and project events without going through the full FastAPI lifespan
 # (which would import these modules transitively); explicit imports make
 # the dependency obvious and decouple the test from app-startup ordering.
 import chat.state.branches  # noqa: F401
 import chat.state.embeddings  # noqa: F401
 import chat.state.entities  # noqa: F401
 import chat.state.events  # noqa: F401
 import chat.state.manual_edit  # noqa: F401
 import chat.state.memory  # noqa: F401
 import chat.state.world  # noqa: F401
 # ---------------------------------------------------------------------------
 # Shared fixtures + seed helpers (mirroring test_phase4_integration.py).
 # ---------------------------------------------------------------------------
 @pytest.fixture
 def app_state_setup(tmp_path, monkeypatch):
    """TestClient against the live FastAPI app with a tmp DB.
    Identical shape to :mod:`tests.test_phase4_integration` so the
    Phase 4.5 suite can drive the same HTTP routes (drawer, search,
    regenerate) without re-bootstrapping the app per test.
    """
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    db = tmp_path / "test.db"
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    with TestClient(app) as c:
        # Disable the canned-response background worker so the only
        # consumer of MockLLMClient queues is the request path we drive.
        app.state.background_worker.enabled = False
        yield c
    app.dependency_overrides.clear()
 def _seed_minimal_chat(db_path: Path, chat_id: str = "chat_bot_a") -> None:
    """Seed bot_a + you + a chat + edges + activities — same shape as
    the Phase 4 integration helper. Uses ``append_and_apply`` so
    successive calls don't re-project the cumulative log.
    """
    with open_db(db_path) as conn:
        existing_bot = conn.execute(
            "SELECT 1 FROM bots WHERE id = 'bot_a'"
        ).fetchone()
        if existing_bot is None:
            append_and_apply(
                conn,
                kind="bot_authored",
                payload={
                    "id": "bot_a",
                    "name": "BotA",
                    "persona": "thoughtful",
                    "voice_samples": [],
                    "traits": [],
                    "backstory": "",
                    "initial_relationship_to_you": "",
                    "kickoff_prose": "...",
                },
            )
            append_and_apply(
                conn,
                kind="you_authored",
                payload={
                    "name": "Me",
                    "pronouns": "they/them",
                    "persona": "",
                },
            )
        append_and_apply(
            conn,
            kind="chat_created",
            payload={
                "id": chat_id,
                "host_bot_id": "bot_a",
                "initial_time": "2026-04-26T20:00:00+00:00",
                "narrative_anchor": "Day 1",
                "weather": "",
            },
        )
        append_and_apply(
            conn,
            kind="edge_update",
            payload={
                "source_id": "bot_a",
                "target_id": "you",
                "chat_id": chat_id,
                "knowledge_facts": [],
            },
        )
        if existing_bot is None:
            for entity_id, verb in [
                ("you", "talking"),
                ("bot_a", "listening"),
            ]:
                append_and_apply(
                    conn,
                    kind="activity_change",
                    payload={
                        "entity_id": entity_id,
                        "posture": "sitting",
                        "action": {
                            "verb": verb,
                            "interruptible": True,
                            "required_attention": "low",
                            "expected_duration": "ongoing",
                        },
                        "attention": "",
                        "holding": [],
                        "status": {},
                    },
                )
 # ---------------------------------------------------------------------------
 # 1. Real embedding swap (T112) — non-default model routes through
 #    ``client.embed`` and the canned vector lands in the embeddings table.
 # ---------------------------------------------------------------------------
 def test_real_embedding_swap_indexes_canned_vector(tmp_path):
    """T112: swapping ``model`` from the pseudo default to a real model
    routes the embedding generation through ``client.embed`` instead of
    the local hash-derived path.
    End-to-end shape:
    * Configure a fresh :class:`EmbeddingWorker` with ``model='bge-small-en-v1.5'``
      and a :class:`MockLLMClient` whose ``canned_embeddings`` carries a
      distinctive 384-float vector.
    * Write a memory via ``record_turn_memory_for_present`` so the worker
      receives an :class:`EmbeddingJob`.
    * Drain the worker (sentinel-based stop).
    * Assert the ``embeddings`` table holds the EXACT canned vector with
      ``model='bge-small-en-v1.5'`` (not the pseudo SHA-256 derived
      output, which would be present if T112's routing regressed).
    * Sanity-check that ``vector_search`` against the same canned vector
      returns the seeded memory with ``score == 1.0`` (cosine self-match).
    Why no FastAPI lifespan: the live ``app.state.embedding_worker`` was
    created in the lifespan event loop; awaiting on its queue from a
    different loop (such as the one ``asyncio.run`` creates below) trips
    ``"got Future attached to a different loop"``. Mirrors the pattern in
    ``tests/test_phase4_integration.py::test_vector_retrieval_feedback_loop``.
    """
    from chat.services.embedding_worker import EmbeddingWorker
    from chat.services.memory_write import record_turn_memory_for_present
    from chat.services.vector_search import vector_search
    db = tmp_path / "test.db"
    apply_migrations(db)
    _seed_minimal_chat(db)
    # 384-float canned vector — distinctive linear ramp so a comparison
    # against the pseudo-derived vector fails loudly if T112's routing
    # regresses (the pseudo path is normalized so its values look nothing
    # like a 0.000..0.383 ramp).
    canned_vector = [i / 1000.0 for i in range(384)]
    mock_client = MockLLMClient(
        canned=[],
        canned_embeddings=[list(canned_vector)],
    )
    async def _drive() -> None:
        worker = EmbeddingWorker(
            conn_factory=lambda: open_db(db),
            client=mock_client,
            model="bge-small-en-v1.5",  # T112: non-default routes via embed()
            dim=384,
        )
        await worker.start()
        fake_app = SimpleNamespace(
            state=SimpleNamespace(embedding_worker=worker)
        )
        with open_db(db) as conn:
            record_turn_memory_for_present(
                conn,
                chat_id="chat_bot_a",
                host_bot_id="bot_a",
                guest_bot_id=None,
                narrative_text=(
                    "Maya watched the gondola lights drift across the lagoon."
                ),
                app=fake_app,
            )
        await worker.stop()
    asyncio.run(_drive())
    with open_db(db) as conn:
        emb_rows = conn.execute(
            "SELECT memory_id, vector_json, model, dim FROM embeddings"
        ).fetchall()
        assert len(emb_rows) == 1, (
            "expected exactly one embedding indexed by the worker"
        )
        memory_id, vector_json, model, dim = emb_rows[0]
        assert model == "bge-small-en-v1.5", (
            f"expected non-default model tag, got {model!r}"
        )
        assert dim == 384
        stored_vector = json.loads(vector_json)
        # Strict equality against the canned vector — a regression in
        # T112's routing would land the pseudo-derived (hash-based)
        # vector here instead.
        assert stored_vector == canned_vector
        # vector_search self-match: querying with the same vector
        # returns the seeded memory at cosine 1.0.
        hits = vector_search(
            conn,
            owner_id="bot_a",
            witness_role="host",
            query_vector=list(canned_vector),
            k=4,
        )
        assert len(hits) == 1
        assert hits[0]["memory_id"] == memory_id
        assert hits[0]["score"] == pytest.approx(1.0, abs=1e-9)
 # ---------------------------------------------------------------------------
 # 2. Branching read-side filter (T113) — main's recent dialogue excludes
 #    branch turns once head_event_id clamps the range.
 # ---------------------------------------------------------------------------
 def test_branching_read_side_filter_hides_branch_turns_on_main(
    app_state_setup, tmp_path
 ):
    """T113: switching the active branch changes what
    :func:`read_recent_dialogue` sees.
    Setup:
    * Seed 5 turns on main. Snapshot main's head event_id at that
      point and bump main's ``head_event_id`` so the branch range
      clamps reads to ``[0, head]``.
    * Branch from turn 5; switch to the experiment branch; play 3
      turns on it.
    * Switch back to main.
    Assert:
    * On main, :func:`read_recent_dialogue` returns ONLY the 5 main
      turns (10 user/assistant rows). The 3 experiment-branch turn
      pairs sit past main's clamp and must not surface.
    * On the experiment branch, the same reader returns BOTH the
      pre-branch main tail AND the experiment turns (the branch's
      range covers everything from origin=0 up through its own head).
    Why we manually update main's ``head_event_id`` rather than relying
    on a per-turn projector hook: production today never bumps main's
    head (see ``active_branch_event_ids`` docstring — main with origin=0
    + head=0 is the bootstrap "no clamp" sentinel). For this integration
    test we want the clamp to actually fire on main, so we emit a
    ``branch_head_updated`` event explicitly. This mirrors what a
    future "main head tracker" would do.
    """
    from chat.services.branching import (
        branch_from_event,
        switch_active_branch,
    )
    from chat.services.turn_common import read_recent_dialogue
    from chat.state.branches import active_branch
    db = tmp_path / "test.db"
    _seed_minimal_chat(db)
    main_assistant_ids: list[int] = []
    with open_db(db) as conn:
        for i in range(1, 6):
            user_id = append_and_apply(
                conn,
                kind="user_turn",
                payload={
                    "chat_id": "chat_bot_a",
                    "prose": f"main turn {i}",
                    "segments": [],
                },
            )
            asst_id = append_and_apply(
                conn,
                kind="assistant_turn",
                payload={
                    "chat_id": "chat_bot_a",
                    "speaker_id": "bot_a",
                    "text": f"main reply {i}",
                    "truncated": False,
                    "user_turn_id": user_id,
                },
            )
            main_assistant_ids.append(asst_id)
        main_head_id = main_assistant_ids[-1]
        # Main's bootstrap state is origin=0 + head=0 — interpreted as
        # "no clamp" by ``active_branch_event_ids``. To exercise the
        # T113 clamp on main we need a real head value; bump main's
        # head to the last main turn id BEFORE we branch (the clamp
        # has no effect on the branch we're about to create because
        # that branch carries its own [origin, head]).
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={"name": "main", "head_event_id": main_head_id},
        )
        # Fork point: turn 5's assistant_turn id.
        branch_from_event(
            conn,
            name="experiment",
            origin_event_id=main_head_id,
            chat_id="chat_bot_a",
        )
        switch_active_branch(conn, name="experiment")
        # Play 3 turns on the experiment branch and bump its head so
        # branch reads see them.
        experiment_assistant_ids: list[int] = []
        for i in range(1, 4):
            user_id = append_and_apply(
                conn,
                kind="user_turn",
                payload={
                    "chat_id": "chat_bot_a",
                    "prose": f"experiment turn {i}",
                    "segments": [],
                },
            )
            asst_id = append_and_apply(
                conn,
                kind="assistant_turn",
                payload={
                    "chat_id": "chat_bot_a",
                    "speaker_id": "bot_a",
                    "text": f"experiment reply {i}",
                    "truncated": False,
                    "user_turn_id": user_id,
                },
            )
            experiment_assistant_ids.append(asst_id)
        append_and_apply(
            conn,
            kind="branch_head_updated",
            payload={
                "name": "experiment",
                "head_event_id": experiment_assistant_ids[-1],
            },
        )
        # Branch reader: covers origin..head, so it sees BOTH main's
        # pre-fork tail and the experiment turns.
        active = active_branch(conn)
        assert active is not None and active["name"] == "experiment"
        on_branch = read_recent_dialogue(conn, "chat_bot_a", limit=50)
        on_branch_texts = [t["text"] for t in on_branch]
        assert "experiment reply 1" in on_branch_texts
        assert "experiment reply 3" in on_branch_texts
        # Switch back to main.
        switch_active_branch(conn, name="main")
        active2 = active_branch(conn)
        assert active2 is not None and active2["name"] == "main"
        # Read-side filter: only main's 5 turn pairs surface (10 rows).
        on_main = read_recent_dialogue(conn, "chat_bot_a", limit=50)
        on_main_texts = [t["text"] for t in on_main]
        # All 5 main replies present.
        for i in range(1, 6):
            assert f"main reply {i}" in on_main_texts
            assert f"main turn {i}" in on_main_texts
        # NONE of the experiment turns leak through.
        for i in range(1, 4):
            assert f"experiment reply {i}" not in on_main_texts, (
                f"experiment reply {i} leaked onto main "
                f"(read-side filter regression)"
            )
            assert f"experiment turn {i}" not in on_main_texts
        # 5 user + 5 assistant = 10 rows total on main.
        assert len(on_main) == 10
 # ---------------------------------------------------------------------------
 # 3. Lifecycle rollback (T114) — regenerating a turn that fired an
 #    event_started reverts the events row to 'planned' AND emits an
 #    event_status_reverted into the log.
 # ---------------------------------------------------------------------------
 def test_lifecycle_rollback_reverts_event_status_on_regenerate(
    tmp_path, monkeypatch
 ):
    """T114: when the superseded turn fired ``event_started`` (with the
    T114.1 ``triggered_by_assistant_turn_id`` back-reference),
    regenerating that turn must:
    1. Append an ``event_status_reverted`` event with ``prior_status='planned'``.
    2. Project the events row's status back to ``planned``.
    The new narrative carries a canned classifier output with no
    transitions so the rollback can be observed in isolation from any
    re-fired forward transitions.
    Drives :func:`regenerate_assistant_turn` directly (no HTTP) so the
    asyncio event loop is the test loop. Mirrors the unit-test
    pattern in :mod:`tests.test_regenerate`.
    """
    from chat.config import Settings
    from chat.services.regenerate import regenerate_assistant_turn
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    db = tmp_path / "test.db"
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    apply_migrations(db)
    _seed_minimal_chat(db)
    # Append a single user_turn / assistant_turn pair the regenerate
    # call will operate on.
    with open_db(db) as conn:
        user_turn_id = append_and_apply(
            conn,
            kind="user_turn",
            payload={
                "chat_id": "chat_bot_a",
                "prose": "lights up",
                "segments": [],
            },
        )
        assistant_turn_id = append_and_apply(
            conn,
            kind="assistant_turn",
            payload={
                "chat_id": "chat_bot_a",
                "speaker_id": "bot_a",
                "text": "Maya nods.",
                "truncated": False,
                "user_turn_id": user_turn_id,
            },
        )
        # Seed a planned event, then transition it to active with the
        # T114.1 back-reference pointing at the assistant_turn we'll
        # regenerate.
        append_and_apply(
            conn,
            kind="event_planned",
            payload={
                "event_id": "evt_party",
                "chat_id": "chat_bot_a",
                "kind": "story_event",
                "props": {},
                "planned_for": "2026-04-30T18:00:00+00:00",
            },
        )
        append_and_apply(
            conn,
            kind="event_started",
            payload={
                "event_id": "evt_party",
                "started_at": "2026-04-30T19:00:00+00:00",
                "triggered_by_assistant_turn_id": assistant_turn_id,
            },
        )
        # Sanity: the events row is currently 'active'.
        status_before = conn.execute(
            "SELECT status FROM events WHERE event_id = ?",
            ("evt_party",),
        ).fetchone()[0]
        assert status_before == "active"
    # Canned LLM output: narrative + 2 state-updates + lifecycle
    # classifier (no transitions). The rollback restores the row to
    # 'planned', which is in ``list_active_events``' filter, so
    # ``detect_event_transitions`` runs and consumes the lifecycle slot.
    state_canned = json.dumps(
        {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
    )
    no_transitions = json.dumps({"transitions": []})
    mock_client = MockLLMClient(
        canned=[
            "Maya gestures.",  # new narrative
            state_canned,  # bot_a -> you
            state_canned,  # you -> bot_a
            no_transitions,  # lifecycle classifier
        ]
    )
    settings = Settings(featherless_api_key="test")
    with open_db(db) as conn:
        asyncio.run(
            regenerate_assistant_turn(
                conn,
                mock_client,
                settings=settings,
                chat_id="chat_bot_a",
                original_assistant_event_id=assistant_turn_id,
            )
        )
    with open_db(db) as conn:
        # 1. The event_status_reverted event lands with prior_status='planned'.
        rev_rows = conn.execute(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'event_status_reverted' ORDER BY id"
        ).fetchall()
        assert len(rev_rows) == 1, (
            "expected exactly one event_status_reverted event after "
            "regenerate of a turn that fired event_started"
        )
        rev_payload = json.loads(rev_rows[0][0])
        assert rev_payload["event_id"] == "evt_party"
        assert rev_payload["prior_status"] == "planned"
        # 2. The events row is back to 'planned' (rolled back from 'active').
        status_after = conn.execute(
            "SELECT status FROM events WHERE event_id = ?",
            ("evt_party",),
        ).fetchone()[0]
        assert status_after == "planned"
 # ---------------------------------------------------------------------------
 # 4. Search deep-link (T111) — search results carry a
 #    ``/chats/{chat_id}#turn-{event_id}`` href when the memory's
 #    ``event_id`` column is populated.
 # ---------------------------------------------------------------------------
 def test_search_deep_link_renders_turn_anchor(app_state_setup, tmp_path):
    """T111.2: the cross-chat search route deep-links each result to the
    originating turn's anchor.
    Cross-feature: T109 added ``memories.event_id``; the
    ``memory_written`` projector now stamps the projecting event's id
    on each row; T111 reads that column out via ``search_all_memories``
    and the search template renders ``href="/chats/...#turn-..."``.
    Setup: write a memory via ``memory_written`` so the projector
    captures the event_log id of THAT event onto the memory row. Then
    GET ``/search?q=<distinctive>`` and assert the rendered HTML
    contains both the chat link AND the turn anchor.
    """
    db = tmp_path / "test.db"
    _seed_minimal_chat(db)
    distinctive = "wisteriablossom"
    with open_db(db) as conn:
        memory_event_id = append_and_apply(
            conn,
            kind="memory_written",
            payload={
                "owner_id": "bot_a",
                "chat_id": "chat_bot_a",
                "pov_summary": (
                    f"the {distinctive} bloomed by the gate"
                ),
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 0,
                "source": "direct",
                "reliability": 1.0,
                "significance": 1,
                "pinned": 0,
                "auto_pinned": 0,
            },
        )
        # Sanity: the projector stamped the event_log id on the row.
        stored_event_id = conn.execute(
            "SELECT event_id FROM memories WHERE chat_id = ? "
            "AND pov_summary LIKE ?",
            ("chat_bot_a", f"%{distinctive}%"),
        ).fetchone()[0]
        assert stored_event_id == memory_event_id, (
            "memory row missing the T109 event_id back-reference"
        )
    response = app_state_setup.get(f"/search?q={distinctive}")
    assert response.status_code == 200
    body = response.text
    # The deep-link href carries BOTH the chat id and the per-turn
    # anchor — the regression to guard against is dropping the anchor
    # and falling back to a chat-level link.
    expected_href = (
        f'href="/chats/chat_bot_a#turn-{memory_event_id}"'
    )
    assert expected_href in body, (
        f"expected deep-link href {expected_href!r} in search response; "
        f"body contained: {body!r}"
    )
 # ---------------------------------------------------------------------------
 # 5. Bulk significance re-rate (T110.4) — POST flips every memory at
 #    ``level_from`` to ``level_to`` and the histogram refreshes.
 # ---------------------------------------------------------------------------
 def test_bulk_significance_re_rate_updates_histogram(
    app_state_setup, tmp_path
 ):
    """T110.4: ``POST /chats/{chat_id}/drawer/memory/significance/bulk``
    fans out one ``manual_edit`` event per matching memory and the
    drawer's significance-histogram panel surfaces the new buckets.
    Setup: seed 5 memories at significance=0 in the same chat. Sanity-
    check the baseline histogram (level 0 = 5, level 2 = 0).
    Action: POST ``level_from=0, level_to=2``.
    Assert:
    * Response 200 (the route returns the refreshed drawer partial).
    * 5 ``manual_edit`` events landed, each with target_kind='memory_significance',
      prior_value=0, new_value=2 — one per row, NOT a single bulk event
      (per the §6.4 audit-trail design).
    * All 5 memories in the database now sit at significance=2.
    * The route returns a non-empty drawer partial; the per-level counts
      are checked against the database rather than grepped from markup.
    """
    db = tmp_path / "test.db"
    _seed_minimal_chat(db)
    # Seed 5 memories at significance=0.
    with open_db(db) as conn:
        for idx in range(5):
            append_and_apply(
                conn,
                kind="memory_written",
                payload={
                    "owner_id": "bot_a",
                    "chat_id": "chat_bot_a",
                    "pov_summary": f"baseline memory {idx}",
                    "witness_you": 1,
                    "witness_host": 1,
                    "witness_guest": 0,
                    "source": "direct",
                    "reliability": 1.0,
                    "significance": 0,  # all start at 0 for the bulk move.
                    "pinned": 0,
                    "auto_pinned": 0,
                },
            )
        # Sanity: 5 rows at level 0 going in.
        baseline = conn.execute(
            "SELECT significance, COUNT(*) FROM memories "
            "WHERE chat_id = ? GROUP BY significance",
            ("chat_bot_a",),
        ).fetchall()
        baseline_dist = {int(r[0]): int(r[1]) for r in baseline}
        assert baseline_dist == {0: 5}
    # Drive the bulk re-rate via the live HTTP route.
    response = app_state_setup.post(
        "/chats/chat_bot_a/drawer/memory/significance/bulk",
        data={"level_from": "0", "level_to": "2"},
    )
    assert response.status_code == 200
    body = response.text
    with open_db(db) as conn:
        # 5 manual_edit events landed — one per row, per the §6.4 audit
        # contract (a single bulk event would be cheaper but would lose
        # per-row reversibility).
        edit_rows = conn.execute(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'manual_edit' "
            "  AND json_extract(payload_json, '$.target_kind') = "
            "      'memory_significance' "
            "ORDER BY id"
        ).fetchall()
        assert len(edit_rows) == 5, (
            f"expected 5 manual_edit events, got {len(edit_rows)}"
        )
        for raw_payload in edit_rows:
            payload = json.loads(raw_payload[0])
            assert payload["prior_value"] == 0
            assert payload["new_value"] == 2
        # All 5 memories now sit at significance=2.
        post_dist = {
            int(r[0]): int(r[1])
            for r in conn.execute(
                "SELECT significance, COUNT(*) FROM memories "
                "WHERE chat_id = ? GROUP BY significance",
                ("chat_bot_a",),
            ).fetchall()
        }
        assert post_dist == {2: 5}, (
            f"expected all rows at level 2 after bulk re-rate, got {post_dist}"
        )
    # The refreshed drawer markup carries the histogram values, but we
    # don't grep for ``5`` in isolation (too lax: it can match other
    # numerics on the page). The per-bucket counts are already pinned
    # by the ``post_dist`` query above, so here we only assert that the
    # route returned a non-empty drawer partial. The drawer template
    # surfaces ``significance_distribution`` keys 0..3 unconditionally.
    assert body, "drawer route returned empty body"
@@ -867,12 +867,14 @@ def test_cross_chat_search_surfaces_memories_in_three_chats(
    assert response.status_code == 200
    body = response.text
-    # Each chat_id appears in a result link href, e.g.
+    # Each chat_id appears in a result link href. T111.2 deep-links to
-    # ``href="/chats/chat_bot_a"``. The template renders one
+    # the originating turn so the href is now
-    # ``<a class="search-result-link" href="/chats/{chat_id}">`` per
+    # ``href="/chats/{chat_id}#turn-{event_id}"``; we assert on the
-    # row, so a substring match per chat is sufficient.
+    # ``"/chats/{chat_id}#turn-`` prefix so the per-chat link is
    # uniquely matched (a bare ``"/chats/chat_bot_a`` substring would
    # also match ``chat_bot_a_2`` / ``chat_bot_a_3``).
    for chat_id in chat_ids:
-        assert f'href="/chats/{chat_id}"' in body, (
+        assert f'href="/chats/{chat_id}#turn-' in body, (
            f"chat {chat_id} missing from /search results: {body!r}"
        )
    # The owner display name (BotA) renders for each row — verify >= 3
@@ -888,4 +890,4 @@ def test_cross_chat_search_surfaces_memories_in_three_chats(
    # The "no matches" empty-state copy fires.
    assert "No matches" in distractor_body
    for chat_id in chat_ids:
-        assert f'href="/chats/{chat_id}"' not in distractor_body
+        assert f'href="/chats/{chat_id}#turn-' not in distractor_body
@@ -1022,3 +1022,346 @@ def test_regenerate_registers_task_in_in_flight_tasks(tmp_path, monkeypatch):
    assert isinstance(in_flight_snapshot.get("task"), asyncio.Task)
    # Post-flight: the entry has been cleaned up.
    assert "chat_bot_a" not in _in_flight_tasks
 # ---------------------------------------------------------------------------
 # T114: lifecycle rollback. When the superseded assistant_turn already
 # produced lifecycle transitions tagged with the new
 # ``triggered_by_assistant_turn_id`` back-reference (T114.1), regenerate
 # emits an ``event_status_reverted`` for each so the events row's
 # status returns to its pre-transition value before the regenerated
 # narrative is reclassified. Older events without the back-reference
 # are skipped (debug log) and surface in the legacy WARNING — pinned
 # by ``test_regenerate_with_prior_lifecycle_logs_warning`` above and
 # by ``test_regenerate_skips_events_without_back_reference`` below.
 # ---------------------------------------------------------------------------
 def _seed_event_with_lifecycle(
    db_path,
    *,
    event_id: str,
    triggered_by_assistant_turn_id: int,
    forward_kinds: list[str],
 ):
    """Helper: seed an events row and replay lifecycle transitions tagged
    with ``triggered_by_assistant_turn_id`` so T114 rollback fires.
    ``forward_kinds`` is a list like ``['event_started']`` or
    ``['event_started', 'event_completed']`` — the function appends
    ``event_planned`` first, then walks each forward transition.
    """
    from chat.eventlog.log import append_and_apply
    with open_db(db_path) as conn:
        append_and_apply(
            conn,
            kind="event_planned",
            payload={
                "event_id": event_id,
                "chat_id": "chat_bot_a",
                "kind": "story_event",
                "props": {},
                "planned_for": "2026-04-30T18:00:00+00:00",
            },
        )
        for kind in forward_kinds:
            payload: dict = {
                "event_id": event_id,
                "triggered_by_assistant_turn_id": (
                    triggered_by_assistant_turn_id
                ),
            }
            if kind == "event_started":
                payload["started_at"] = "2026-04-30T19:00:00+00:00"
            else:
                # Any other forward kind (e.g. ``event_completed``) gets a
                # ``completed_at`` timestamp.
                payload["completed_at"] = "2026-04-30T19:30:00+00:00"
            append_and_apply(conn, kind=kind, payload=payload)
 def test_regenerate_rolls_back_event_started_from_superseded_turn(
    tmp_path, monkeypatch
 ):
    """T114.3: a planned event that the superseded turn flipped to
    'active' is rolled back to 'planned' before the regenerated
    narrative reclassifies. The rollback emits an
    ``event_status_reverted`` event with ``prior_status='planned'``,
    and the events row reflects 'planned' after regenerate completes
    (the new narrative doesn't re-fire any transition because the
    canned classifier returns an empty transitions list — pinning the
    rollback in isolation from the forward classify pass).
    """
    import asyncio
    from chat.config import Settings
    from chat.db.migrate import apply_migrations
    from chat.services.regenerate import regenerate_assistant_turn
    db_path = tmp_path / "test.db"
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
    apply_migrations(db_path)
    _ut_id, at_id = _seed_with_one_turn(db_path)
    _seed_event_with_lifecycle(
        db_path,
        event_id="evt_started",
        triggered_by_assistant_turn_id=at_id,
        forward_kinds=["event_started"],
    )
    # Sanity: events row is currently 'active'.
    with open_db(db_path) as conn:
        status = conn.execute(
            "SELECT status FROM events WHERE event_id = ?", ("evt_started",)
        ).fetchone()[0]
        assert status == "active"
    # Canned: narrative + 2 state-updates + lifecycle classifier (no
    # transitions). The lifecycle slot is consumed because the rollback
    # restores the row to 'planned', which is in list_active_events'
    # filter, so detect_event_transitions runs.
    state_canned = json.dumps(
        {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
    )
    no_transitions = json.dumps({"transitions": []})
    mock_client = MockLLMClient(
        canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
    )
    settings = Settings(featherless_api_key="test")
    with open_db(db_path) as conn:
        asyncio.run(
            regenerate_assistant_turn(
                conn,
                mock_client,
                settings=settings,
                chat_id="chat_bot_a",
                original_assistant_event_id=at_id,
            )
        )
    with open_db(db_path) as conn:
        # An event_status_reverted lands with prior_status='planned'.
        rev_rows = conn.execute(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'event_status_reverted' ORDER BY id"
        ).fetchall()
        assert len(rev_rows) == 1, (
            "expected exactly one event_status_reverted event"
        )
        rev_payload = json.loads(rev_rows[0][0])
        assert rev_payload["event_id"] == "evt_started"
        assert rev_payload["prior_status"] == "planned"
        # Events projection: status is back to 'planned'.
        status = conn.execute(
            "SELECT status FROM events WHERE event_id = ?",
            ("evt_started",),
        ).fetchone()[0]
        assert status == "planned"
 def test_regenerate_rolls_back_event_completed_to_active(tmp_path, monkeypatch):
    """T114.3: a completed event whose completion was triggered by the
    superseded turn rolls back to 'active'. Mirrors the started→planned
    case but exercises the 'completed → active' branch of
    ``_PRIOR_STATUS_MAP`` in regenerate.
    """
    import asyncio
    from chat.config import Settings
    from chat.db.migrate import apply_migrations
    from chat.services.regenerate import regenerate_assistant_turn
    db_path = tmp_path / "test.db"
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
    apply_migrations(db_path)
    _ut_id, at_id = _seed_with_one_turn(db_path)
    # The forward sequence here pretends the prior turn ALSO authored
    # the start (which is realistic — a single turn flow could go
    # planned → active → completed across multiple events). Tagging
    # both with the same back-reference exercises the multi-rollback
    # loop (one per affected lifecycle row).
    _seed_event_with_lifecycle(
        db_path,
        event_id="evt_completed",
        triggered_by_assistant_turn_id=at_id,
        forward_kinds=["event_started", "event_completed"],
    )
    # Sanity: events row is 'completed'.
    with open_db(db_path) as conn:
        status = conn.execute(
            "SELECT status FROM events WHERE event_id = ?", ("evt_completed",)
        ).fetchone()[0]
        assert status == "completed"
    state_canned = json.dumps(
        {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
    )
    no_transitions = json.dumps({"transitions": []})
    mock_client = MockLLMClient(
        canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
    )
    settings = Settings(featherless_api_key="test")
    with open_db(db_path) as conn:
        asyncio.run(
            regenerate_assistant_turn(
                conn,
                mock_client,
                settings=settings,
                chat_id="chat_bot_a",
                original_assistant_event_id=at_id,
            )
        )
    with open_db(db_path) as conn:
        # Two event_status_reverted rows land — one per forward
        # transition that carried the back-reference. Both target the
        # same event_id but with different prior_status values
        # (in event_log id order: started→planned, completed→active).
        rev_rows = conn.execute(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'event_status_reverted' ORDER BY id"
        ).fetchall()
        assert len(rev_rows) == 2
        rev_payloads = [json.loads(r[0]) for r in rev_rows]
        assert rev_payloads[0] == {
            "event_id": "evt_completed",
            "prior_status": "planned",
        }
        assert rev_payloads[1] == {
            "event_id": "evt_completed",
            "prior_status": "active",
        }
        # Events projection: the LAST applied event_status_reverted
        # wins (active). That's the desired final state for a turn
        # that was originally a started+completed double-step.
        status = conn.execute(
            "SELECT status FROM events WHERE event_id = ?",
            ("evt_completed",),
        ).fetchone()[0]
        assert status == "active"
 def test_regenerate_skips_events_without_back_reference(
    tmp_path, monkeypatch, caplog
 ):
    """T114.3 backward compatibility: lifecycle events authored before
    T114.1 lack the ``triggered_by_assistant_turn_id`` payload field.
    Regenerate must NOT emit ``event_status_reverted`` for such rows —
    they're skipped (with a DEBUG log). The legacy T83.4 WARNING about
    un-rolled-back transitions still fires for visibility.
    """
    import asyncio
    import logging
    from chat.config import Settings
    from chat.db.migrate import apply_migrations
    from chat.eventlog.log import append_and_apply
    from chat.services.regenerate import regenerate_assistant_turn
    db_path = tmp_path / "test.db"
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
    apply_migrations(db_path)
    _ut_id, at_id = _seed_with_one_turn(db_path)
    # Seed a lifecycle transition WITHOUT the back-reference field —
    # mimicking pre-T114.1 event_log rows.
    with open_db(db_path) as conn:
        append_and_apply(
            conn,
            kind="event_planned",
            payload={
                "event_id": "evt_legacy",
                "chat_id": "chat_bot_a",
                "kind": "story_event",
                "props": {},
                "planned_for": "2026-04-30T18:00:00+00:00",
            },
        )
        append_and_apply(
            conn,
            kind="event_started",
            payload={
                "event_id": "evt_legacy",
                "started_at": "2026-04-30T19:00:00+00:00",
                # NOTE: no triggered_by_assistant_turn_id — pre-T114.1
                # legacy row.
            },
        )
    state_canned = json.dumps(
        {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
    )
    no_transitions = json.dumps({"transitions": []})
    mock_client = MockLLMClient(
        canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
    )
    settings = Settings(featherless_api_key="test")
    caplog.set_level(logging.DEBUG, logger="chat.services.regenerate")
    with open_db(db_path) as conn:
        asyncio.run(
            regenerate_assistant_turn(
                conn,
                mock_client,
                settings=settings,
                chat_id="chat_bot_a",
                original_assistant_event_id=at_id,
            )
        )
    with open_db(db_path) as conn:
        # No event_status_reverted was emitted for the legacy row.
        rev_count = conn.execute(
            "SELECT COUNT(*) FROM event_log "
            "WHERE kind = 'event_status_reverted'"
        ).fetchone()[0]
        assert rev_count == 0
        # Events row is still 'active' — the legacy transition stands.
        status = conn.execute(
            "SELECT status FROM events WHERE event_id = ?",
            ("evt_legacy",),
        ).fetchone()[0]
        assert status == "active"
    # Debug log surfaces the skipped row.
    debugs = [
        r.getMessage()
        for r in caplog.records
        if r.levelname == "DEBUG"
    ]
    assert any(
        "skipping rollback for lifecycle event_log" in m for m in debugs
    ), f"expected DEBUG about skipped legacy row; got: {debugs}"
    # Legacy WARNING still fires so operators see un-rolled-back rows.
    warnings = [
        r.getMessage()
        for r in caplog.records
        if r.levelname == "WARNING"
        and "lifecycle transition" in r.getMessage()
    ]
    assert warnings, (
        "expected WARNING about un-rolled-back legacy lifecycle "
        f"transitions; got records: "
        f"{[r.getMessage() for r in caplog.records]}"
    )
    # The new wording references the missing back-reference field.
    assert "triggered_by_assistant_turn_id" in warnings[0]
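The three regenerate tests above pin one rule: every forward lifecycle row tagged with the superseded turn yields one `event_status_reverted` payload, and rows without the back-reference are skipped. The sketch below illustrates that rule under stated assumptions. It assumes `_PRIOR_STATUS_MAP` is a plain dict keyed by event kind (its real shape in `chat.services.regenerate` is not shown in this diff), and `plan_reverts` is a hypothetical helper, not the service's actual function.

```python
from typing import Any, Dict, List, Tuple

# Assumed shape: forward lifecycle kind -> status the event held before it.
PRIOR_STATUS_MAP = {
    "event_started": "planned",
    "event_completed": "active",
}


def plan_reverts(
    rows: List[Tuple[str, Dict[str, Any]]], superseded_turn_id: int
) -> List[Dict[str, str]]:
    """Return one revert payload per forward row tagged with the turn.

    ``rows`` holds (kind, payload) pairs in event_log id order. Rows
    that lack ``triggered_by_assistant_turn_id`` (legacy rows) or that
    point at a different turn are skipped.
    """
    reverts = []
    for kind, payload in rows:
        if kind not in PRIOR_STATUS_MAP:
            continue
        if payload.get("triggered_by_assistant_turn_id") != superseded_turn_id:
            continue
        reverts.append(
            {
                "event_id": payload["event_id"],
                "prior_status": PRIOR_STATUS_MAP[kind],
            }
        )
    return reverts


rows = [
    ("event_started", {"event_id": "evt_completed", "triggered_by_assistant_turn_id": 7}),
    ("event_completed", {"event_id": "evt_completed", "triggered_by_assistant_turn_id": 7}),
    ("event_started", {"event_id": "evt_legacy"}),  # no back-reference
]
reverts = plan_reverts(rows, superseded_turn_id=7)
```

Because the payloads are produced in event_log order, a projector that applies them in sequence ends on the last one, which matches the "last applied revert wins" assertion in the completed-to-active test.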
@@ -16,6 +16,7 @@ Verifies the FastAPI ``/search`` route that wraps T93's
 from __future__ import annotations
 from pathlib import Path
 from unittest.mock import patch
 import pytest
 from fastapi.testclient import TestClient
@@ -126,10 +127,75 @@ def test_empty_query_renders_placeholder_not_results(client, tmp_path):
 def test_result_links_navigate_to_chat(client, tmp_path):
    """Each result links back to its originating chat so the user can
-    reopen the thread where the memory was first witnessed."""
+    reopen the thread where the memory was first witnessed.
    Post-T111.2: the link now includes a turn anchor when the memory
    row carries an ``event_id`` (T109's nullable column is populated for
    rows projected after migration 0014 ran). We assert on the chat-id
    portion of the href because the exact event id is autoincrement and
    depends on seed order; the dedicated
    ``test_search_result_link_includes_turn_anchor`` test below pins the
    anchor format itself."""
    _seed_two_chats_with_memories(tmp_path / "test.db")
    resp = client.get("/search?q=rabbit")
    assert resp.status_code == 200
-    # The link target is chat-level (memories don't carry an event_id
-    # column today, so we don't deep-link to a specific turn).
-    assert 'href="/chats/chat_a"' in resp.text
+    assert 'href="/chats/chat_a' in resp.text
+
+
 def test_search_results_include_fts_snippet_with_highlight(client, tmp_path):
    """T111.1: FTS snippet() wraps each match in ``<mark>...</mark>`` so
    the result row visually highlights the term that matched.
    The seeded ``pov_summary`` is ``the rabbit darted across chat_a``;
    SQLite's ``snippet()`` returns the column text with each match token
    wrapped — searching for ``rabbit`` yields a snippet containing
    ``<mark>rabbit</mark>``. Assertion is just that the marker appears
    (the snippet may be truncated with an ellipsis when the indexed text
    runs longer than the configured token window)."""
    _seed_two_chats_with_memories(tmp_path / "test.db")
    resp = client.get("/search?q=rabbit")
    assert resp.status_code == 200
    assert "<mark>rabbit</mark>" in resp.text
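For readers unfamiliar with the FTS5 auxiliary function this test relies on, here is a small standalone sketch of `snippet()` wrapping a match in `<mark>` tags. The table name `memories_fts` and the token window of 8 are assumptions for illustration. The project's real FTS table and window size are not shown in this diff, and the example needs an SQLite build with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-column FTS5 table standing in for the real index.
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(pov_summary)")
conn.execute(
    "INSERT INTO memories_fts (pov_summary) VALUES (?)",
    ("the rabbit darted across chat_a",),
)
# snippet(table, column index, open marker, close marker, ellipsis, max tokens)
row = conn.execute(
    "SELECT snippet(memories_fts, 0, '<mark>', '</mark>', '...', 8) "
    "FROM memories_fts WHERE memories_fts MATCH ?",
    ("rabbit",),
).fetchone()
snippet = row[0]
```

As in the test, the only safe assertion is that the marker is present. The surrounding text may be truncated when the indexed column is longer than the token window.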
 def test_search_result_link_includes_turn_anchor(client, tmp_path):
    """T111.2: result links deep-link to the originating turn via the
    chat-page anchor stamped by Phase 3.5 T86 (``id="turn-{event_id}"``).
    The seeded ``memory_written`` events are projected with
    ``memories.event_id`` populated (T109); the route exposes that id and
    the template builds the link as ``/chats/{chat_id}#turn-{event_id}``.
    We don't assert a specific event id (it's an autoincrement that
    depends on seed order), only that *some* turn anchor is present for
    the chat link the user is about to click."""
    _seed_two_chats_with_memories(tmp_path / "test.db")
    resp = client.get("/search?q=rabbit")
    assert resp.status_code == 200
    assert "/chats/chat_a#turn-" in resp.text
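The link format described in the docstring can be sketched as a small helper. `result_href` is a hypothetical name, and falling back to the chat-level link when `event_id` is null is an assumption drawn from the "when the memory row carries an ``event_id``" wording. The real template logic is not part of this diff.

```python
from typing import Optional


def result_href(chat_id: str, event_id: Optional[int]) -> str:
    """Build a search-result link, deep-linking to the turn when possible."""
    base = f"/chats/{chat_id}"
    if event_id is None:
        # Rows projected before migration 0014 have no event_id to anchor on.
        return base
    return f"{base}#turn-{event_id}"
```

Both forms start with `/chats/chat_a`, which is why the older link test can assert on the chat-id prefix alone.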
 def test_search_results_use_batched_lookups(client, tmp_path):
    """T106: hydration must not fan out to per-row ``get_bot``/
    ``get_chat``/``get_scene`` calls.
    The previous implementation called each helper once per result row
    (worst case 50 rows x 3 helpers = 150 individual queries). The
    batched implementation collects distinct ids and issues at most one
    query per entity kind via ``WHERE id IN (...)``, so the per-row
    helpers should not be invoked at all when there are matches.
    We seed two chats (so both ``get_bot`` and ``get_chat`` would have
    been hit pre-T106) and assert each helper sees zero per-row calls.
    """
    _seed_two_chats_with_memories(tmp_path / "test.db")
    with (
        patch("chat.web.search.get_bot") as mock_get_bot,
        patch("chat.web.search.get_chat") as mock_get_chat,
        patch("chat.web.search.get_scene") as mock_get_scene,
    ):
        resp = client.get("/search?q=rabbit")
    assert resp.status_code == 200
    # Batched IN-list queries replace the per-row helpers entirely.
    assert mock_get_bot.call_count == 0
    assert mock_get_chat.call_count == 0
    assert mock_get_scene.call_count == 0
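The batching idea this test protects can be sketched with plain `sqlite3`: collect the distinct ids, then issue one `WHERE id IN (...)` query per entity kind. The `bots` table, its columns, and `fetch_names_by_id` are hypothetical stand-ins. The actual hydration code in `chat.web.search` is not shown in this diff.

```python
import sqlite3
from typing import Dict, Iterable


def fetch_names_by_id(
    conn: sqlite3.Connection, table: str, id_column: str, ids: Iterable[str]
) -> Dict[str, str]:
    """Fetch ``name`` for every distinct id with a single IN-list query."""
    distinct = sorted(set(ids))
    if not distinct:
        return {}  # nothing to look up, so issue no query at all
    placeholders = ", ".join("?" for _ in distinct)
    # table/id_column are interpolated, so they must be trusted constants
    # from code, never user input; the ids themselves are bound parameters.
    sql = (
        f"SELECT {id_column}, name FROM {table} "
        f"WHERE {id_column} IN ({placeholders})"
    )
    return {row[0]: row[1] for row in conn.execute(sql, distinct)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bots (bot_id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO bots VALUES (?, ?)", [("bot_a", "BotA"), ("bot_b", "BotB")]
)
# Three result rows referencing two distinct bots: one query, two rows back.
bot_names = fetch_names_by_id(conn, "bots", "bot_id", ["bot_a", "bot_b", "bot_a"])
```

With this shape the query count depends on the number of entity kinds, not on the number of result rows.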
@@ -156,6 +156,28 @@ def test_restore_snapshot_wrong_confirm_400(client, tmp_path):
    assert response.status_code == 400
 def test_restore_without_kind_returns_400(client, tmp_path):
    """T105: Missing or empty ``kind`` must be rejected with 400.
    Previously ``kind`` defaulted to ``"periodic"``, which silently 404'd
    when the caller meant a rewind snapshot. Tighten the contract so the
    client must always pass an explicit, valid ``kind``.
    """
    db_path = tmp_path / "test.db"
    _seed_bot(db_path, "bot_a", "BotA")
    snapshot_path = _take_snapshot_via_service(
        db_path, tmp_path, kind="periodic"
    )
    snapshot_id = snapshot_path.stem
    response = client.post(
        f"/snapshots/restore/{snapshot_id}",
        data={"confirm_id": snapshot_id},  # no `kind`
        follow_redirects=False,
    )
    assert response.status_code == 400
 def test_preview_renders_metadata(client, tmp_path):
    db_path = tmp_path / "test.db"
    _seed_bot(db_path, "bot_a", "BotA")
@@ -22,6 +22,7 @@ from chat.db.connection import open_db
 from chat.eventlog.log import append_and_apply, append_event
 from chat.eventlog.projector import project
 from chat.llm.mock import MockLLMClient
 from tests.fixtures import CannedQueue
@pytest.fixture
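The tests in this file call `CannedQueue` as a fluent builder (`parse_turn`, `narrative`, `state_update`, `detect_event_transitions`, `build`). The real class lives in `tests.fixtures` and is not shown in this diff, so the following is only a guess at its shape, reconstructed from the hand-rolled lists it replaces. The class name `CannedQueueSketch`, the zero-valued `state_update` defaults, and the list return type of `build()` are all assumptions.

```python
import json
from typing import Any, Dict, List, Optional


class CannedQueueSketch:
    """Hypothetical builder for an ordered list of canned LLM replies."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def parse_turn(self, segments: List[Dict[str, Any]]) -> "CannedQueueSketch":
        self._items.append(json.dumps({"segments": segments}))
        return self

    def narrative(self, text: str) -> "CannedQueueSketch":
        self._items.append(text)  # narrative replies are raw prose, not JSON
        return self

    def state_update(
        self,
        affinity_delta: int = 0,
        trust_delta: int = 0,
        knowledge_facts: Optional[List[str]] = None,
    ) -> "CannedQueueSketch":
        self._items.append(
            json.dumps(
                {
                    "affinity_delta": affinity_delta,
                    "trust_delta": trust_delta,
                    "knowledge_facts": knowledge_facts or [],
                }
            )
        )
        return self

    def detect_event_transitions(
        self, transitions: List[Dict[str, Any]]
    ) -> "CannedQueueSketch":
        self._items.append(json.dumps({"transitions": transitions}))
        return self

    def build(self) -> List[str]:
        return list(self._items)  # copy, so later builder calls don't leak in


canned = (
    CannedQueueSketch()
    .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
    .narrative("Hi there.")
    .state_update()
    .state_update()
    .build()
)
```

Naming each slot makes the required order of canned replies visible at the call site, which is the readability gain the T116 migration is after.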
@@ -362,14 +363,20 @@ def test_single_bot_turn_no_guest_regression(app_state_setup, tmp_path):
    the chat has no guest, so ``detect_interjection`` is NOT invoked.
    Ends with one user_turn, one assistant_turn, two edge_updates, and a
    single ``memory_written``.
    T116: migrated to :class:`tests.fixtures.CannedQueue` as a proof of
    concept for the structured canned-queue builder.
    """
    _seed(tmp_path / "test.db")
-    canned_parse = json.dumps(
-        {"segments": [{"kind": "dialogue", "text": "hello"}]}
-    )
-    mock = _override_llm(
-        [canned_parse, "Hi there.", _zero_state(), _zero_state()]
-    )
+    canned = (
+        CannedQueue()
+            .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
+            .narrative("Hi there.")
+            .state_update()
+            .state_update()
+            .build()
+    )
+    mock = _override_llm(canned)
    try:
        response = app_state_setup.post(
            "/chats/chat_bot_a/turns", data={"prose": "hello"}
@@ -734,6 +741,19 @@ def test_cancelled_turn_still_closes_scene_when_user_prose_signals_close(
    that as an exception, so we drive the request inside ``with
    pytest.raises``. Despite the exception, the scene_closed event
    must land in the event_log.
    T108 NOTE — this test does NOT actually exercise the cancel path.
    ``_CancelOnStreamMock.stream`` writes ``raise asyncio.CancelledError``
    but ``asyncio`` is not imported at module scope, so the first
    iteration raises ``NameError`` (caught by ``except Exception:`` in
    post_turn, which sets ``primary_truncated=True`` but leaves
    ``cancelled=False``). The function therefore returns 204 normally,
    the dependency-managed connection commits, and ``scene_closed``
    lands. Importing asyncio so the real CancelledError fires reveals
    a transactional bug: ``post_turn``'s end-of-function re-raise
    causes ``open_db``'s dependency teardown to skip ``conn.commit()``,
    rolling back ALL post-cancel writes (user_turn, assistant_turn,
    edge_updates, scene_closed). Deferred for triage — see T108 report.
    """
    from typing import AsyncIterator, Sequence
@@ -828,12 +848,33 @@ def test_cancelled_turn_still_closes_scene_when_user_prose_signals_close(
            "SELECT payload_json FROM event_log "
            "WHERE kind = 'assistant_turn' ORDER BY id"
        ).fetchall()
        # T108: pin the ordering. user_turn must land before
        # scene_closed (close detection runs on prose that is already
        # in the event_log), and any assistant_turn the cancel produced
        # must land between the two (the truncated record is written
        # before the close decision fires).
        ordered = conn.execute(
            "SELECT id, kind FROM event_log "
            "WHERE kind IN ('user_turn', 'scene_closed', 'assistant_turn') "
            "ORDER BY id"
        ).fetchall()
    # Scene close lands despite the cancel.
    assert scene_close_count == 1
    # The cancelled assistant_turn was still recorded (truncated=True).
    assert len(assistant_payload) == 1
    assert json.loads(assistant_payload[0][0])["truncated"] is True
    # T108 ordering pin: user_turn lands first, the truncated
    # assistant_turn (if any) is committed BEFORE the scene_close
    # decision fires, and scene_closed lands last. Close detection
    # relies on user prose being committed to the event_log BEFORE
    # the close decision runs — and the cancelled assistant beat is
    # recorded as a partial before close-detection too.
    kinds_in_order = [row[1] for row in ordered]
    user_idx = kinds_in_order.index("user_turn")
    close_idx = kinds_in_order.index("scene_closed")
    assert user_idx < close_idx
    if "assistant_turn" in kinds_in_order:
        assert user_idx < kinds_in_order.index("assistant_turn") < close_idx
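The T108 note in this test's docstring hinges on a Python detail: since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so `except Exception:` catches the accidental `NameError` but lets a real cancellation through. A minimal standalone illustration follows. `which_handler` is a hypothetical helper, not code from `post_turn`.

```python
import asyncio


def which_handler(simulate_missing_import: bool) -> str:
    """Report which except clause sees the raised error."""
    try:
        if simulate_missing_import:
            # What the mock effectively raised while asyncio was not imported.
            raise NameError("name 'asyncio' is not defined")
        raise asyncio.CancelledError()
    except Exception:
        return "caught by except Exception"
    except asyncio.CancelledError:
        return "bypassed except Exception"


name_error_path = which_handler(simulate_missing_import=True)
cancel_path = which_handler(simulate_missing_import=False)
```

This is why the test passed without exercising the cancel path: the `NameError` was swallowed as an ordinary failure, and the request completed normally.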
 def test_interjection_enqueues_significance_job(app_state_setup, tmp_path):
@@ -945,29 +986,25 @@ def test_turn_with_event_transition_appends_started_event(
            },
        )
-    canned_parse = json.dumps(
-        {"segments": [{"kind": "dialogue", "text": "they arrived"}]}
-    )
-    canned_event_decision = json.dumps(
-        {
-            "transitions": [
-                {
-                    "event_id": "evt_1",
-                    "new_status": "active",
-                    "reason": "they arrived",
-                }
-            ]
-        }
-    )
-    mock = _override_llm(
-        [
-            canned_parse,
-            "They walk in.",
-            _zero_state(),
-            _zero_state(),
-            canned_event_decision,
-        ]
-    )
+    # T116: migrated to :class:`tests.fixtures.CannedQueue`.
+    canned = (
+        CannedQueue()
+            .parse_turn(segments=[{"kind": "dialogue", "text": "they arrived"}])
+            .narrative("They walk in.")
+            .state_update()
+            .state_update()
+            .detect_event_transitions(
+                [
+                    {
+                        "event_id": "evt_1",
+                        "new_status": "active",
+                        "reason": "they arrived",
+                    }
+                ]
+            )
+            .build()
+    )
+    mock = _override_llm(canned)
    try:
        response = app_state_setup.post(
            "/chats/chat_bot_a/turns", data={"prose": "they arrived"}
@@ -989,6 +1026,18 @@ def test_turn_with_event_transition_appends_started_event(
        assert started_payload["event_id"] == "evt_1"
        assert started_payload["started_at"] == "2026-04-26T20:00:00+00:00"
        # T114.1: payload carries the back-reference to the assistant_turn
        # that triggered the transition. The assistant_turn lands in
        # event_log immediately before the event_started, so its id is
        # the largest assistant_turn id in the chat at this point.
        at_id = conn.execute(
            "SELECT id FROM event_log "
            "WHERE kind = 'assistant_turn' "
            "  AND json_extract(payload_json, '$.chat_id') = 'chat_bot_a' "
            "ORDER BY id DESC LIMIT 1"
        ).fetchone()[0]
        assert started_payload["triggered_by_assistant_turn_id"] == at_id
        # The events projection row reflects the active status.
        ev_row = conn.execute(
            "SELECT status, started_at FROM events WHERE event_id = ?",
@@ -1109,18 +1158,23 @@ def test_turn_with_no_active_events_skips_classifier(app_state_setup, tmp_path):
    short-circuits without an LLM call (per T52). The canned queue must
    therefore have ZERO event-detection slots — same shape as the
    Phase 2 no-guest baseline.
    T116: migrated to :class:`tests.fixtures.CannedQueue`.
    """
    _seed(tmp_path / "test.db")
-    canned_parse = json.dumps(
-        {"segments": [{"kind": "dialogue", "text": "hello"}]}
-    )
    # Only 4 slots: parse + narrative + 2 state-updates. NO extra slot for
    # event-detection — non-existent active_events causes the helper to
    # short-circuit before pulling from the queue.
-    mock = _override_llm(
-        [canned_parse, "Hi there.", _zero_state(), _zero_state()]
-    )
+    canned = (
+        CannedQueue()
+            .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
+            .narrative("Hi there.")
+            .state_update()
+            .state_update()
+            .build()
+    )
+    mock = _override_llm(canned)
    try:
        response = app_state_setup.post(
            "/chats/chat_bot_a/turns", data={"prose": "hello"}
@@ -324,11 +324,11 @@ def test_get_scene_returns_none_for_missing(tmp_path):
        assert active_scene(conn, "chat_missing") is None
-def test_schema_version_after_migration_is_13(tmp_path):
+def test_schema_version_after_migration_is_14(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        row = conn.execute(
            "SELECT value FROM meta WHERE key = 'schema_version'"
        ).fetchone()
-        assert int(row[0]) == 13
+        assert int(row[0]) == 14