Phase 4: vector retrieval, branching, drawer polish #6

Merged
dohertj2 merged 41 commits from phase-4 into main 2026-04-27 04:10:26 -04:00
46 changed files with 5846 additions and 73 deletions
+85
View File
@@ -287,3 +287,88 @@ New follow-ups discovered during Phase 3.5 reviews and execution. None are block
- **Scene-close-on-cancel UX revisit** (Phase 2.5 carry-over): T74.3 pinned the existing behavior; revisit if real play-testing surfaces a regression.
- **Cross-feature canned-queue brittleness**: meanwhile-scene close test required a canned response for T65's digest call after T64+T65 merge. Future close-path additions will keep extending the queue. Consider a structured fixture builder rather than positional canned arrays. NOT addressed in Phase 3.5.
- **Lifecycle-transition rollback in regenerate**: T83.4 added a warning log; actual rollback (with proper schema linkage from lifecycle event back to producing turn) is Phase 4 work.
## Phase 4 status
Phase 4 polish shipped end-to-end across 15 tasks (T88T102). Vector retrieval is functional via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions). Branching is data-model + drawer UI. Surgical delete with cascade preview, hide-from-view soft delete, significance review panel, snapshot UX, and cross-chat search all surface from the drawer or top-bar. Test count grew from 343 (Phase 3.5) to ~413 (+70 new tests).
- **Wave 1 — schema + Phase 3.6 carry-overs (parallel)**:
- **T88** `embeddings` table + projector handlers (pure-Python cosine, JSON-blob storage; sqlite-vec deferred).
- **T89** `branches` table + handlers (main bootstrapped; `is_active` flag; partial unique index).
- **T90** Phase 3.6 carry-overs trio — `read_recent_dialogue` chat-id SQL pushdown, lifecycle warning wording tightening, legacy `record_turn_memory` removed.
- **Wave 2 — services (parallel)**:
- **T91** embedding generation service (Phase 4 ships a deterministic SHA-256-derived pseudo-embedding; real model swap is Phase 4.5+).
- **T92** vector search service via pure-Python cosine.
- **T93** cross-chat search service (FTS5 across all owners, no witness filter — admin-style).
- **Wave 3 — services (parallel)**:
- **T94** branching service (`branch_from_event`, `switch_active_branch`, `list_branches_with_metadata`).
- **T95** delete-impact computation service (cascade preview, no DB mutation).
- **Wave 4 — combined retrieval (single)**:
- **T96** combined FTS + vector retrieval ranking via reciprocal-rank fusion (RRF, `RRF_CONST=60`); existing significance/recency boost applied as final pass.
- **Wave 5 — memory write hook + backfill (single)**:
- **T97** `EmbeddingWorker` drains queue and emits `embedding_indexed` events; `memory_write` enqueues per `memory_written`; `backfill_embeddings` script for existing memories; ALL 4 production call sites wired (turns, regenerate, meanwhile, drawer).
- **Wave 6 — drawer Phase 4 bundle (single, 5 sub-features)**:
- **T98.1** branching UI (Branches panel + 3 routes).
- **T98.2** significance review panel (distribution bar chart + per-memory edit).
- **T98.3** hide-from-view toggle + `turn_hidden` `manual_edit` branch.
- **T98.4** surgical delete with cascade preview (reuses existing rewind path; pre-rewind snapshot preserved).
- **T98.5** remaining v1 edits — `narrative_anchor` + weather drawer affordances + 2 new `manual_edit` branches.
- **Wave 7 — UX surfaces (parallel)**:
- **T99** snapshot UX (manual trigger, list, restore with hard-confirm, preview).
- **T100** cross-chat search UX (top-bar form + results page).
- **Wave 8 — polish (parallel)**:
- **T101** cross-feature integration tests (5 multi-feature scenarios).
- **T102** documentation (this section).
### Phase 4.5 / 5 backlog
New follow-ups discovered during Phase 4 reviews and execution. None are blocking; pick up at any time.
#### From T88 review
- **`embeddings` FK lacks `ON DELETE CASCADE`**: deindex events are the only deletion path; if memories ever get deleted directly (raw SQL), embedding rows orphan. Defensible since projector model uses explicit deindex events, but worth a comment or `ON DELETE CASCADE` addition.
#### From T89 review
- **`list_branches(chat_id=...)` filter leaks global branches** (`chat_id IS NULL`) into every chat scope. Intentional? Document.
- **Branch-switch to nonexistent silently leaves zero active branches** — log a warning when this would happen.
#### From T91 review
- **Real embedding model swap**: Phase 4 ships pseudo-embedding (deterministic SHA-256 hash). Phase 4.5+ should swap to a real model (Featherless `bge-small-en-v1.5` if available; or local `sentence-transformers/all-MiniLM-L6-v2`). The 384-dim is hardcoded in `0012_embeddings.sql`; if dim changes, migrate first.
- **`timeout_s` unused on pseudo path** — fine, but log when non-default model falls through to fallback so misconfigured callers don't silently degrade.
#### From T96 review
- **Duplicate `MAX(id)` lookup** between `_composite_rerank` and the fused-path tail — DRY follow-up.
- **`fts_rank=None` for vector-only rows** — document downstream contract.
#### From T98 review
- **`event_id <= 0` guard in `delete_turn`** — currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: 400`.
- **`html.escape()` on `compute_delete_impact` output rendered into the modal** — defense in depth (currently model-controlled strings, but if event payload fields ever appear in descriptions, autoescape needed).
- **Extract delete-impact modal HTML to a Jinja partial** — testability + autoescape inheritance.
#### From T99 review
- **Hoist `datetime`/`timezone` imports to module level** in `chat/web/snapshots.py`.
- **`kind` defaulting in restore/preview** — reject missing `kind` rather than silent 404.
- **`created_at` from file mtime** vs filename-encoded timestamp — small drift if files copied; document.
#### From T100 review
- **Hardcoded `k=50`** — extract to module constant.
- **N+1 lookups (`get_bot`/`get_chat`/`get_scene` per row)** — fine at `k=50`, revisit if `k` grows.
- **FTS highlighting via `snippet()`** — Phase 4 skipped this; UX nice-to-have.
- **Result links chat-level only** — `memories` table has no `event_id` column; deep-linking to specific turn requires schema addition.
#### Deferred items
- **sqlite-vec swap** when host Python supports `enable_load_extension`.
- **Real embedding model** with proper semantic similarity.
- **Branching read-side filter**: T89 ships data-model + UI but event readers don't yet consult `is_active`. Each branch is metadata-only labeled ranges. Consult-on-read is Phase 4.5+ work.
- **Bulk significance re-rate** in drawer (T98.2 deferred — only per-memory edit shipped).
- **Vector index optimization** (HNSW) — only relevant if memory counts grow past pure-Python feasibility.
- **`scene-close-on-cancel` UX revisit** (Phase 2.5 carry-over).
- **Cross-feature canned-queue brittleness fixture builder** (Phase 3 carry-over).
- **Full lifecycle-rollback in regenerate** — Phase 3.5 T83.4 shipped a warning log; proper rollback needs schema-level back-references (`triggered_by_assistant_turn_id` payload field).
+19
View File
@@ -16,6 +16,7 @@ from chat.db.migrate import apply_migrations
from chat.eventlog.log import read_events
from chat.eventlog.projector import apply_event
from chat.services.background import BackgroundWorker
from chat.services.embedding_worker import EmbeddingWorker
from chat.services.snapshot import latest_snapshot_path, restore_from_snapshot
# Trigger handler registration:
@@ -31,7 +32,9 @@ from chat.web.drawer import router as drawer_router
from chat.web.kickoff import router as kickoff_router
from chat.web.middleware import FirstRunRedirectMiddleware
from chat.web.nav import router as nav_router
from chat.web.search import router as search_router
from chat.web.settings import router as settings_router
from chat.web.snapshots import router as snapshots_router
from chat.web.sse import router as sse_router
from chat.web.turns import router as turns_router
@@ -85,9 +88,23 @@ async def lifespan(app: FastAPI):
await worker.start()
app.state.background_worker = worker
# T97: separate worker for the async embedding pass. Each
# ``memory_written`` enqueues an EmbeddingJob; the worker drains the
# queue, calls ``generate_embedding``, and emits ``embedding_indexed``.
# Phase 4's pseudo-embedding path is local so the worker doesn't need
# an LLM client; we still pass one so the Phase 4.5 swap to a real
# model is a one-line change.
embedding_worker = EmbeddingWorker(
conn_factory=lambda: open_db(settings.db_path),
client=_factory(),
)
await embedding_worker.start()
app.state.embedding_worker = embedding_worker
try:
yield
finally:
await embedding_worker.stop()
await worker.stop()
@@ -122,9 +139,11 @@ async def http_exception_handler(request: Request, exc: StarletteHTTPException):
app.include_router(bots_router)
app.include_router(kickoff_router)
app.include_router(settings_router)
app.include_router(snapshots_router)
app.include_router(nav_router)
app.include_router(chat_router)
app.include_router(drawer_router)
app.include_router(search_router)
app.include_router(sse_router)
app.include_router(turns_router)
+14
View File
@@ -0,0 +1,14 @@
-- Embeddings stored as JSON arrays (pure-Python cosine at query time).
-- Phase 4.5+ may swap to sqlite-vec when the host Python supports
-- loadable extensions; the schema is intentionally simple to make that
-- migration straightforward.
CREATE TABLE embeddings (
memory_id INTEGER PRIMARY KEY,
vector_json TEXT NOT NULL, -- JSON array of floats, length = dim
model TEXT NOT NULL,
dim INTEGER NOT NULL,
indexed_at TEXT NOT NULL DEFAULT (datetime('now')),
FOREIGN KEY (memory_id) REFERENCES memories(id)
);
CREATE INDEX embeddings_model_idx ON embeddings(model);
+17
View File
@@ -0,0 +1,17 @@
CREATE TABLE branches (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL UNIQUE,
origin_event_id INTEGER NOT NULL,
head_event_id INTEGER NOT NULL,
chat_id TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
is_active INTEGER NOT NULL DEFAULT 0
);
-- Exactly one row may have is_active = 1 at any time.
CREATE UNIQUE INDEX branches_active_idx ON branches(is_active) WHERE is_active = 1;
-- Bootstrap the main branch. origin_event_id=0 + head_event_id=0 are
-- placeholder seeds; the orchestrator updates head as new events land.
INSERT INTO branches (name, origin_event_id, head_event_id, is_active)
VALUES ('main', 0, 0, 1);
+107
View File
@@ -0,0 +1,107 @@
"""Branching service (T94, Phase 4).
Wraps branches state with validation + event emission. Phase 4 ships
the data model and creation/switching APIs; the read-side filter
(event readers consulting is_active) is a Phase 4.5+ follow-up — for
now branches are metadata-only and the existing event readers remain
branch-agnostic. The drawer UI (T98) drives create/switch via these
helpers.
"""
from __future__ import annotations
from sqlite3 import Connection
from chat.eventlog.log import append_and_apply
from chat.state.branches import get_branch, list_branches, active_branch # noqa: F401
def branch_from_event(
conn: Connection,
*,
name: str,
origin_event_id: int,
chat_id: str | None = None,
) -> int:
"""Create a new named branch forking from origin_event_id.
Emits a branch_created event. Returns the new branch's row id.
Raises ValueError if name already exists or origin_event_id doesn't
correspond to a real event."""
if not name or not name.strip():
raise ValueError("branch name must be non-empty")
if get_branch(conn, name) is not None:
raise ValueError(f"branch {name!r} already exists")
# Validate origin_event_id is a real event id (or 0 for the bootstrap case
# which only main uses).
if origin_event_id < 0:
raise ValueError(f"origin_event_id must be >= 0, got {origin_event_id}")
if origin_event_id > 0:
row = conn.execute(
"SELECT 1 FROM event_log WHERE id = ?", (origin_event_id,)
).fetchone()
if row is None:
raise ValueError(
f"origin_event_id {origin_event_id} does not exist in event_log"
)
append_and_apply(
conn,
kind="branch_created",
payload={
"name": name,
"origin_event_id": origin_event_id,
"head_event_id": origin_event_id, # head starts at origin
"chat_id": chat_id,
},
)
branch = get_branch(conn, name)
if branch is None:
# Should be unreachable if append_and_apply worked.
raise RuntimeError(f"branch {name!r} not found after creation")
return branch["id"]
def switch_active_branch(conn: Connection, *, name: str) -> None:
"""Make the named branch active. Emits branch_switched."""
if get_branch(conn, name) is None:
raise ValueError(f"branch {name!r} does not exist")
append_and_apply(
conn,
kind="branch_switched",
payload={"name": name},
)
def list_branches_with_metadata(
conn: Connection, chat_id: str | None = None
) -> list[dict]:
"""List branches with computed event_count metadata.
event_count = head_event_id - origin_event_id + 1 (when both are set)
OR head_event_id (when origin is 0, e.g., main branch)
OR 0 (when head <= origin, which is the bootstrap state)
"""
branches = list_branches(conn, chat_id)
enriched = []
for b in branches:
origin = b["origin_event_id"]
head = b["head_event_id"]
if head < origin:
event_count = 0
elif origin == 0:
event_count = head
else:
event_count = head - origin + 1
enriched.append({**b, "event_count": event_count})
return enriched
__all__ = [
"branch_from_event",
"switch_active_branch",
"list_branches_with_metadata",
]
+75
View File
@@ -0,0 +1,75 @@
"""Cross-chat search service (T93, Phase 4).
FTS5-based search across ALL owners and ALL chats. Used by the
top-bar search UX (T100) for "where did I last see this character
mention X?" queries. NO witness filter -- this is intentionally a
power-user surface that surfaces memories across POVs.
Mirrors the FTS5 access pattern of ``chat.state.memory.search_memories``
but drops both the ``owner_id = ?`` and the per-witness predicates so a
single query can sweep every chat in the database. The composite
re-rank is also dropped: callers want raw BM25 ordering for the
"highest match strength wins" semantics expected of a global search box.
"""
from __future__ import annotations
from sqlite3 import Connection
def search_all_memories(
conn: Connection,
*,
query: str,
k: int = 20,
) -> list[dict]:
"""Search FTS5 across all owners and chats.
Returns rows with ``{memory_id, owner_id, chat_id, scene_id,
pov_summary, significance, ts, fts_rank}``, sorted by FTS5 BM25
rank ascending (lower rank = stronger match, surfaced first).
The ``memories`` table has no ``ts`` column; we expose ``created_at``
(the projector-side row insertion timestamp) under that key so the
UI does not have to know the storage name.
An empty / whitespace-only ``query`` short-circuits to ``[]`` to
avoid an FTS5 ``MATCH ''`` syntax error and to keep the top-bar
"no input yet" state from triggering a full-table scan.
"""
if not query or not query.strip():
return []
# FTS5 MATCH against the same ``memories_fts`` virtual table that
# backs ``state.memory.search_memories``; the JOIN pulls metadata
# from the content table because the FTS index only stores
# ``pov_summary``. ORDER BY rank ASC because BM25 in FTS5 returns
# negative scores where lower is better.
rows = conn.execute(
"SELECT m.id, m.owner_id, m.chat_id, m.scene_id, "
" m.pov_summary, m.significance, m.created_at, "
" memories_fts.rank "
"FROM memories_fts "
"JOIN memories m ON m.id = memories_fts.rowid "
"WHERE memories_fts MATCH ? "
"ORDER BY memories_fts.rank ASC "
"LIMIT ?",
(query.strip(), k),
).fetchall()
return [
{
"memory_id": r[0],
"owner_id": r[1],
"chat_id": r[2],
"scene_id": r[3],
"pov_summary": r[4],
"significance": r[5],
"ts": r[6],
"fts_rank": r[7],
}
for r in rows
]
__all__ = ["search_all_memories"]
+147
View File
@@ -0,0 +1,147 @@
"""Delete-impact computation service (T95, Phase 4).
Walks event_log forward from a target event_id and produces an ImpactReport
describing what would be removed if rewind-to-target were invoked. Pure
computation — does NOT mutate the database. Used by T98's drawer surgical-
delete UI to render an 'are you sure?' modal before invoking the actual
rewind path (chat/services/rewind.py).
"""
from __future__ import annotations
import json
from sqlite3 import Connection
from pydantic import BaseModel, Field
class DeletedItem(BaseModel):
kind: str
description: str
target_id: int | str | None = None
class ImpactReport(BaseModel):
target_event_id: int
cascading: list[DeletedItem] = Field(default_factory=list)
notes: list[str] = Field(default_factory=list)
def _excerpt(text: str, n: int = 60) -> str:
text = (text or "").strip().replace("\n", " ")
return text if len(text) <= n else text[: n - 1] + ""
def compute_delete_impact(
conn: Connection,
*,
target_event_id: int,
) -> ImpactReport:
"""Compute the cascading impact of rewinding to target_event_id."""
# Verify target exists.
target_row = conn.execute(
"SELECT id, kind, payload_json FROM event_log WHERE id = ?",
(target_event_id,),
).fetchone()
if target_row is None:
return ImpactReport(
target_event_id=target_event_id,
cascading=[],
notes=[f"target event_id {target_event_id} not found"],
)
# Walk forward: every event with id >= target_event_id is in scope.
rows = conn.execute(
"SELECT id, kind, payload_json FROM event_log "
"WHERE id >= ? ORDER BY id ASC",
(target_event_id,),
).fetchall()
cascading: list[DeletedItem] = []
notes: list[str] = []
scene_close_present = False
regenerated_from = None
for row_id, kind, payload_json in rows:
try:
payload = json.loads(payload_json) if payload_json else {}
except (json.JSONDecodeError, TypeError):
payload = {}
if kind == "memory_written":
cascading.append(
DeletedItem(
kind=kind,
description=f"memory: {_excerpt(payload.get('pov_summary', ''))}",
target_id=payload.get("memory_id"),
)
)
elif kind == "edge_update":
src = payload.get("source_id", "?")
tgt = payload.get("target_id", "?")
cascading.append(
DeletedItem(
kind=kind,
description=f"edge update: {src} -> {tgt}",
target_id=f"{src}->{tgt}",
)
)
elif kind == "scene_closed":
scene_close_present = True
cascading.append(
DeletedItem(
kind=kind,
description=f"scene close at {payload.get('closed_at', '?')}",
target_id=payload.get("scene_id"),
)
)
elif kind in ("user_turn", "user_turn_edit", "assistant_turn"):
speaker = payload.get("speaker_id") or ("you" if kind.startswith("user") else "?")
prose = payload.get("prose") or payload.get("text") or ""
cascading.append(
DeletedItem(
kind=kind,
description=f"turn {row_id} ({speaker}: {_excerpt(prose, 50)})",
target_id=row_id,
)
)
if regenerated_from is None and payload.get("regenerated_from"):
regenerated_from = payload["regenerated_from"]
elif kind == "manual_edit":
target_kind = payload.get("target_kind", "?")
cascading.append(
DeletedItem(
kind=kind,
description=f"manual edit: {target_kind}",
target_id=payload.get("target_id"),
)
)
else:
cascading.append(
DeletedItem(
kind=kind,
description=f"{kind} event",
target_id=row_id,
)
)
# Notes / warnings.
notes.append(f"{len(rows)} events would be discarded total")
if scene_close_present:
notes.append(
"scene close events are in scope — closing-scene per-POV summaries "
"and group_node updates will be reverted"
)
if regenerated_from is not None:
notes.append(
f"target turn was regenerated from event_id {regenerated_from}; "
f"the original turn remains intact"
)
return ImpactReport(
target_event_id=target_event_id,
cascading=cascading,
notes=notes,
)
__all__ = ["DeletedItem", "ImpactReport", "compute_delete_impact"]
+137
View File
@@ -0,0 +1,137 @@
"""Embedding worker (T97, Phase 4).
Drains a queue of embedding jobs. Each job carries a memory id and the
narrative text to embed; the worker calls
:func:`chat.services.embeddings.generate_embedding` and emits an
``embedding_indexed`` event so the projector lands the vector in the
``embeddings`` table.
Mirrors the :class:`chat.services.background.BackgroundWorker` pattern:
single asyncio task, sentinel-based shutdown, exceptions are caught and
logged so a flaky embedding call doesn't take down the worker. Each job
opens its own SQLite connection via ``conn_factory`` — the request path
and the worker do not share connections.
Featherless concurrency (the 2-conn cap) is respected by virtue of the
single-task design: jobs run strictly serially. Phase 4's pseudo-embedding
path is local and synchronous so this is largely moot, but the pattern
is in place for the Phase 4.5+ real-embedding swap.
"""
from __future__ import annotations
import asyncio
import logging
from dataclasses import dataclass
from sqlite3 import Connection
from typing import Callable
from chat.eventlog.log import append_and_apply
from chat.services.embeddings import (
DEFAULT_EMBEDDING_DIM,
DEFAULT_EMBEDDING_MODEL,
FALLBACK_EMBEDDING_MODEL,
generate_embedding,
)
log = logging.getLogger(__name__)
@dataclass
class EmbeddingJob:
"""One unit of work for the embedding worker.
``memory_id`` is the row to attach the vector to; ``text`` is the
narrative text to embed (typically ``memories.pov_summary``).
"""
memory_id: int
text: str
class EmbeddingWorker:
"""asyncio.Queue-backed single-worker task for embedding generation.
Started on app startup; ``stop()`` enqueues a sentinel and awaits
the task so any in-flight job has a chance to finish. Pending jobs
after the sentinel are dropped on shutdown.
"""
def __init__(
self,
*,
conn_factory: Callable[[], Connection],
client, # LLMClient | None — unused on the pseudo path.
model: str = DEFAULT_EMBEDDING_MODEL,
dim: int = DEFAULT_EMBEDDING_DIM,
enabled: bool = True,
) -> None:
self._queue: asyncio.Queue[EmbeddingJob | None] = asyncio.Queue()
self._conn_factory = conn_factory
self._client = client
self._model = model
self._dim = dim
self._task: asyncio.Task | None = None
self.enabled = enabled
def enqueue(self, job: EmbeddingJob) -> None:
if not self.enabled:
return
self._queue.put_nowait(job)
async def start(self) -> None:
if self._task is None:
self._task = asyncio.create_task(self._run())
async def stop(self) -> None:
if self._task is None:
return
self._queue.put_nowait(None) # sentinel
await self._task
self._task = None
async def _run(self) -> None:
while True:
job = await self._queue.get()
if job is None:
return
try:
await self._process(job)
except Exception as exc: # noqa: BLE001 — worker must not die
log.warning(
"embedding worker failed for memory_id=%s: %s",
job.memory_id,
exc,
exc_info=True,
)
async def _process(self, job: EmbeddingJob) -> None:
result = await generate_embedding(
self._client,
text=job.text,
model=self._model,
dim=self._dim,
)
if result.model == FALLBACK_EMBEDDING_MODEL:
# Don't index a fallback (zero) vector — the backfill script
# can retry later once a real embedding is available.
log.debug(
"embedding worker skipping fallback result for memory_id=%s",
job.memory_id,
)
return
with self._conn_factory() as conn:
append_and_apply(
conn,
kind="embedding_indexed",
payload={
"memory_id": job.memory_id,
"model": result.model,
"dim": result.dim,
"vector": result.vector,
},
)
__all__ = ["EmbeddingJob", "EmbeddingWorker"]
+108
View File
@@ -0,0 +1,108 @@
"""Embedding generation service (T91, Phase 4).
Wraps the embedding API call. For Phase 4's first cut we ship a
deterministic local pseudo-embedding (hash-derived) so the vector
retrieval pipeline can land without an external embedding endpoint
or heavy local dependency. Phase 4.5+ swaps to a real model — the
EmbeddingResult shape stays the same, only the generator changes.
"""
from __future__ import annotations
import hashlib
import math
import struct
from pydantic import BaseModel
from chat.llm.client import LLMClient
DEFAULT_EMBEDDING_DIM = 384
DEFAULT_EMBEDDING_MODEL = "pseudo-sha256-384"
FALLBACK_EMBEDDING_MODEL = "fallback"
class EmbeddingResult(BaseModel):
vector: list[float]
model: str
dim: int
def _pseudo_embed(text: str, dim: int = DEFAULT_EMBEDDING_DIM) -> list[float]:
"""Deterministic pseudo-embedding for Phase 4 first cut.
Hashes the text with SHA-256, then expands by re-hashing each
successive block with the previous block + a counter — this gives
``dim * 4`` bytes of fresh entropy per input rather than naively
repeating the 32-byte digest (which would collapse the vector onto
only 8 unique floats and make distinct inputs cosine-similar).
Bytes are unpacked as little-endian int32s and rescaled to [-1, 1]
so we sidestep the float32 NaN/denormal values that ``struct.unpack
'f'`` would otherwise produce on raw hash bytes. The result is
unit-normalized so cosine similarity reduces to a dot product.
NOT semantically meaningful — just consistent for testing the
pipeline. Phase 4.5 should swap to a real embedding model.
"""
needed = dim * 4 # 4 bytes per int32
seed = text.encode("utf-8")
chunks: list[bytes] = []
counter = 0
while sum(len(c) for c in chunks) < needed:
block = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
chunks.append(block)
counter += 1
full = b"".join(chunks)[:needed]
ints = struct.unpack(f"<{dim}i", full)
# Map int32 to roughly [-1, 1] — exact bound doesn't matter since we
# normalize, but keeps values numerically tame.
raw = [x / 2147483648.0 for x in ints]
norm = math.sqrt(sum(x * x for x in raw)) or 1.0
return [x / norm for x in raw]
async def generate_embedding(
client: LLMClient,
*,
text: str,
model: str = DEFAULT_EMBEDDING_MODEL,
dim: int = DEFAULT_EMBEDDING_DIM,
timeout_s: float = 30.0,
) -> EmbeddingResult:
"""Generate an embedding for the given text.
Phase 4 default uses a deterministic local pseudo-embedding. If
the LLMClient grows an ``embed(...)`` method in Phase 4.5, this
wrapper will route to it when ``model != "pseudo-sha256-384"``.
Falls back to a zero vector with ``model="fallback"`` on any
failure (callers detect the sentinel and skip indexing). For the
pseudo path, failure is structurally impossible — it's pure local
computation.
"""
if not text or not text.strip():
# Empty input — return fallback so caller doesn't index empty rows.
return EmbeddingResult(
vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
)
if model == DEFAULT_EMBEDDING_MODEL:
# Pure-local pseudo path — no LLMClient call.
return EmbeddingResult(vector=_pseudo_embed(text, dim), model=model, dim=dim)
# Future: real embedding via client.embed(...). Phase 4.5 work.
# For Phase 4, any non-default model falls through to fallback.
return EmbeddingResult(
vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
)
__all__ = [
"DEFAULT_EMBEDDING_DIM",
"DEFAULT_EMBEDDING_MODEL",
"FALLBACK_EMBEDDING_MODEL",
"EmbeddingResult",
"generate_embedding",
]
+42 -57
View File
@@ -13,6 +13,14 @@ Phase 1 simplifications (per plan §11.1, T27 will refine):
pass overwrites via a follow-up event.
- Witness flags are hard-coded ``[you=1, host=1, guest=0]``. Phase 2 will
derive them from ``chat.guest_bot_id`` once a guest can be present.
T97 (Phase 4): each successful memory write also enqueues an
:class:`~chat.services.embedding_worker.EmbeddingJob` on the
lifespan-managed embedding worker, so the just-written memory gets a
vector indexed out-of-band. The hook is opt-in via the ``app`` kwarg —
callers without a FastAPI app handle (e.g. one-off scripts, isolated
unit tests) simply don't enqueue, and the backfill script can pick up
those rows later.
"""
from __future__ import annotations
@@ -20,62 +28,7 @@ from __future__ import annotations
from sqlite3 import Connection
from chat.eventlog.log import append_and_apply
def record_turn_memory(
conn: Connection,
*,
chat_id: str,
host_bot_id: str,
narrative_text: str,
scene_id: int | None = None,
chat_clock_at: str | None = None,
source: str = "direct",
significance: int = 1,
) -> tuple[int, int | None]:
"""Append a ``memory_written`` event for the host bot's POV of this turn.
Uses :func:`chat.eventlog.log.append_and_apply` (not raw
:func:`append_event`) so the new memory row is projected immediately
without re-running prior non-idempotent handlers (e.g. ``edge_update``
deltas).
Returns ``(event_id, memory_id)``. ``event_id`` is the row id of the
just-appended ``memory_written`` event in ``event_log``. ``memory_id``
is the autoincrement PK of the corresponding ``memories`` row — these
are *different* numbers (event_log and memories use independent
rowid sequences) so callers needing to update significance or pin
state must use ``memory_id``. Falls back to ``None`` if the projected
row can't be located, which shouldn't happen but keeps the return
shape stable.
"""
payload: dict = {
"owner_id": host_bot_id,
"chat_id": chat_id,
"pov_summary": narrative_text,
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"source": source,
"reliability": 1.0,
"significance": significance,
"pinned": 0,
"auto_pinned": 0,
}
if scene_id is not None:
payload["scene_id"] = scene_id
if chat_clock_at is not None:
payload["chat_clock_at"] = chat_clock_at
event_id = append_and_apply(conn, kind="memory_written", payload=payload)
row = conn.execute(
"SELECT id FROM memories "
"WHERE owner_id = ? AND chat_id = ? "
"ORDER BY id DESC LIMIT 1",
(host_bot_id, chat_id),
).fetchone()
memory_id = row[0] if row else None
return event_id, memory_id
from chat.services.embedding_worker import EmbeddingJob
def _write_one_memory(
@@ -91,9 +44,16 @@ def _write_one_memory(
chat_clock_at: str | None,
source: str,
significance: int,
app=None,
) -> tuple[int, int | None]:
"""Append a single ``memory_written`` event for ``owner_id`` and return
``(event_id, memory_id)`` for the projected row."""
``(event_id, memory_id)`` for the projected row.
When ``app`` is provided and ``app.state.embedding_worker`` exists,
enqueue an :class:`EmbeddingJob` for the freshly-projected memory id
(T97). Skipped silently if the worker is absent or the projected row
can't be located — the backfill script handles missing-vector rows.
"""
payload: dict = {
"owner_id": owner_id,
"chat_id": chat_id,
@@ -120,6 +80,23 @@ def _write_one_memory(
(owner_id, chat_id),
).fetchone()
memory_id = row[0] if row else None
# T97: enqueue an embedding job for the just-written memory. The
# worker drains the queue out-of-band and emits an
# ``embedding_indexed`` event when the vector is ready. ``getattr``
# keeps this a no-op for callers without a wired-up app (scripts,
# tests) — the backfill script handles those rows.
if memory_id is not None and narrative_text and narrative_text.strip():
worker = (
getattr(app.state, "embedding_worker", None)
if app is not None
else None
)
if worker is not None:
worker.enqueue(
EmbeddingJob(memory_id=memory_id, text=narrative_text)
)
return event_id, memory_id
@@ -135,6 +112,7 @@ def record_turn_memory_for_present(
source: str = "direct",
significance: int = 1,
you_present: bool = True,
app=None,
) -> dict[str, tuple[int, int | None]]:
"""Single entry-point for per-turn memory writes (T84).
@@ -153,6 +131,9 @@ def record_turn_memory_for_present(
with ``you_present=False`` is a programming error and raises
:class:`ValueError`.
When ``app`` is provided, each per-witness write also enqueues an
:class:`EmbeddingJob` on ``app.state.embedding_worker`` (T97).
Returns a mapping ``{bot_id: (event_id, memory_id)}`` so callers can
look up the freshly-projected memory id per owner without re-querying
the database.
@@ -177,6 +158,7 @@ def record_turn_memory_for_present(
chat_clock_at=chat_clock_at,
source=source,
significance=significance,
app=app,
)
if guest_bot_id is not None:
result[guest_bot_id] = _write_one_memory(
@@ -191,6 +173,7 @@ def record_turn_memory_for_present(
chat_clock_at=chat_clock_at,
source=source,
significance=significance,
app=app,
)
return result
@@ -206,6 +189,7 @@ def record_meanwhile_memory(
chat_clock_at: str | None = None,
source: str = "direct",
significance: int = 1,
app=None,
) -> dict[str, tuple[int, int | None]]:
"""Backward-compat thin wrapper for meanwhile memory writes (T64, T84).
@@ -225,4 +209,5 @@ def record_meanwhile_memory(
source=source,
significance=significance,
you_present=False,
app=app,
)
+9 -2
View File
@@ -103,6 +103,7 @@ async def regenerate_assistant_turn(
chat_id: str,
original_assistant_event_id: int,
edited_user_prose: str | None = None,
app=None,
) -> str:
"""Regenerate the assistant turn linked to ``original_assistant_event_id``.
@@ -182,9 +183,13 @@ async def regenerate_assistant_turn(
(chat_id, original_assistant_event_id),
).fetchall()
if unrolled_lifecycle:
# T90.2: phrased as "at-or-after turn <id>" rather than "from
# superseded turn" because regenerating an OLDER turn lists
# intervening-turn transitions that legitimately stand on their
# own — those weren't authored by the superseded turn itself.
_log.warning(
"regenerate_assistant_turn: %d lifecycle transition(s) from "
"superseded turn %s are NOT being rolled back (Phase 4 "
"regenerate_assistant_turn: %d lifecycle transition(s) "
"at-or-after turn %s are NOT being rolled back (Phase 4 "
"follow-up). Affected event ids: %s",
len(unrolled_lifecycle),
original_assistant_event_id,
@@ -410,6 +415,7 @@ async def regenerate_assistant_turn(
narrative_text=new_text,
scene_id=scene["id"] if scene else None,
chat_clock_at=chat.get("time"),
app=app,
)
last_at = chat.get("time")
@@ -644,6 +650,7 @@ async def regenerate_assistant_turn(
narrative_text=interject_text,
scene_id=scene["id"] if scene else None,
chat_clock_at=chat.get("time"),
app=app,
)
# Re-run the multi-pair state-update with the post-interjection
+10 -4
View File
@@ -54,14 +54,21 @@ def read_recent_dialogue(
regenerate to drop the original assistant_turn from its prompt
context window before that row has been marked superseded (the
supersede UPDATE lands at the end so the new event_id is known).
T90.1: the chat_id filter is pushed into SQL via ``json_extract`` so
``LIMIT N`` always returns N rows scoped to the requested chat. The
previous implementation filtered chat_id post-fetch in Python, which
let foreign-chat rows fill the LIMIT and yield fewer than N relevant
rows in busy multi-chat databases.
"""
if exclude_event_id is None:
cur = conn.execute(
"SELECT id, kind, payload_json FROM event_log "
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
" AND superseded_by IS NULL AND hidden = 0 "
" AND json_extract(payload_json, '$.chat_id') = ? "
"ORDER BY id DESC LIMIT ?",
(limit,),
(chat_id, limit),
)
else:
cur = conn.execute(
@@ -69,15 +76,14 @@ def read_recent_dialogue(
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
" AND id != ? "
" AND superseded_by IS NULL AND hidden = 0 "
" AND json_extract(payload_json, '$.chat_id') = ? "
"ORDER BY id DESC LIMIT ?",
(exclude_event_id, limit),
(exclude_event_id, chat_id, limit),
)
rows = list(reversed(cur.fetchall()))
out: list[dict] = []
for row_id, kind, payload_json in rows:
p = json.loads(payload_json)
if p.get("chat_id") != chat_id:
continue
if kind in ("user_turn", "user_turn_edit"):
out.append(
{
+79
View File
@@ -0,0 +1,79 @@
"""Vector search service (T92, Phase 4).
Pure-Python cosine similarity over the embeddings table. Phase 4 ships
this without sqlite-vec because the host Python build doesn't support
loadable extensions. For single-user scale (< few thousand memories
per owner), iterating in Python is sub-millisecond.
Phase 4.5+ may swap to sqlite-vec when the host Python supports
enable_load_extension; the public API stays stable.
"""
from __future__ import annotations
import math
from sqlite3 import Connection
from chat.state.embeddings import list_embeddings_for_owner
_VALID_WITNESS_ROLES = {"you", "host", "guest"}
def _cosine_similarity(a: list[float], b: list[float]) -> float:
"""Cosine similarity. Assumes both vectors are non-zero."""
if len(a) != len(b):
return 0.0
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a)) or 1.0
norm_b = math.sqrt(sum(x * x for x in b)) or 1.0
return dot / (norm_a * norm_b)
def vector_search(
conn: Connection,
*,
owner_id: str,
witness_role: str, # "you" | "host" | "guest"
query_vector: list[float],
k: int = 4,
) -> list[dict]:
"""Return top-K memories by cosine similarity to query_vector,
witness-filtered for the viewer's POV. Returns rows with
{memory_id, pov_summary, significance, score} sorted by score
DESC. Empty list if no embeddings indexed for this owner.
"""
if witness_role not in _VALID_WITNESS_ROLES:
raise ValueError(
f"witness_role must be one of {_VALID_WITNESS_ROLES}, got {witness_role!r}"
)
rows = list_embeddings_for_owner(conn, owner_id)
if not rows:
return []
# Witness-filter by the requesting role.
witness_key = f"witness_{witness_role}"
filtered = [r for r in rows if r.get(witness_key) == 1]
if not filtered:
return []
scored: list[tuple[float, dict]] = []
for row in filtered:
score = _cosine_similarity(query_vector, row["vector"])
scored.append(
(
score,
{
"memory_id": row["memory_id"],
"pov_summary": row["pov_summary"],
"significance": row["significance"],
"score": score,
},
)
)
scored.sort(key=lambda t: t[0], reverse=True)
return [item for _, item in scored[:k]]
__all__ = ["vector_search"]
+133
View File
@@ -0,0 +1,133 @@
"""Branches projector + readers (T89, Phase 4).
A branch is a named fork of the event log. The 'main' branch is bootstrapped
by migration 0013 with is_active=1. Subsequent branches reference an
origin_event_id (the event they forked from). Phase 4 enables creation
and switching; the read-side filter (event readers consulting is_active)
is a Phase 4.5 follow-up — for now branches are metadata-only and the
existing event readers remain branch-agnostic.
"""
from __future__ import annotations
from sqlite3 import Connection
from chat.eventlog.projector import on
from chat.eventlog.log import Event
@on("branch_created")
def _apply_branch_created(conn: Connection, e: Event) -> None:
"""Insert a new branch row with is_active=0. Idempotent via INSERT OR IGNORE."""
p = e.payload
conn.execute(
"INSERT OR IGNORE INTO branches "
"(name, origin_event_id, head_event_id, chat_id, is_active) "
"VALUES (?, ?, ?, ?, 0)",
(
p["name"],
int(p["origin_event_id"]),
int(p.get("head_event_id", p["origin_event_id"])),
p.get("chat_id"),
),
)
@on("branch_switched")
def _apply_branch_switched(conn: Connection, e: Event) -> None:
"""Set is_active=1 on the named branch and is_active=0 on all others.
Atomic via two UPDATEs ordered to avoid the unique-active-index race.
"""
p = e.payload
name = p["name"]
# Clear ALL is_active flags first (avoids the unique-index trip).
conn.execute("UPDATE branches SET is_active = 0 WHERE is_active = 1")
conn.execute(
"UPDATE branches SET is_active = 1 WHERE name = ?",
(name,),
)
@on("branch_head_updated")
def _apply_branch_head_updated(conn: Connection, e: Event) -> None:
"""Update head_event_id on the named branch."""
p = e.payload
conn.execute(
"UPDATE branches SET head_event_id = ? WHERE name = ?",
(int(p["head_event_id"]), p["name"]),
)
def get_branch(conn: Connection, name: str) -> dict | None:
row = conn.execute(
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
" created_at, is_active "
"FROM branches WHERE name = ?",
(name,),
).fetchone()
if not row:
return None
return {
"id": row[0],
"name": row[1],
"origin_event_id": row[2],
"head_event_id": row[3],
"chat_id": row[4],
"created_at": row[5],
"is_active": bool(row[6]),
}
def list_branches(conn: Connection, chat_id: str | None = None) -> list[dict]:
if chat_id is None:
rows = conn.execute(
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
" created_at, is_active "
"FROM branches ORDER BY id ASC"
).fetchall()
else:
rows = conn.execute(
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
" created_at, is_active "
"FROM branches WHERE chat_id = ? OR chat_id IS NULL "
"ORDER BY id ASC",
(chat_id,),
).fetchall()
return [
{
"id": r[0],
"name": r[1],
"origin_event_id": r[2],
"head_event_id": r[3],
"chat_id": r[4],
"created_at": r[5],
"is_active": bool(r[6]),
}
for r in rows
]
def active_branch(conn: Connection) -> dict | None:
row = conn.execute(
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
" created_at, is_active "
"FROM branches WHERE is_active = 1"
).fetchone()
if not row:
return None
return {
"id": row[0],
"name": row[1],
"origin_event_id": row[2],
"head_event_id": row[3],
"chat_id": row[4],
"created_at": row[5],
"is_active": bool(row[6]),
}
__all__ = [
"get_branch",
"list_branches",
"active_branch",
]
+105
View File
@@ -0,0 +1,105 @@
"""Embeddings projector + readers (T88, Phase 4).
Embeddings are stored as JSON-serialized float arrays in a regular
SQLite table. Cosine similarity is computed in Python at query time
(see chat/services/vector_search.py / T92). This deliberately avoids
the sqlite-vec extension dependency — the host Python build doesn't
support enable_load_extension. Phase 4.5+ may revisit if memory counts
grow beyond pure-Python feasibility (~few thousand per query).
"""
from __future__ import annotations
import json
from sqlite3 import Connection
from chat.eventlog.projector import on
from chat.eventlog.log import Event
@on("embedding_indexed")
def _apply_embedding_indexed(conn: Connection, e: Event) -> None:
"""Insert or replace the embedding for a memory.
Idempotent: re-projection or re-indexing replaces the prior vector.
"""
p = e.payload
vector = p["vector"]
conn.execute(
"INSERT OR REPLACE INTO embeddings "
"(memory_id, vector_json, model, dim, indexed_at) "
"VALUES (?, ?, ?, ?, datetime('now'))",
(
int(p["memory_id"]),
json.dumps(list(vector)),
p["model"],
int(p.get("dim") or len(vector)),
),
)
@on("embedding_deindexed")
def _apply_embedding_deindexed(conn: Connection, e: Event) -> None:
"""Remove the embedding for a memory (used by reset cascade)."""
p = e.payload
conn.execute(
"DELETE FROM embeddings WHERE memory_id = ?",
(int(p["memory_id"]),),
)
def get_embedding(conn: Connection, memory_id: int) -> dict | None:
row = conn.execute(
"SELECT memory_id, vector_json, model, dim, indexed_at "
"FROM embeddings WHERE memory_id = ?",
(memory_id,),
).fetchone()
if not row:
return None
return {
"memory_id": row[0],
"vector": json.loads(row[1]),
"model": row[2],
"dim": row[3],
"indexed_at": row[4],
}
def list_embeddings_for_owner(conn: Connection, owner_id: str) -> list[dict]:
"""Return all embeddings for memories owned by ``owner_id``.
Used by vector search at query time (T92). The join carries the
fields the cosine ranker needs to assemble result rows without a
second round-trip: the POV summary text, significance, and witness
flags. The ``memories`` table has no separate ``text`` column —
``pov_summary`` is the canonical narrative text per
``chat/services/memory_write.py``.
"""
rows = conn.execute(
"SELECT e.memory_id, e.vector_json, e.model, e.dim, "
" m.pov_summary, m.significance, "
" m.witness_you, m.witness_host, m.witness_guest "
"FROM embeddings e "
"JOIN memories m ON m.id = e.memory_id "
"WHERE m.owner_id = ?",
(owner_id,),
).fetchall()
return [
{
"memory_id": r[0],
"vector": json.loads(r[1]),
"model": r[2],
"dim": r[3],
"pov_summary": r[4],
"significance": r[5],
"witness_you": r[6],
"witness_host": r[7],
"witness_guest": r[8],
}
for r in rows
]
__all__ = [
"get_embedding",
"list_embeddings_for_owner",
]
+38
View File
@@ -30,6 +30,20 @@ T72.3 adds a per-flag witness toggle:
``{"flag": "you"|"host"|"guest", "value": 0|1}`` and ``prior_value``
mirrors the same shape so an inverse edit can restore the flag.
T98.3 adds a hide-from-view toggle:
- ``turn_hidden`` — flip ``event_log.hidden`` on a single turn row.
Hidden turns are filtered by ``read_recent_dialogue`` (see
:mod:`chat.services.turn_common`) so they vanish from the prompt
without being deleted from the log. ``target_id`` is the integer
``event_log.id`` of the turn; ``new_value`` is ``{"hidden": 0|1}``
and ``prior_value`` mirrors the shape so an inverse edit restores it.
T98.5 finishes the v1 drawer surface with two chat-scope text edits:
- ``chat_narrative_anchor`` and ``chat_weather`` — string overwrites of
the matching ``chat_state`` columns. ``target_id`` is the chat id
(``chats.id``); ``new_value`` is the new string and ``prior_value``
carries the previous content for §6.4 reversibility.
Pin toggles intentionally use the existing ``memory_pin_changed`` event
(registered in :mod:`chat.state.memory`) rather than ``manual_edit`` so
the projection writes both ``pinned`` and ``auto_pinned`` atomically.
@@ -138,5 +152,29 @@ def _apply_manual_edit(conn: Connection, e: Event) -> None:
f"UPDATE memories SET witness_{flag} = ? WHERE id = ?",
(1 if int(new_value["value"]) else 0, int(target_id)),
)
elif kind == "turn_hidden":
# T98.3: hide-from-view toggle on a turn (event_log row). Sets
# ``event_log.hidden`` so :func:`read_recent_dialogue` (which
# filters ``hidden = 0``) drops the row from the prompt window
# without deleting it from the log. ``new_value`` is
# ``{"hidden": 0|1}``.
hidden_int = 1 if int(new_value.get("hidden", 0)) else 0
conn.execute(
"UPDATE event_log SET hidden = ? WHERE id = ?",
(hidden_int, int(target_id)),
)
elif kind == "chat_narrative_anchor":
# T98.5: string overwrite of ``chat_state.narrative_anchor`` for
# the chat keyed by ``target_id``.
conn.execute(
"UPDATE chat_state SET narrative_anchor = ? WHERE chat_id = ?",
(str(new_value), str(target_id)),
)
elif kind == "chat_weather":
# T98.5: string overwrite of ``chat_state.weather``.
conn.execute(
"UPDATE chat_state SET weather = ? WHERE chat_id = ?",
(str(new_value), str(target_id)),
)
# Unknown target_kind: silently no-op for v1. Future kinds (activity
# fields, etc.) extend the dispatch above.
+188 -5
View File
@@ -102,6 +102,15 @@ _RECENCY_WEIGHT = 0.5
# a higher-is-better score by a positive constant per the spec wording.
SIGNIFICANCE_RANK_BIAS = 0.5
# T96 (Phase 4): reciprocal-rank-fusion constant used when ``search_memories``
# is given a ``query_vector`` and must merge FTS + vector candidate lists. The
# value 60 is the canonical RRF constant from Cormack et al. ("Reciprocal Rank
# Fusion outperforms Condorcet and Individual Rank Learning Methods", SIGIR
# 2009): large enough to dampen the head of either ranking so that a strong
# top-1 in ranking A doesn't crowd out a moderate top-3 in ranking B, but
# small enough that the position-1/position-2 gap still matters.
RRF_CONST = 60
def search_memories(
conn: Connection,
@@ -109,6 +118,8 @@ def search_memories(
witness_role: str,
query: str,
k: int = 4,
*,
query_vector: list[float] | None = None,
) -> list[dict]:
"""FTS5 search over pov_summary, scoped by owner and witness role.
@@ -135,6 +146,23 @@ def search_memories(
* **Python-side** — a composite re-rank with ``_SIGNIFICANCE_WEIGHT``
reinforces the ordering after candidate retrieval, alongside the
recency boost above.
PHASE 4 EXTENSION (T96): when ``query_vector`` is provided, fuses FTS and
vector hits via reciprocal-rank fusion (RRF):
fusion_score = 1/(RRF_CONST + fts_rank) + 1/(RRF_CONST + vec_rank)
where ``fts_rank`` and ``vec_rank`` are the 0-indexed positions of the
memory in each candidate list. Each candidate gets the sum of its
reciprocal ranks across both rankings; memories appearing in only one
ranking still get a partial score (the other term is dropped). Both
candidate lists are over-fetched at ``k * 2`` so a memory dominant in
only one channel has a fair chance to surface. The Python-side
significance + recency re-rank is then applied as a final pass to
break ties in favour of more important / more recent memories.
When ``query_vector`` is None: FTS-only behaviour unchanged — all
Phase 1-3.5 callers see the same row shape and ordering as before.
"""
if witness_role not in _VALID_WITNESS_ROLES:
raise ValueError(
@@ -148,7 +176,10 @@ def search_memories(
select_list = ", ".join(f"m.{c}" for c in cols)
# Over-fetch from FTS so the Python-side re-rank has room to reorder
# results that BM25 alone would have demoted past the top-k boundary.
over_fetch = max(k * 4, 20)
# When fusing with a vector ranking, we still over-fetch (k*2 from each
# channel) so memories that are weak in FTS but strong in vector — and
# vice versa — make it into the merge pool.
over_fetch = max(k * 2, 20) if query_vector is not None else max(k * 4, 20)
sql = (
f"SELECT {select_list}, memories_fts.rank AS fts_rank "
"FROM memories_fts "
@@ -165,11 +196,37 @@ def search_memories(
)
cur = conn.execute(sql, (owner_id, query, SIGNIFICANCE_RANK_BIAS, over_fetch))
rows = cur.fetchall()
if not rows:
return []
# Recency normalises against the current max id for this owner so the
# boost magnitude is bounded regardless of dataset size.
# FTS-only path: preserve pre-T96 behaviour exactly.
if query_vector is None:
if not rows:
return []
return _composite_rerank(conn, cols, rows, owner_id, k)
# Fused path: combine FTS candidates with vector candidates via RRF.
return _rrf_fuse_and_rerank(
conn,
cols=cols,
fts_rows=rows,
owner_id=owner_id,
witness_role=witness_role,
query_vector=query_vector,
k=k,
)
def _composite_rerank(
conn: Connection,
cols: list[str],
rows: list[tuple],
owner_id: str,
k: int,
) -> list[dict]:
"""Apply the significance + recency composite re-rank to FTS rows.
Extracted from ``search_memories`` so the no-vector path stays a single
call and the fused path can re-use the same boost formulae after RRF.
"""
max_id_row = conn.execute(
"SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
).fetchone()
@@ -187,3 +244,129 @@ def search_memories(
enriched.sort(key=lambda x: x["composite_score"])
return enriched[:k]
def _rrf_fuse_and_rerank(
conn: Connection,
*,
cols: list[str],
fts_rows: list[tuple],
owner_id: str,
witness_role: str,
query_vector: list[float],
k: int,
) -> list[dict]:
"""Merge FTS + vector candidates via reciprocal-rank fusion, then apply
the existing significance + recency boost as a final tie-breaker.
RRF formula (Cormack et al. 2009)::
fusion_score = sum over rankings r of 1 / (RRF_CONST + rank_r)
where ``rank_r`` is the 0-indexed position of the memory in ranking r.
"Missing from a ranking" is handled by SKIPPING the term for that
ranking — i.e. that channel contributes 0 to the sum, which preserves
the fairness property: a memory that only appears in one ranking is
not penalised relative to itself, just relative to memories that
appeared in both. This matches the canonical RRF presentation.
The final composite score subtracted from the *negated* fusion score
is::
composite = -fusion - sig_boost - recency_boost
Sorted ascending, smaller-is-better — the same ordering convention as
the FTS-only path so the Python-side significance + recency boosts
apply as a clean tie-breaker without inverting any sign.
"""
# Lazy import to avoid a hard module-level cycle: vector_search reads
# from chat.state.embeddings, which is itself a sibling of this module.
from chat.services.vector_search import vector_search
fts_rank_by_id: dict[int, int] = {}
fts_row_by_id: dict[int, tuple] = {}
id_idx = cols.index("id")
for rank, row in enumerate(fts_rows):
memory_id = row[id_idx]
fts_rank_by_id[memory_id] = rank
fts_row_by_id[memory_id] = row
# Over-fetch the vector channel symmetrically so each channel gets a
# fair shot at surfacing its strongest candidates.
vec_over_fetch = max(k * 2, 20)
vec_hits = vector_search(
conn,
owner_id=owner_id,
witness_role=witness_role,
query_vector=query_vector,
k=vec_over_fetch,
)
vec_rank_by_id: dict[int, int] = {
hit["memory_id"]: rank for rank, hit in enumerate(vec_hits)
}
# If the vector channel returned nothing (no embeddings indexed), the
# fused path collapses cleanly to the FTS-only path. No error, no
# surprise zero-hit return.
if not vec_rank_by_id and not fts_row_by_id:
return []
if not vec_rank_by_id:
return _composite_rerank(conn, cols, fts_rows, owner_id, k)
# For any vector-only hits we don't have a full memory row for yet,
# fetch them in a single round-trip. The FTS row carries an ``fts_rank``
# column at the end; vector-only rows get ``None`` there.
missing_ids = [mid for mid in vec_rank_by_id if mid not in fts_row_by_id]
select_list = ", ".join(cols)
if missing_ids:
placeholders = ",".join("?" * len(missing_ids))
cur = conn.execute(
f"SELECT {select_list} FROM memories WHERE id IN ({placeholders})",
missing_ids,
)
for row in cur.fetchall():
# Pad with a None for the trailing ``fts_rank`` slot so the row
# shape matches FTS rows downstream.
fts_row_by_id[row[id_idx]] = tuple(row) + (None,)
# Compute fusion score per candidate. Missing-from-ranking terms are
# simply omitted from the sum.
all_ids = set(fts_rank_by_id) | set(vec_rank_by_id)
fusion_by_id: dict[int, float] = {}
for mid in all_ids:
score = 0.0
if mid in fts_rank_by_id:
score += 1.0 / (RRF_CONST + fts_rank_by_id[mid])
if mid in vec_rank_by_id:
score += 1.0 / (RRF_CONST + vec_rank_by_id[mid])
fusion_by_id[mid] = score
# Final composite re-rank: significance + recency boosts on top of the
# negated fusion score so the sort direction matches the FTS-only path.
max_id_row = conn.execute(
"SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
).fetchone()
max_id = max_id_row[0] if max_id_row and max_id_row[0] else 1
result_cols = cols + ["fts_rank"]
enriched: list[dict] = []
for mid in all_ids:
row = fts_row_by_id.get(mid)
if row is None:
# Defensive: a vector hit with no memory row would be a logic
# bug (vector_search joins memories), so just skip it rather
# than crash the whole search.
continue
d = dict(zip(result_cols, row))
sig_boost = _SIGNIFICANCE_WEIGHT * (d.get("significance") or 0)
recency_boost = _RECENCY_WEIGHT * ((d.get("id") or 0) / max_id)
fusion = fusion_by_id[mid]
# Sort ascending, smaller-is-better → negate fusion so a larger
# fusion score yields a smaller composite. Significance and recency
# boosts then act as tie-breakers exactly like the FTS-only path.
d["fusion_score"] = fusion
d["composite_score"] = -fusion - sig_boost - recency_boost
enriched.append(d)
enriched.sort(key=lambda x: x["composite_score"])
return enriched[:k]
+135
View File
@@ -16,6 +16,26 @@
<p class="muted">No active container.</p>
{% endif %}
<p>Time: {{ chat.time }}</p>
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/chat/narrative-anchor"
hx-target="#drawer" hx-swap="innerHTML">
<label>
Narrative anchor:
<input type="text" name="new_value" maxlength="500"
value="{{ chat.narrative_anchor or '' }}">
</label>
<button type="submit">Save</button>
</form>
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/chat/weather"
hx-target="#drawer" hx-swap="innerHTML">
<label>
Weather:
<input type="text" name="new_value" maxlength="500"
value="{{ chat.weather or '' }}">
</label>
<button type="submit">Save</button>
</form>
{% if scene %}
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/scene/close"
@@ -414,6 +434,121 @@
{% endif %}
</section>
<section class="drawer-section">
<h3>Branches</h3>
{% if branches %}
<ul class="branch-list">
{% for b in branches %}
<li class="branch-row{% if b.is_active %} branch-active{% endif %}">
<strong>{{ b.name }}</strong>
{% if b.is_active %}<span class="muted"> (active)</span>{% endif %}
<span class="muted"> &middot; {{ b.event_count }} events</span>
{% if not b.is_active %}
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/branch/switch"
hx-target="#drawer" hx-swap="innerHTML">
<input type="hidden" name="name" value="{{ b.name }}">
<button type="submit">Switch</button>
</form>
{% endif %}
</li>
{% endfor %}
</ul>
{% else %}
<p class="muted">No branches yet.</p>
{% endif %}
<details>
<summary>Create branch</summary>
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/branch/create"
hx-target="#drawer" hx-swap="innerHTML">
<label>
Name:
<input type="text" name="name" required
placeholder="e.g. experiment_a">
</label>
<label>
Origin event id:
<input type="number" name="origin_event_id" required min="0">
</label>
<button type="submit">Create</button>
</form>
</details>
</section>
<section class="drawer-section">
<h3>Recent turns</h3>
{% if recent_turns %}
<ul class="recent-turns-list">
{% for t in recent_turns %}
<li class="turn-row{% if t.hidden %} turn-hidden{% endif %}">
<span class="muted">#{{ t.event_id }} {{ t.kind }}</span>
<strong>{{ t.speaker }}:</strong>
{{ t.excerpt }}{% if t.excerpt|length >= 120 %}…{% endif %}
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/turn/hide/{{ t.event_id }}"
hx-target="#drawer" hx-swap="innerHTML">
<input type="hidden" name="hidden" value="{{ 0 if t.hidden else 1 }}">
<label>
<input type="checkbox" {% if t.hidden %}checked{% endif %}
onchange="this.form.requestSubmit()">
hide from view
</label>
</form>
</li>
{% endfor %}
</ul>
{% else %}
<p class="muted">No turns yet.</p>
{% endif %}
</section>
<section class="drawer-section">
<h3>Significance review</h3>
{% set total_mem = significance_distribution.values()|sum %}
{% if total_mem %}
<ul class="significance-distribution">
{% for level in [0, 1, 2, 3] %}
{% set count = significance_distribution[level] %}
{% set marker = ['·','•','★','★★'][level] %}
{% set pct = (100 * count / total_mem)|round(0, 'floor')|int if total_mem else 0 %}
<li class="sig-bar sig-{{ level }}">
<span class="sig-label">{{ marker }} ({{ level }})</span>
<span class="sig-bar-fill" style="width: {{ pct }}%"></span>
<span class="sig-count">{{ count }}</span>
</li>
{% endfor %}
</ul>
{% else %}
<p class="muted">No memories yet.</p>
{% endif %}
{% if recent_memories %}
<details>
<summary>Edit significance (recent memories)</summary>
<ul class="significance-edit-list">
{% for m in recent_memories %}
<li>
<span class="sig sig-{{ m.significance }}">{{ ['·','•','★','★★'][m.significance|default(0)] }}</span>
{{ m.pov_summary[:80] }}{% if m.pov_summary|length > 80 %}…{% endif %}
<form class="inline-edit"
hx-post="/chats/{{ chat.id }}/drawer/memory/{{ m.id }}/significance"
hx-target="#drawer" hx-swap="innerHTML">
<label>
Significance:
<input type="range" name="significance" min="0" max="3"
value="{{ m.significance|default(0) }}"
oninput="this.nextElementSibling.value = this.value">
<output>{{ m.significance|default(0) }}</output>
</label>
<button type="submit">Save</button>
</form>
</li>
{% endfor %}
</ul>
</details>
{% endif %}
</section>
<section class="drawer-section">
<h3>Pinned memories ({{ pinned|length }} / {{ pin_cap }})</h3>
{% if pinned %}
+8
View File
@@ -5,8 +5,16 @@
<ul>
<li><a href="/chats" class="{% if active_nav == 'chats' %}active{% endif %}">Chats</a></li>
<li><a href="/bots" class="{% if active_nav == 'bots' %}active{% endif %}">Bots</a></li>
<li><a href="/snapshots" class="{% if active_nav == 'snapshots' %}active{% endif %}">Snapshots</a></li>
<li><a href="/settings" class="{% if active_nav == 'settings' %}active{% endif %}">Settings</a></li>
</ul>
{# T100: cross-chat search box. GET /search so the URL is shareable
and back-button friendly; the results page itself re-renders this
form with the query pre-filled. #}
<form class="rail-search" action="/search" method="get" role="search">
<input type="search" name="q" placeholder="Search" aria-label="Search memories">
<button type="submit">Go</button>
</form>
</nav>
<main class="content">
{% block content %}{% endblock %}
+37
View File
@@ -0,0 +1,37 @@
{% extends "layout.html" %}
{% block title %}Search - chat{% endblock %}
{% block content %}
<header class="page-header">
<h1>Search</h1>
</header>
<form class="search-page-form" action="/search" method="get">
<input type="text" name="q" value="{{ query|default('', true) }}"
placeholder="Search memories across all chats" autofocus>
<button type="submit">Search</button>
</form>
{% if not query %}
{# Empty-state placeholder: the top-bar form submits to /search even
with no input, so this page must render cleanly with no query. #}
<p class="muted search-empty">Enter a query to search memories across all chats.</p>
{% elif not results %}
<p class="muted">No matches for &ldquo;{{ query }}&rdquo;.</p>
{% else %}
<ul class="search-results">
{% for r in results %}
<li class="search-result">
<a class="search-result-link" href="/chats/{{ r.chat_id }}">
<div class="search-result-meta muted">
<strong>{{ r.owner_name }}</strong>
<span>&middot; {{ r.chat_id }}</span>
{% if r.chat_name %}<span>&middot; {{ r.chat_name }}</span>{% endif %}
{% if r.scene_label %}<span>&middot; scene {{ r.scene_label }}</span>{% endif %}
</div>
<div class="search-result-summary">{{ r.pov_summary }}</div>
</a>
</li>
{% endfor %}
</ul>
{% endif %}
{% endblock %}
+66
View File
@@ -0,0 +1,66 @@
{% extends "layout.html" %}
{% block title %}Snapshots - chat{% endblock %}
{% block content %}
<header class="page-header">
<h1>Snapshots</h1>
<form method="post" action="/snapshots/take" class="inline-edit">
<button type="submit">Take snapshot now</button>
</form>
</header>
{% if preview %}
<section class="snapshot-preview">
<h2>Preview: {{ preview.snapshot_id }}</h2>
<dl>
<dt>kind</dt><dd>{{ preview.kind }}</dd>
<dt>filename</dt><dd>{{ preview.filename }}</dd>
<dt>file size (bytes)</dt><dd>{{ preview.file_size_bytes }}</dd>
<dt>snapshot last_event_id</dt><dd>{{ preview.last_event_id }}</dd>
<dt>current event_log max id</dt><dd>{{ preview.current_event_log_max_id }}</dd>
<dt>events since snapshot</dt><dd>{{ preview.event_delta }}</dd>
<dt>events stored in snapshot</dt><dd>{{ preview.event_log_rows_in_snapshot }}</dd>
</dl>
</section>
{% endif %}
{% if snapshots %}
<table class="snapshot-list">
<thead>
<tr>
<th>ID</th>
<th>Kind</th>
<th>Created (UTC)</th>
<th>Size (bytes)</th>
<th>last_event_id</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for snap in snapshots %}
<tr>
<td>{{ snap.snapshot_id }}</td>
<td>{{ snap.kind }}</td>
<td>{{ snap.created_at }}</td>
<td>{{ snap.file_size_bytes }}</td>
<td>{{ snap.last_event_id if snap.last_event_id is not none else '?' }}</td>
<td>
<a href="/snapshots/{{ snap.snapshot_id }}/preview?kind={{ snap.kind }}">Preview</a>
<details class="snapshot-row-restore">
<summary>Restore</summary>
<form method="post" action="/snapshots/restore/{{ snap.snapshot_id }}" class="inline-edit">
<input type="hidden" name="kind" value="{{ snap.kind }}">
<label>Type "{{ snap.snapshot_id }}" to confirm:
<input type="text" name="confirm_id" required>
</label>
<button type="submit">Restore from this snapshot</button>
</form>
</details>
</td>
</tr>
{% endfor %}
</tbody>
</table>
{% else %}
<p class="muted">No snapshots yet. Use "Take snapshot now" to create one.</p>
{% endif %}
{% endblock %}
+398
View File
@@ -36,7 +36,14 @@ from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from chat.eventlog.log import append_and_apply
from chat.services.branching import (
branch_from_event,
list_branches_with_metadata,
switch_active_branch,
)
from chat.services.delete_impact import compute_delete_impact
from chat.services.relationship_seed import seed_inter_bot_edges
from chat.services.rewind import execute_rewind
from chat.services.scene_summarize import apply_scene_close_summary
from chat.state.edges import get_edge
from chat.state.entities import get_bot, get_you, list_bots
@@ -169,6 +176,63 @@ async def drawer(chat_id: str, request: Request, conn=Depends(get_conn)):
active_events = list_active_events(conn, chat_id)
open_threads = list_open_threads(conn, chat_id)
# T98.3: recent turns (user_turn / assistant_turn) for the hide-from-view
# panel. Includes ``hidden`` rows so the user can un-hide them — the
# filter on the read side (read_recent_dialogue) is what drops hidden
# rows from the prompt; the drawer panel always shows everything.
turn_rows = conn.execute(
"""
SELECT id, kind, payload_json, hidden
FROM event_log
WHERE kind IN ('user_turn', 'assistant_turn', 'user_turn_edit')
AND superseded_by IS NULL
ORDER BY id DESC
LIMIT ?
""",
(RECENT_LIMIT,),
).fetchall()
recent_turns: list[dict] = []
for row in turn_rows:
try:
payload = json.loads(row[2]) if row[2] else {}
except (json.JSONDecodeError, TypeError):
payload = {}
if payload.get("chat_id") != chat_id:
continue
text = payload.get("prose") or payload.get("text") or ""
speaker = payload.get("speaker_id") or (
"you" if row[1].startswith("user") else "?"
)
recent_turns.append(
{
"event_id": int(row[0]),
"kind": row[1],
"speaker": speaker,
"excerpt": (text or "").replace("\n", " ")[:120],
"hidden": bool(row[3]),
}
)
# T98.1: branch metadata (every chat sees the global branch list — branches
# may be chat-scoped or global, so :func:`list_branches_with_metadata`
# returns both flavours and the template highlights the active one).
branches = list_branches_with_metadata(conn, chat_id)
# T98.2: significance distribution across this chat's memories. Powers
# the "Significance review" panel — a small histogram letting authors
# spot lopsided buckets (e.g. nothing significant=3 yet) and triage by
# editing individual memory significance values.
sig_rows = conn.execute(
"SELECT significance, COUNT(*) FROM memories "
"WHERE chat_id = ? GROUP BY significance ORDER BY significance",
(chat_id,),
).fetchall()
significance_distribution = {int(r[0]): int(r[1]) for r in sig_rows}
# Ensure every bucket 0..3 is present so the bar-chart template can
# render a stable axis even when a level has zero rows.
for level in (0, 1, 2, 3):
significance_distribution.setdefault(level, 0)
return TEMPLATES.TemplateResponse(
request,
"_drawer.html",
@@ -196,6 +260,9 @@ async def drawer(chat_id: str, request: Request, conn=Depends(get_conn)):
"pin_cap": PIN_CAP,
"active_events": active_events,
"open_threads": open_threads,
"branches": branches,
"significance_distribution": significance_distribution,
"recent_turns": recent_turns,
},
)
@@ -993,6 +1060,7 @@ async def skip_elision(
chat_id=chat_id,
new_time=new_time,
landing_state_hint=landing_state_hint,
app=request.app,
)
except ChatNotFoundError as exc:
# Missing chat row: typed exception (T81) replaces the prior
@@ -1036,6 +1104,7 @@ async def skip_jump(
new_time=new_time,
notable_prose=notable_prose,
reset_activity=reset_flag,
app=request.app,
)
except ChatNotFoundError as exc:
# Missing chat row: typed exception (T81) replaces the prior
@@ -1078,3 +1147,332 @@ async def close_thread(
},
)
return await drawer(chat_id, request, conn)
# --- T98.1 branching UI --------------------------------------------------
#
# Three POST endpoints wired to the Phase 4 :mod:`chat.services.branching`
# helpers. The drawer's "Branches" panel exposes:
#
# * Create from a free-form ``origin_event_id``.
# * Switch the active branch by name.
# * Convenience "branch from this turn" against a per-turn event_id (the
# chat surface stamps ``id="turn-<event_id>"`` on every turn so users can
# pick the right one without copying ids by hand).
#
# All three return the refreshed drawer partial; failures from the service
# layer (duplicate name, unknown branch, invalid origin) surface as 400 so
# HTMX displays the inline error.
@router.post(
"/chats/{chat_id}/drawer/branch/create",
response_class=HTMLResponse,
)
async def create_branch(
chat_id: str,
request: Request,
name: str = Form(...),
origin_event_id: int = Form(...),
conn=Depends(get_conn),
):
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
try:
branch_from_event(
conn,
name=name,
origin_event_id=int(origin_event_id),
chat_id=chat_id,
)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
return await drawer(chat_id, request, conn)
@router.post(
"/chats/{chat_id}/drawer/branch/switch",
response_class=HTMLResponse,
)
async def switch_branch(
chat_id: str,
request: Request,
name: str = Form(...),
conn=Depends(get_conn),
):
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
try:
switch_active_branch(conn, name=name)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
return await drawer(chat_id, request, conn)
@router.get(
"/chats/{chat_id}/drawer/turn/delete-preview/{event_id}",
response_class=HTMLResponse,
)
async def delete_preview(
chat_id: str,
event_id: int,
request: Request,
conn=Depends(get_conn),
):
"""Render an :class:`ImpactReport` for ``event_id`` as a small modal.
Read-only :func:`compute_delete_impact` does not mutate the
database. The modal contains a confirmation form posting to
:func:`delete_turn` below; HTMX swaps the fragment into a modal
target on the chat page.
"""
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
report = compute_delete_impact(conn, target_event_id=int(event_id))
# Build the modal HTML directly — the impact report is small and
# reusing the drawer template would require a fragment include just
# for this surface. Mirrors the rewind-preview style in
# :func:`chat.web.turns.rewind_preview`.
items_html = "".join(
f"<li><strong>{item.kind}</strong>: {item.description}</li>"
for item in report.cascading
)
notes_html = "".join(f"<li>{note}</li>" for note in report.notes)
body = (
"<div class='delete-impact-modal'>"
f"<h3>Delete event {report.target_event_id}?</h3>"
f"<p>This will discard {len(report.cascading)} events. Cascade:</p>"
f"<ul class='delete-impact-cascade'>{items_html or '<li>none</li>'}</ul>"
f"<ul class='delete-impact-notes'>{notes_html}</ul>"
f"<form hx-post='/chats/{chat_id}/drawer/turn/delete/{report.target_event_id}' "
"hx-target='#drawer' hx-swap='innerHTML'>"
"<button type='submit'>Confirm delete</button>"
"</form>"
"</div>"
)
return HTMLResponse(body)
@router.post(
"/chats/{chat_id}/drawer/turn/delete/{event_id}",
response_class=HTMLResponse,
)
async def delete_turn(
chat_id: str,
event_id: int,
request: Request,
conn=Depends(get_conn),
):
"""Delete a turn (and everything after) by invoking the existing rewind path.
The :func:`chat.services.rewind.execute_rewind` API takes
``after_event_id``: it removes events with id strictly greater than
that argument. To make ``event_id`` itself disappear we pass
``after_event_id = event_id - 1`` a thin adapter, not a
re-implementation of rewind.
A snapshot is taken before truncation (inside ``execute_rewind``)
so the user can recover via the snapshot index.
"""
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
settings = request.app.state.settings
execute_rewind(
db_path=settings.db_path,
data_dir=settings.data_dir,
after_event_id=int(event_id) - 1,
)
# ``conn`` is now stale (the rewind opened its own connection and
# truncated/reprojected). Re-render the drawer through a fresh open
# so the partial reflects the truncated state.
from chat.db.connection import open_db
with open_db(settings.db_path) as fresh:
return await drawer(chat_id, request, fresh)
@router.post(
"/chats/{chat_id}/drawer/turn/hide/{event_id}",
response_class=HTMLResponse,
)
async def hide_turn(
chat_id: str,
event_id: int,
request: Request,
hidden: int = Form(...),
conn=Depends(get_conn),
):
"""Toggle ``event_log.hidden`` on a turn via the ``turn_hidden``
``manual_edit`` projector branch.
The route validates the target is an actual turn-shaped row in this
chat (so a stray click on the chat panel can't hide a system event)
and snapshots the prior ``hidden`` value for §6.4 reversibility.
"""
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
row = conn.execute(
"SELECT kind, payload_json, hidden FROM event_log WHERE id = ?",
(int(event_id),),
).fetchone()
if row is None:
raise HTTPException(
status_code=404, detail=f"event not found: {event_id}"
)
if row[0] not in ("user_turn", "assistant_turn", "user_turn_edit"):
raise HTTPException(
status_code=400,
detail=f"event {event_id} is not a turn (kind={row[0]})",
)
try:
payload = json.loads(row[1]) if row[1] else {}
except (json.JSONDecodeError, TypeError):
payload = {}
if payload.get("chat_id") != chat_id:
raise HTTPException(
status_code=404,
detail=f"event {event_id} not in chat {chat_id}",
)
prior_hidden = 1 if int(row[2]) else 0
new_hidden = 1 if int(hidden) else 0
append_and_apply(
conn,
kind="manual_edit",
payload={
"target_kind": "turn_hidden",
"target_id": int(event_id),
"prior_value": {"hidden": prior_hidden},
"new_value": {"hidden": new_hidden},
},
)
return await drawer(chat_id, request, conn)
# --- T98.5 chat narrative anchor + weather ----------------------------
#
# Audit (T98.5) found two §6.4 fields without drawer affordances despite
# both being prose strings stored on ``chat_state``: ``narrative_anchor``
# (the "Day 1" / "morning of the gala" hint above the chat clock) and
# ``weather``. Both land via the existing ``manual_edit`` projector with
# new branches added in :mod:`chat.state.manual_edit`. The container
# ``properties_json`` blob is more invasive — bounded JSON edits aren't
# wired through manual_edit and the drawer never surfaces multiple
# containers at once, so it stays out of v1.
CHAT_NARRATIVE_ANCHOR_MAX = 500
CHAT_WEATHER_MAX = 500
@router.post(
"/chats/{chat_id}/drawer/chat/narrative-anchor",
response_class=HTMLResponse,
)
async def edit_chat_narrative_anchor(
chat_id: str,
request: Request,
new_value: str = Form(...),
conn=Depends(get_conn),
):
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
if len(new_value) > CHAT_NARRATIVE_ANCHOR_MAX:
raise HTTPException(
status_code=400,
detail=(
f"narrative_anchor exceeds {CHAT_NARRATIVE_ANCHOR_MAX} chars "
f"(got {len(new_value)})"
),
)
prior = chat.get("narrative_anchor") or ""
append_and_apply(
conn,
kind="manual_edit",
payload={
"target_kind": "chat_narrative_anchor",
"target_id": chat_id,
"prior_value": prior,
"new_value": new_value,
},
)
return await drawer(chat_id, request, conn)
@router.post(
"/chats/{chat_id}/drawer/chat/weather",
response_class=HTMLResponse,
)
async def edit_chat_weather(
chat_id: str,
request: Request,
new_value: str = Form(...),
conn=Depends(get_conn),
):
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
if len(new_value) > CHAT_WEATHER_MAX:
raise HTTPException(
status_code=400,
detail=(
f"weather exceeds {CHAT_WEATHER_MAX} chars "
f"(got {len(new_value)})"
),
)
prior = chat.get("weather") or ""
append_and_apply(
conn,
kind="manual_edit",
payload={
"target_kind": "chat_weather",
"target_id": chat_id,
"prior_value": prior,
"new_value": new_value,
},
)
return await drawer(chat_id, request, conn)
@router.post(
"/chats/{chat_id}/drawer/branch/from-turn/{event_id}",
response_class=HTMLResponse,
)
async def branch_from_turn(
chat_id: str,
event_id: int,
request: Request,
name: str = Form(...),
conn=Depends(get_conn),
):
"""Convenience: branch from a specific turn event.
Identical to :func:`create_branch` except ``origin_event_id`` is
encoded in the URL the chat surface renders one such form per turn
so users can fork mid-conversation without authoring an event id by
hand.
"""
chat = get_chat(conn, chat_id)
if chat is None:
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
try:
branch_from_event(
conn,
name=name,
origin_event_id=int(event_id),
chat_id=chat_id,
)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
return await drawer(chat_id, request, conn)
+2
View File
@@ -131,6 +131,7 @@ async def process_meanwhile_turn(
*,
chat_id: str,
prose: str,
app=None,
) -> dict:
"""Run one meanwhile turn end-to-end.
@@ -314,6 +315,7 @@ async def process_meanwhile_turn(
narrative_text=text,
scene_id=scene_id,
chat_clock_at=chat.get("time"),
app=app,
)
# 9. Post-turn state-update — exactly 2 directed pairs over the
+92
View File
@@ -0,0 +1,92 @@
"""T100 (Phase 4): cross-chat search UX route.
Wraps T93's :func:`chat.services.cross_chat_search.search_all_memories`
in a small read-only HTML surface so the top-bar search input has
somewhere to land. The route does no filtering of its own beyond the
empty-query fast-path that T93 already implements; ranking, owner
scope, and witness scope all live in the service layer.
For each match we hydrate just enough metadata to render a row:
* the owner bot's display name (so users see "BOTA" not "bot_a"),
* the originating ``chat_id`` (the link target there's no per-turn
anchor today because memories don't carry an ``event_id`` column,
so we deep-link to the chat as a whole),
* the originating scene title when one exists,
* and the ``pov_summary`` itself.
We deliberately keep this module synchronous and template-only no
HTMX swaps, no JSON API because the search box is a "leave the
current chat to look something up" surface, not an inline drawer.
"""
from __future__ import annotations
from pathlib import Path
from fastapi import APIRouter, Depends, Request
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from chat.services.cross_chat_search import search_all_memories
from chat.state.entities import get_bot
from chat.state.world import get_chat, get_scene
from chat.web.bots import get_conn
TEMPLATES = Jinja2Templates(
directory=str(Path(__file__).resolve().parent.parent / "templates")
)
router = APIRouter()
@router.get("/search", response_class=HTMLResponse)
async def search(request: Request, q: str = "", conn=Depends(get_conn)):
"""Render ``search.html`` with up to 50 cross-chat FTS matches.
``q`` is intentionally allowed to be empty that path renders the
page's "enter a query" placeholder rather than a 400, because the
top-bar form submits to this URL even with an empty input. T93's
service short-circuits whitespace-only queries to ``[]`` so there
is no FTS5 ``MATCH ''`` syntax error to guard against here.
"""
raw_results = search_all_memories(conn, query=q, k=50) if q else []
# Hydrate display fields per row. We do this in the route (not the
# service) so the service stays a pure FTS shim that other UIs
# can reuse.
results = []
for row in raw_results:
bot = get_bot(conn, row["owner_id"])
chat = get_chat(conn, row["chat_id"])
scene = get_scene(conn, row["scene_id"]) if row["scene_id"] else None
results.append(
{
"memory_id": row["memory_id"],
"owner_id": row["owner_id"],
"owner_name": bot["name"] if bot else row["owner_id"],
"chat_id": row["chat_id"],
"chat_name": (
chat.get("narrative_anchor") if chat else None
),
"scene_id": row["scene_id"],
# Scenes have no ``title`` column today; surface the
# ``started_at`` timestamp as a human-friendly label
# when a scene is set, otherwise leave it blank.
"scene_label": (
scene.get("started_at") if scene else None
),
"pov_summary": row["pov_summary"],
"significance": row["significance"],
"ts": row["ts"],
}
)
return TEMPLATES.TemplateResponse(
request,
"search.html",
{
"query": q,
"results": results,
"active_nav": "search",
},
)
+3
View File
@@ -91,6 +91,7 @@ async def process_elision_skip(
chat_id: str,
new_time: str,
landing_state_hint: str = "",
app=None,
) -> dict:
"""Run an elision skip end-to-end.
@@ -175,6 +176,7 @@ async def process_jump_skip(
new_time: str,
notable_prose: str = "",
reset_activity: bool = False,
app=None,
) -> dict:
"""Run a jump skip end-to-end.
@@ -254,6 +256,7 @@ async def process_jump_skip(
chat_clock_at=new_time,
source="synthesized",
significance=mem.significance,
app=app,
)
narration = await narrate_skip(
+190
View File
@@ -0,0 +1,190 @@
"""Snapshot UX routes (T99).
Surfaces the existing snapshot service (``chat/services/snapshot.py``)
through HTML so the user can see, take, restore, and preview snapshots
without dropping to a shell.
Routes:
* ``GET /snapshots`` list all snapshots (both kinds)
* ``POST /snapshots/take`` take a periodic snapshot now
* ``POST /snapshots/restore/{id}`` restore (requires matching ``confirm_id``)
* ``GET /snapshots/{id}/preview`` show metadata + delta vs current
The ``snapshot_id`` is the filename stem (the UTC timestamp written by
:func:`chat.services.snapshot.take_snapshot`) there's no separate UUID,
and the timestamp filename is already unique per snapshot kind. Both
periodic and rewind snapshots share the same id space lookup-wise, so
the restore + preview routes accept ``kind`` as a form/query param to
disambiguate.
"""
from __future__ import annotations
import json
from pathlib import Path
from fastapi import APIRouter, Depends, Form, HTTPException, Request
from fastapi.responses import HTMLResponse, RedirectResponse
from fastapi.templating import Jinja2Templates
from chat.services.snapshot import (
restore_from_snapshot,
take_snapshot,
)
from chat.web.bots import get_conn
TEMPLATES = Jinja2Templates(
directory=str(Path(__file__).resolve().parent.parent / "templates")
)
router = APIRouter()
SNAPSHOT_KINDS = ("periodic", "rewind")
def _list_all_snapshots(data_dir: Path) -> list[dict]:
"""Walk ``data/snapshots/{kind}/`` for both kinds and collect metadata.
Each entry exposes the fields the template needs: ``snapshot_id``
(filename stem), ``kind``, ``created_at`` (file mtime as ISO), the
on-disk ``file_size_bytes``, and the snapshot's stored
``last_event_id`` (parsed from the JSON body small enough that
listing isn't a performance concern for the handful of files we keep).
"""
from datetime import datetime, timezone
rows: list[dict] = []
for kind in SNAPSHOT_KINDS:
snap_dir = data_dir / "snapshots" / kind
if not snap_dir.exists():
continue
for path in sorted(snap_dir.glob("*.json")):
try:
dump = json.loads(path.read_text())
last_event_id = dump.get("last_event_id", 0)
except (OSError, json.JSONDecodeError):
# Corrupt or unreadable files still get listed so the
# user can see and delete them; just don't crash here.
last_event_id = None
stat = path.stat()
rows.append(
{
"snapshot_id": path.stem,
"kind": kind,
"created_at": datetime.fromtimestamp(
stat.st_mtime, tz=timezone.utc
).isoformat(),
"file_size_bytes": stat.st_size,
"last_event_id": last_event_id,
"filename": path.name,
}
)
# Newest first for display.
rows.sort(key=lambda r: r["created_at"], reverse=True)
return rows
def _resolve_snapshot_path(
data_dir: Path, snapshot_id: str, kind: str
) -> Path:
"""Map an ``(id, kind)`` pair to the on-disk file, or 404."""
if kind not in SNAPSHOT_KINDS:
raise HTTPException(status_code=400, detail=f"unknown kind: {kind}")
path = data_dir / "snapshots" / kind / f"{snapshot_id}.json"
if not path.exists():
raise HTTPException(status_code=404, detail="snapshot not found")
return path
@router.get("/snapshots", response_class=HTMLResponse)
async def snapshots_list(request: Request):
settings = request.app.state.settings
rows = _list_all_snapshots(settings.data_dir)
return TEMPLATES.TemplateResponse(
request,
"snapshots.html",
{"snapshots": rows, "active_nav": "snapshots"},
)
@router.post("/snapshots/take")
async def snapshots_take(request: Request, conn=Depends(get_conn)):
"""Take a periodic snapshot now.
We use ``kind="periodic"`` for manual snapshots since they're
user-initiated checkpoints, not pre-rewind safety dumps. They count
against the 5-snapshot retention but that's fine — manual ones are
the most recent so they're the last to be pruned.
"""
settings = request.app.state.settings
take_snapshot(conn, data_dir=settings.data_dir, kind="periodic")
return RedirectResponse(url="/snapshots", status_code=303)
@router.post("/snapshots/restore/{snapshot_id}")
async def snapshots_restore(
snapshot_id: str,
request: Request,
confirm_id: str = Form(""),
kind: str = Form("periodic"),
conn=Depends(get_conn),
):
"""Hard-confirm restore: ``confirm_id`` must equal the path id.
Mismatched confirm 400 (without touching the DB). On match, the
existing :func:`restore_from_snapshot` clears projected tables and
re-loads them from the dump.
"""
if confirm_id != snapshot_id:
raise HTTPException(
status_code=400,
detail="confirm_id does not match snapshot id",
)
settings = request.app.state.settings
path = _resolve_snapshot_path(settings.data_dir, snapshot_id, kind)
restore_from_snapshot(conn, path)
return RedirectResponse(url="/snapshots", status_code=303)
@router.get("/snapshots/{snapshot_id}/preview", response_class=HTMLResponse)
async def snapshots_preview(
snapshot_id: str,
request: Request,
kind: str = "periodic",
conn=Depends(get_conn),
):
"""Show snapshot metadata + a basic delta against the current event log.
Phase 4 keeps this simple: the snapshot's ``last_event_id`` plus the
current ``MAX(event_log.id)`` is enough to tell the user how far the
log has moved on. A richer per-table diff is a Phase 4.5+ concern.
"""
settings = request.app.state.settings
path = _resolve_snapshot_path(settings.data_dir, snapshot_id, kind)
dump = json.loads(path.read_text())
last_event_id = dump.get("last_event_id", 0)
cur = conn.execute("SELECT MAX(id) FROM event_log")
row = cur.fetchone()
current_max_id = row[0] if row[0] is not None else 0
stat = path.stat()
return TEMPLATES.TemplateResponse(
request,
"snapshots.html",
{
"snapshots": _list_all_snapshots(settings.data_dir),
"active_nav": "snapshots",
"preview": {
"snapshot_id": snapshot_id,
"kind": kind,
"filename": path.name,
"file_size_bytes": stat.st_size,
"last_event_id": last_event_id,
"current_event_log_max_id": current_max_id,
"event_delta": current_max_id - last_event_id,
"event_log_rows_in_snapshot": len(dump.get("event_log", [])),
},
},
)
+5
View File
@@ -248,6 +248,7 @@ async def post_turn(
settings,
chat_id=chat_id,
prose=prose,
app=request.app,
)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
@@ -352,6 +353,7 @@ async def post_turn(
new_time=new_time,
landing_state_hint=getattr(parsed, "landing_state_hint", "")
or "",
app=request.app,
)
except ChatNotFoundError as exc:
# Defensive: chat existence is checked above, so this only
@@ -512,6 +514,7 @@ async def post_turn(
narrative_text=primary_text,
scene_id=scene["id"] if scene else None,
chat_clock_at=chat.get("time"),
app=request.app,
)
# 7b. Post-turn state-update pass (Requirements §3.4 / T40). All
@@ -746,6 +749,7 @@ async def post_turn(
narrative_text=interjection_text,
scene_id=scene["id"] if scene else None,
chat_clock_at=chat.get("time"),
app=request.app,
)
# T74.2: enqueue a significance pass for the interjection
@@ -1092,6 +1096,7 @@ async def regenerate_turn(
chat_id=chat_id,
original_assistant_event_id=event_id,
edited_user_prose=edited_prose,
app=request.app,
)
except ValueError as e:
raise HTTPException(status_code=404, detail=str(e))
@@ -520,6 +520,8 @@ Written per witness when a scene closes. Different details, different interpreta
### Phase 4 — polish
**Status: shipped 2026-04-27** (T88T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
- Vector retrieval (sqlite-vss or sqlite-vec).
- Branching UI.
- Drawer-edit on every field.
+97
View File
@@ -0,0 +1,97 @@
"""Backfill embeddings for memories that lack them (T97, Phase 4).
Walks all memories where no row exists in the ``embeddings`` table. For
each, calls :func:`chat.services.embeddings.generate_embedding` and emits
an ``embedding_indexed`` event so the projector lands the vector.
Phase 4 ships the deterministic local pseudo-embedding so this script
runs synchronously without a network round-trip the LLMClient argument
is not needed on the pseudo path. Phase 4.5+ will need a real client.
Run from the repo root:
.venv/bin/python scripts/backfill_embeddings.py [--limit N] [--dry-run]
"""
from __future__ import annotations
import argparse
import asyncio
from chat.config import load_settings
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_and_apply
from chat.services.embeddings import (
FALLBACK_EMBEDDING_MODEL,
generate_embedding,
)
# Trigger projector handler registration so ``append_and_apply`` lands
# the embedding rows correctly.
import chat.state.embeddings # noqa: F401
import chat.state.entities # noqa: F401
import chat.state.memory # noqa: F401
import chat.state.world # noqa: F401
async def main() -> None:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--limit",
type=int,
default=None,
help="Cap the number of memories backfilled in this run.",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Print the count of memories needing embeddings, then exit.",
)
args = parser.parse_args()
settings = load_settings()
settings.db_path.parent.mkdir(parents=True, exist_ok=True)
apply_migrations(settings.db_path)
with open_db(settings.db_path) as conn:
sql = (
"SELECT m.id, m.pov_summary FROM memories m "
"LEFT JOIN embeddings e ON e.memory_id = m.id "
"WHERE e.memory_id IS NULL "
"ORDER BY m.id"
)
if args.limit is not None:
sql += f" LIMIT {int(args.limit)}"
rows = conn.execute(sql).fetchall()
print(f"Found {len(rows)} memories needing embeddings.")
if args.dry_run:
return
indexed = 0
skipped = 0
for memory_id, text in rows:
result = await generate_embedding(
client=None, # pseudo path: no client needed
text=text or "",
)
if result.model == FALLBACK_EMBEDDING_MODEL:
print(f" Skipping memory_id={memory_id} (empty text)")
skipped += 1
continue
append_and_apply(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"model": result.model,
"dim": result.dim,
"vector": result.vector,
},
)
indexed += 1
print(f" Indexed memory_id={memory_id}")
print(f"Done. Indexed {indexed}, skipped {skipped}.")
if __name__ == "__main__":
asyncio.run(main())
+141
View File
@@ -0,0 +1,141 @@
from __future__ import annotations
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
import chat.state.branches # registers handlers
from chat.state.branches import active_branch, get_branch, list_branches
def test_main_branch_bootstrapped_by_migration(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
active = active_branch(conn)
assert active is not None
assert active["name"] == "main"
assert active["is_active"] is True
assert active["origin_event_id"] == 0
assert active["head_event_id"] == 0
def test_branch_created_inserts_row(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(
conn,
kind="branch_created",
payload={
"name": "experiment",
"origin_event_id": 42,
"chat_id": "chat_a",
},
)
project(conn)
b = get_branch(conn, "experiment")
assert b is not None
assert b["name"] == "experiment"
assert b["origin_event_id"] == 42
# head defaults to origin when not specified
assert b["head_event_id"] == 42
assert b["chat_id"] == "chat_a"
assert b["is_active"] is False
# main remains active
active = active_branch(conn)
assert active is not None
assert active["name"] == "main"
def test_branch_switched_atomic(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(
conn,
kind="branch_created",
payload={
"name": "experiment",
"origin_event_id": 5,
"chat_id": "chat_a",
},
)
append_event(
conn,
kind="branch_switched",
payload={"name": "experiment"},
)
project(conn)
active = active_branch(conn)
assert active is not None
assert active["name"] == "experiment"
main = get_branch(conn, "main")
assert main is not None
assert main["is_active"] is False
# switch back
append_event(
conn,
kind="branch_switched",
payload={"name": "main"},
)
project(conn)
active2 = active_branch(conn)
assert active2 is not None
assert active2["name"] == "main"
experiment = get_branch(conn, "experiment")
assert experiment is not None
assert experiment["is_active"] is False
def test_branch_head_updated_changes_head(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(
conn,
kind="branch_created",
payload={
"name": "experiment",
"origin_event_id": 10,
"head_event_id": 10,
"chat_id": "chat_a",
},
)
append_event(
conn,
kind="branch_head_updated",
payload={"name": "experiment", "head_event_id": 20},
)
project(conn)
b = get_branch(conn, "experiment")
assert b is not None
assert b["head_event_id"] == 20
def test_list_branches_returns_all(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(
conn,
kind="branch_created",
payload={
"name": "experiment",
"origin_event_id": 1,
"chat_id": "chat_a",
},
)
project(conn)
names = [b["name"] for b in list_branches(conn)]
assert "main" in names
assert "experiment" in names
+131
View File
@@ -0,0 +1,131 @@
"""Tests for the branching service (T94, Phase 4)."""
from __future__ import annotations
import pytest
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_and_apply
import chat.state.branches # noqa: F401 registers handlers
from chat.services.branching import (
branch_from_event,
list_branches_with_metadata,
switch_active_branch,
)
from chat.state.branches import active_branch, get_branch
def _seed_event(conn) -> int:
"""Append a benign event so we have a real event_log row to fork from.
``user_turn`` is a transcript-only kind with no registered projector
handler, so ``append_and_apply`` is a clean no-op on the projector
side regardless of what other handlers are imported by the suite.
"""
return append_and_apply(
conn,
kind="user_turn",
payload={"chat_id": "c1", "text": "hi"},
)
def test_branch_from_event_creates_branch_via_event(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
seed_id = _seed_event(conn)
new_id = branch_from_event(
conn,
name="experiment",
origin_event_id=seed_id,
chat_id="c1",
)
assert isinstance(new_id, int) and new_id > 0
b = get_branch(conn, "experiment")
assert b is not None
assert b["id"] == new_id
assert b["origin_event_id"] == seed_id
assert b["head_event_id"] == seed_id
assert b["chat_id"] == "c1"
assert b["is_active"] is False
# branch_created event landed in event_log
row = conn.execute(
"SELECT COUNT(*) FROM event_log WHERE kind = 'branch_created'"
).fetchone()
assert row[0] == 1
def test_branch_from_event_duplicate_name_raises(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
seed_id = _seed_event(conn)
branch_from_event(conn, name="dup", origin_event_id=seed_id)
with pytest.raises(ValueError, match="already exists"):
branch_from_event(conn, name="dup", origin_event_id=seed_id)
def test_branch_from_event_invalid_origin_raises(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
with pytest.raises(ValueError, match="does not exist"):
branch_from_event(conn, name="ghost", origin_event_id=99999)
def test_switch_active_branch_changes_active(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
seed_id = _seed_event(conn)
branch_from_event(conn, name="experiment", origin_event_id=seed_id)
switch_active_branch(conn, name="experiment")
active = active_branch(conn)
assert active is not None
assert active["name"] == "experiment"
# Switch back to main.
switch_active_branch(conn, name="main")
active2 = active_branch(conn)
assert active2 is not None
assert active2["name"] == "main"
def test_switch_active_branch_unknown_name_raises(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
with pytest.raises(ValueError, match="does not exist"):
switch_active_branch(conn, name="nope")
def test_list_branches_with_metadata_includes_event_count(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
# Seed enough events to cover origin=10 and head=15.
for _ in range(15):
_seed_event(conn)
# Create the branch at origin=10, then bump its head to 15.
branch_from_event(conn, name="exp", origin_event_id=10)
append_and_apply(
conn,
kind="branch_head_updated",
payload={"name": "exp", "head_event_id": 15},
)
rows = {b["name"]: b for b in list_branches_with_metadata(conn)}
# main: bootstrap state — origin=0, head=0 — event_count == 0.
assert rows["main"]["event_count"] == 0
# exp: origin=10, head=15 — event_count == 6 (inclusive).
assert rows["exp"]["origin_event_id"] == 10
assert rows["exp"]["head_event_id"] == 15
assert rows["exp"]["event_count"] == 6
+155
View File
@@ -0,0 +1,155 @@
"""T93 (Phase 4): cross-chat FTS5 search across all owners and chats.
Verifies that ``chat.services.cross_chat_search.search_all_memories``:
* surfaces matches across multiple owner_ids (the per-owner restriction
used by ``state.memory.search_memories`` is intentionally absent),
* applies no witness filter (admin/power-user surface),
* orders results by FTS5 BM25 rank (lower = stronger match, surfaced
first), and
* honours the ``k`` LIMIT and the empty-query fast-path.
"""
from __future__ import annotations
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.services.cross_chat_search import search_all_memories
import chat.state.memory # noqa: F401 (registers memory_written handler)
def _seed(db, *, memory_specs):
"""Apply migrations + project a list of memory_written events."""
apply_migrations(db)
with open_db(db) as conn:
for spec in memory_specs:
payload = {
"owner_id": spec.get("owner_id", "bot_a"),
"chat_id": spec.get("chat_id", "chat_bot_a"),
"pov_summary": spec["pov_summary"],
"witness_you": spec.get("witness_you", 1),
"witness_host": spec.get("witness_host", 1),
"witness_guest": spec.get("witness_guest", 0),
"source": "direct",
"reliability": 1.0,
"significance": spec.get("significance", 1),
"pinned": 0,
"auto_pinned": 0,
}
append_event(conn, kind="memory_written", payload=payload)
project(conn)
def test_search_all_memories_returns_matches_across_owners(tmp_path):
"""Cross-owner: a single query must surface memories from every owner.
The per-owner ``owner_id = ?`` predicate that ``search_memories`` uses
is intentionally absent here, so a "rabbit" memory under ``bot_a`` and
one under ``bot_b`` should both come back from a single call.
"""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
{
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": "the rabbit darted into the brambles",
},
{
"owner_id": "bot_b",
"chat_id": "chat_bot_b",
"pov_summary": "a white rabbit watched from the hedge",
},
# Distractor: must not appear for "rabbit".
{
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": "the kettle whistled",
},
],
)
with open_db(db) as conn:
out = search_all_memories(conn, query="rabbit")
owners = {row["owner_id"] for row in out}
assert owners == {"bot_a", "bot_b"}
assert len(out) == 2
# Returned shape contract.
for row in out:
assert set(row.keys()) >= {
"memory_id",
"owner_id",
"chat_id",
"scene_id",
"pov_summary",
"significance",
"ts",
"fts_rank",
}
def test_search_all_memories_orders_by_fts_rank(tmp_path):
"""Stronger BM25 match must come first (rank ASC = lower is better)."""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
# Single occurrence -> weaker BM25 score.
{
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": "a rabbit appeared",
},
# Triple occurrence in a short row -> stronger BM25 score.
{
"owner_id": "bot_b",
"chat_id": "chat_bot_b",
"pov_summary": "rabbit rabbit rabbit",
},
],
)
with open_db(db) as conn:
out = search_all_memories(conn, query="rabbit", k=5)
assert len(out) == 2
# Stronger match first; fts_rank monotonically non-decreasing
# (lower-is-better, so ASC).
assert out[0]["pov_summary"] == "rabbit rabbit rabbit"
assert out[0]["fts_rank"] <= out[1]["fts_rank"]
def test_search_all_memories_respects_k_limit(tmp_path):
"""LIMIT ? must cap result count even when more matches exist."""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
{
"owner_id": f"bot_{i}",
"chat_id": f"chat_{i}",
"pov_summary": f"rabbit sighting number {i}",
}
for i in range(10)
],
)
with open_db(db) as conn:
out = search_all_memories(conn, query="rabbit", k=3)
assert len(out) == 3
def test_search_all_memories_empty_query_returns_empty(tmp_path):
"""Empty / whitespace-only query must short-circuit to []."""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
{
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": "the rabbit darted into the brambles",
},
],
)
with open_db(db) as conn:
assert search_all_memories(conn, query="") == []
assert search_all_memories(conn, query=" ") == []
+248
View File
@@ -0,0 +1,248 @@
"""Tests for Task 95 — delete-impact computation service (Phase 4).
`compute_delete_impact` walks event_log forward from a target event_id and
produces an :class:`ImpactReport` describing what would be removed if
rewind-to-target were invoked. It is a pure preview no database mutation.
T98's drawer surgical-delete UI uses this to render an "are you sure?"
modal before invoking the actual rewind path.
"""
from __future__ import annotations
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.services.delete_impact import compute_delete_impact
def _seed_chat(conn) -> tuple[int, int]:
"""Append minimal bot + chat events; return their event ids."""
bot_id = append_event(
conn,
kind="bot_authored",
payload={
"id": "bot_a",
"name": "BotA",
"persona": "...",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "",
},
)
chat_id = append_event(
conn,
kind="chat_created",
payload={
"id": "chat_bot_a",
"host_bot_id": "bot_a",
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
return bot_id, chat_id
def test_impact_for_simple_turn_lists_memory_and_edges(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
_seed_chat(conn)
user_id = append_event(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "hey there friend",
"segments": [],
},
)
append_event(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "Hi! Good to see you.",
"truncated": False,
"user_turn_id": user_id,
},
)
append_event(
conn,
kind="memory_written",
payload={
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": "You greeted me warmly today.",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"source": "turn",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
},
)
append_event(
conn,
kind="edge_update",
payload={
"source_id": "you",
"target_id": "bot_a",
"affinity_delta": 0.1,
},
)
report = compute_delete_impact(conn, target_event_id=user_id)
assert report.target_event_id == user_id
kinds = [item.kind for item in report.cascading]
# Walk from user_turn forward — user_turn, assistant_turn,
# memory_written, edge_update should all be in scope, in order.
assert kinds == [
"user_turn",
"assistant_turn",
"memory_written",
"edge_update",
]
# Memory description includes the pov_summary excerpt.
mem_item = report.cascading[2]
assert "memory:" in mem_item.description
assert "greeted" in mem_item.description
# Edge description includes both endpoints.
edge_item = report.cascading[3]
assert "you" in edge_item.description
assert "bot_a" in edge_item.description
assert edge_item.target_id == "you->bot_a"
# Notes mentions total count.
assert any("4 events" in n for n in report.notes)
def test_impact_for_scene_opening_turn_warns_about_subsequent(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
_seed_chat(conn)
early_id = append_event(
conn,
kind="user_turn",
payload={"chat_id": "chat_bot_a", "prose": "the start", "segments": []},
)
append_event(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "ok",
"truncated": False,
"user_turn_id": early_id,
},
)
append_event(
conn,
kind="scene_closed",
payload={
"scene_id": 1,
"closed_at": "2026-04-26T21:00:00+00:00",
"significance": 2,
},
)
report = compute_delete_impact(conn, target_event_id=early_id)
# Scene-close warning fires when one is in scope.
assert any("scene close" in n.lower() for n in report.notes)
# The scene_closed event also appears as a cascading item.
assert any(item.kind == "scene_closed" for item in report.cascading)
def test_impact_for_missing_event_returns_empty_with_note(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
_seed_chat(conn)
report = compute_delete_impact(conn, target_event_id=999_999)
assert report.cascading == []
assert any("not found" in n for n in report.notes)
def test_impact_does_not_mutate_database(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
_seed_chat(conn)
user_id = append_event(
conn,
kind="user_turn",
payload={"chat_id": "chat_bot_a", "prose": "hi", "segments": []},
)
append_event(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "hello",
"truncated": False,
"user_turn_id": user_id,
},
)
# Snapshot all event_log rows as a tuple-of-tuples.
before = conn.execute(
"SELECT id, branch_id, ts, kind, payload_json, superseded_by, "
"hidden FROM event_log ORDER BY id"
).fetchall()
compute_delete_impact(conn, target_event_id=user_id)
after = conn.execute(
"SELECT id, branch_id, ts, kind, payload_json, superseded_by, "
"hidden FROM event_log ORDER BY id"
).fetchall()
# Byte-identical: nothing inserted, deleted, or updated.
assert before == after
def test_impact_includes_regenerated_from_warning(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
_seed_chat(conn)
original_id = append_event(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "first try",
"truncated": False,
"user_turn_id": 0,
},
)
regen_id = append_event(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "second try",
"truncated": False,
"user_turn_id": 0,
"regenerated_from": original_id,
},
)
report = compute_delete_impact(conn, target_event_id=regen_id)
# The regenerated_from note carries the original event id so the user
# knows the original turn isn't lost.
assert any("regenerated from" in n for n in report.notes)
assert any(str(original_id) in n for n in report.notes)
+523
View File
@@ -0,0 +1,523 @@
"""T98 (Phase 4): drawer phase-4 bundle.
Five sub-features extending the chat drawer:
* T98.1 branching UI (create / switch / from-turn).
* T98.2 significance-review panel (distribution + significance edits).
* T98.3 hide-from-view toggle (per-turn, via ``manual_edit`` projector
branch ``turn_hidden``).
* T98.4 surgical delete with cascade preview (preview modal +
rewind execution against a target turn).
* T98.5 remaining v1 edits (chat narrative_anchor + weather).
Tests follow the T59 pattern in ``tests/test_drawer_events_threads_skip.py``
a TestClient against the real FastAPI app with a per-test temp DB.
"""
from __future__ import annotations
from pathlib import Path
import pytest
from fastapi.testclient import TestClient
from chat.app import app
from chat.db.connection import open_db
from chat.eventlog.log import append_and_apply, append_event
from chat.eventlog.projector import project
@pytest.fixture
def client(tmp_path, monkeypatch):
cfg = tmp_path / "config.toml"
cfg.write_text('featherless_api_key = "test"\n')
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
db = tmp_path / "test.db"
monkeypatch.setenv("CHAT_DB_PATH", str(db))
with TestClient(app) as c:
if hasattr(app.state, "background_worker"):
app.state.background_worker.enabled = False
yield c
def _bot_payload(bot_id: str, name: str) -> dict:
return {
"id": bot_id,
"name": name,
"persona": "...",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "",
}
def _seed_chat(db: Path, *, with_scene: bool = True) -> int:
"""Seed a chat hosted by ``bot_a``; return the latest event id (chat_created)."""
with open_db(db) as conn:
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
append_event(
conn,
kind="you_authored",
payload={"name": "Me", "pronouns": "they/them", "persona": ""},
)
chat_event_id = append_event(
conn,
kind="chat_created",
payload={
"id": "chat_bot_a",
"host_bot_id": "bot_a",
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
if with_scene:
append_event(
conn,
kind="scene_opened",
payload={
"chat_id": "chat_bot_a",
"container_id": None,
"started_at": "2026-04-26T20:00:00+00:00",
"participants": ["you", "bot_a"],
},
)
project(conn)
return chat_event_id
# ---------------------------------------------------------------------------
# T98.1 — branching UI.
# ---------------------------------------------------------------------------
def test_t98_1_create_branch_emits_branch_created_and_renders(client, tmp_path):
db = tmp_path / "test.db"
seed_id = _seed_chat(db)
response = client.post(
"/chats/chat_bot_a/drawer/branch/create",
data={"name": "experiment_a", "origin_event_id": str(seed_id)},
)
assert response.status_code == 200
with open_db(db) as conn:
rows = conn.execute(
"SELECT COUNT(*) FROM event_log WHERE kind = 'branch_created'"
).fetchone()
assert rows[0] == 1
from chat.state.branches import get_branch
b = get_branch(conn, "experiment_a")
assert b is not None
assert b["origin_event_id"] == seed_id
assert b["chat_id"] == "chat_bot_a"
# Drawer partial lists the new branch.
body = response.text
assert "<h3>Branches</h3>" in body
assert "experiment_a" in body
def test_t98_1_switch_branch_marks_active_and_unknown_400s(client, tmp_path):
db = tmp_path / "test.db"
seed_id = _seed_chat(db)
# Create branch directly via the service so this test focuses on switch.
with open_db(db) as conn:
from chat.services.branching import branch_from_event
branch_from_event(
conn, name="experiment_b", origin_event_id=seed_id, chat_id="chat_bot_a"
)
response = client.post(
"/chats/chat_bot_a/drawer/branch/switch",
data={"name": "experiment_b"},
)
assert response.status_code == 200
with open_db(db) as conn:
from chat.state.branches import active_branch
active = active_branch(conn)
assert active is not None
assert active["name"] == "experiment_b"
# Unknown branch -> 400.
bad = client.post(
"/chats/chat_bot_a/drawer/branch/switch",
data={"name": "ghost_branch"},
)
assert bad.status_code == 400
def test_t98_1_branch_from_turn_emits_branch_created(client, tmp_path):
db = tmp_path / "test.db"
seed_id = _seed_chat(db)
# Append an extra turn so we can branch from it specifically.
with open_db(db) as conn:
turn_id = append_event(
conn,
kind="user_turn",
payload={"chat_id": "chat_bot_a", "prose": "hi", "segments": []},
)
response = client.post(
f"/chats/chat_bot_a/drawer/branch/from-turn/{turn_id}",
data={"name": "fork_at_turn"},
)
assert response.status_code == 200
with open_db(db) as conn:
from chat.state.branches import get_branch
b = get_branch(conn, "fork_at_turn")
assert b is not None
assert b["origin_event_id"] == turn_id
assert b["chat_id"] == "chat_bot_a"
# Duplicate name -> 400 from service ValueError.
dup = client.post(
f"/chats/chat_bot_a/drawer/branch/from-turn/{turn_id}",
data={"name": "fork_at_turn"},
)
assert dup.status_code == 400
assert seed_id < turn_id # sanity: turn is after chat_created
# ---------------------------------------------------------------------------
# T98.2 — significance review panel.
# ---------------------------------------------------------------------------
def _seed_memories_for_significance(db: Path) -> list[int]:
"""Seed three memories with significance levels 0, 1, 2. Returns ids.
Uses ``append_and_apply`` (vs ``append_event`` + a final ``project``)
so each row is applied exactly once the chat row was already
materialised by ``_seed_chat`` and a re-projection would conflict
on ``chats.id`` UNIQUE.
"""
ids: list[int] = []
with open_db(db) as conn:
for sig in (0, 1, 2):
append_and_apply(
conn,
kind="memory_written",
payload={
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": f"memory at significance {sig}",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"significance": sig,
},
)
rows = conn.execute(
"SELECT id FROM memories WHERE chat_id = 'chat_bot_a' "
"ORDER BY id ASC"
).fetchall()
ids = [int(r[0]) for r in rows]
return ids
def test_t98_2_distribution_renders_per_significance_bucket(client, tmp_path):
db = tmp_path / "test.db"
_seed_chat(db)
_seed_memories_for_significance(db)
response = client.get("/chats/chat_bot_a/drawer")
assert response.status_code == 200
body = response.text
# Section heading + bar entries for each significance level.
assert "<h3>Significance review</h3>" in body
# All four buckets appear by their canonical label even when count=0.
assert ">★★ (3)<" in body or "(3)" in body
# The distribution markup names each level explicitly.
for level in (0, 1, 2, 3):
assert f"sig-bar sig-{level}" in body
# Three seeded memories (sigs 0, 1, 2) — each has a count = 1 bar.
# We don't pin exact text formatting, just verify the per-level bars
# are present.
def test_t98_2_edit_significance_via_existing_route_lands_manual_edit(
client, tmp_path
):
db = tmp_path / "test.db"
_seed_chat(db)
ids = _seed_memories_for_significance(db)
target_id = ids[0] # initially significance=0
response = client.post(
f"/chats/chat_bot_a/drawer/memory/{target_id}/significance",
data={"significance": "3"},
)
assert response.status_code == 200
with open_db(db) as conn:
# Significance updated in the projected table.
row = conn.execute(
"SELECT significance FROM memories WHERE id = ?", (target_id,)
).fetchone()
assert int(row[0]) == 3
# manual_edit landed in the event log with the prior snapshot.
import json as _json
log_rows = conn.execute(
"SELECT payload_json FROM event_log "
"WHERE kind = 'manual_edit' ORDER BY id DESC LIMIT 1"
).fetchone()
payload = _json.loads(log_rows[0])
assert payload["target_kind"] == "memory_significance"
assert int(payload["target_id"]) == target_id
assert payload["prior_value"] == 0
assert payload["new_value"] == 3
# ---------------------------------------------------------------------------
# T98.3 — hide-from-view toggle.
# ---------------------------------------------------------------------------
def _seed_turns(db: Path) -> tuple[int, int]:
"""Append one user_turn + one assistant_turn; return their event ids."""
with open_db(db) as conn:
user_id = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "How are you doing today?",
"segments": [],
},
)
bot_id = append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "Quite well, thanks for asking!",
"truncated": False,
"user_turn_id": user_id,
},
)
return user_id, bot_id
def test_t98_3_hide_turn_flips_event_log_hidden_via_manual_edit(
client, tmp_path
):
db = tmp_path / "test.db"
_seed_chat(db)
user_id, bot_id = _seed_turns(db)
response = client.post(
f"/chats/chat_bot_a/drawer/turn/hide/{user_id}",
data={"hidden": "1"},
)
assert response.status_code == 200
with open_db(db) as conn:
# event_log.hidden flipped to 1.
row = conn.execute(
"SELECT hidden FROM event_log WHERE id = ?", (user_id,)
).fetchone()
assert int(row[0]) == 1
# manual_edit landed with the prior snapshot.
import json as _json
log = conn.execute(
"SELECT payload_json FROM event_log "
"WHERE kind = 'manual_edit' ORDER BY id DESC LIMIT 1"
).fetchone()
payload = _json.loads(log[0])
assert payload["target_kind"] == "turn_hidden"
assert int(payload["target_id"]) == user_id
assert payload["prior_value"] == {"hidden": 0}
assert payload["new_value"] == {"hidden": 1}
def test_t98_3_hidden_turn_disappears_from_read_recent_dialogue(
client, tmp_path
):
"""Hiding a turn must drop it from the prompt-window read.
``read_recent_dialogue`` (chat.services.turn_common) filters
``hidden = 0`` server-side, so flipping the flag via the drawer
route must surface immediately.
"""
db = tmp_path / "test.db"
_seed_chat(db)
user_id, bot_id = _seed_turns(db)
# Sanity baseline — both turns visible before the hide.
with open_db(db) as conn:
from chat.services.turn_common import read_recent_dialogue
before = read_recent_dialogue(conn, "chat_bot_a", limit=10)
before_ids = [t["event_id"] for t in before]
assert user_id in before_ids
assert bot_id in before_ids
# Hide the user turn via the drawer route.
response = client.post(
f"/chats/chat_bot_a/drawer/turn/hide/{user_id}",
data={"hidden": "1"},
)
assert response.status_code == 200
with open_db(db) as conn:
from chat.services.turn_common import read_recent_dialogue
after = read_recent_dialogue(conn, "chat_bot_a", limit=10)
after_ids = [t["event_id"] for t in after]
assert user_id not in after_ids
assert bot_id in after_ids # the unhidden bot turn still surfaces
# ---------------------------------------------------------------------------
# T98.4 — surgical delete with cascade preview.
# ---------------------------------------------------------------------------
def test_t98_4_delete_preview_returns_impact_report_html(client, tmp_path):
db = tmp_path / "test.db"
_seed_chat(db)
user_id, bot_id = _seed_turns(db)
response = client.get(
f"/chats/chat_bot_a/drawer/turn/delete-preview/{user_id}"
)
assert response.status_code == 200
body = response.text
# Modal markup with the event id and the cascade list.
assert "delete-impact-modal" in body
assert f"Delete event {user_id}?" in body
assert "delete-impact-cascade" in body
# Both turns ride along in the cascade — user_turn at user_id, then
# the assistant_turn at bot_id (>= user_id).
assert "user_turn" in body
assert "assistant_turn" in body
# Confirm-form posts to the delete route.
assert f"/drawer/turn/delete/{user_id}" in body
def test_t98_4_delete_invokes_rewind_and_drops_cascade(client, tmp_path):
db = tmp_path / "test.db"
_seed_chat(db)
user_id, bot_id = _seed_turns(db)
# Append a third turn after the assistant_turn so we can verify the
# cascade catches everything from user_id forward.
with open_db(db) as conn:
extra_id = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "follow-up",
"segments": [],
},
)
# Sanity: all three turn rows exist.
with open_db(db) as conn:
turn_count = conn.execute(
"SELECT COUNT(*) FROM event_log "
"WHERE kind IN ('user_turn', 'assistant_turn')"
).fetchone()[0]
assert turn_count == 3
# Delete from user_id forward.
response = client.post(f"/chats/chat_bot_a/drawer/turn/delete/{user_id}")
assert response.status_code == 200
# All three turns are gone — the rewind truncated the log past
# user_id - 1, removing user_id, bot_id, and extra_id.
with open_db(db) as conn:
turn_count = conn.execute(
"SELECT COUNT(*) FROM event_log "
"WHERE kind IN ('user_turn', 'assistant_turn')"
).fetchone()[0]
assert turn_count == 0
for ev_id in (user_id, bot_id, extra_id):
row = conn.execute(
"SELECT 1 FROM event_log WHERE id = ?", (ev_id,)
).fetchone()
assert row is None, f"event {ev_id} should have been deleted"
# ---------------------------------------------------------------------------
# T98.5 — remaining v1 edits (chat narrative anchor + weather).
# ---------------------------------------------------------------------------
def test_t98_5_edit_chat_narrative_anchor_emits_manual_edit(client, tmp_path):
db = tmp_path / "test.db"
_seed_chat(db)
response = client.post(
"/chats/chat_bot_a/drawer/chat/narrative-anchor",
data={"new_value": "Late evening, after dinner"},
)
assert response.status_code == 200
with open_db(db) as conn:
row = conn.execute(
"SELECT narrative_anchor FROM chat_state WHERE chat_id = ?",
("chat_bot_a",),
).fetchone()
assert row[0] == "Late evening, after dinner"
import json as _json
log = conn.execute(
"SELECT payload_json FROM event_log "
"WHERE kind = 'manual_edit' ORDER BY id DESC LIMIT 1"
).fetchone()
payload = _json.loads(log[0])
assert payload["target_kind"] == "chat_narrative_anchor"
assert payload["target_id"] == "chat_bot_a"
assert payload["prior_value"] == "Day 1"
assert payload["new_value"] == "Late evening, after dinner"
def test_t98_5_edit_chat_weather_emits_manual_edit(client, tmp_path):
db = tmp_path / "test.db"
_seed_chat(db)
response = client.post(
"/chats/chat_bot_a/drawer/chat/weather",
data={"new_value": "thunderstorm rolling in"},
)
assert response.status_code == 200
with open_db(db) as conn:
row = conn.execute(
"SELECT weather FROM chat_state WHERE chat_id = ?",
("chat_bot_a",),
).fetchone()
assert row[0] == "thunderstorm rolling in"
import json as _json
log = conn.execute(
"SELECT payload_json FROM event_log "
"WHERE kind = 'manual_edit' ORDER BY id DESC LIMIT 1"
).fetchone()
payload = _json.loads(log[0])
assert payload["target_kind"] == "chat_weather"
assert payload["target_id"] == "chat_bot_a"
assert payload["prior_value"] == ""
assert payload["new_value"] == "thunderstorm rolling in"
+185
View File
@@ -0,0 +1,185 @@
"""Embedding worker (T97, Phase 4).
The worker drains a queue of EmbeddingJobs and emits ``embedding_indexed``
events. Mirrors test_significance.py's BackgroundWorker tests in shape:
seed a memory, enqueue jobs, call ``stop()`` to drain via sentinel, then
assert on the projected ``embeddings`` table and the underlying event_log.
"""
from __future__ import annotations
from pathlib import Path
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.services.embedding_worker import EmbeddingJob, EmbeddingWorker
from chat.services.embeddings import (
DEFAULT_EMBEDDING_MODEL,
EmbeddingResult,
FALLBACK_EMBEDDING_MODEL,
)
# Trigger handler registration for projection.
import chat.state.embeddings # noqa: F401
import chat.state.entities # noqa: F401
import chat.state.memory # noqa: F401
import chat.state.world # noqa: F401
def _seed_memories(db_path: Path, count: int) -> list[int]:
"""Seed ``count`` memory rows for ``bot_a`` and return their ids."""
with open_db(db_path) as conn:
append_event(
conn,
kind="bot_authored",
payload={
"id": "bot_a",
"name": "BotA",
"persona": "...",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "",
},
)
append_event(
conn,
kind="chat_created",
payload={
"id": "chat_bot_a",
"host_bot_id": "bot_a",
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
for i in range(count):
append_event(
conn,
kind="memory_written",
payload={
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"pov_summary": f"memory text {i}",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"source": "direct",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
},
)
project(conn)
return [
r[0]
for r in conn.execute(
"SELECT id FROM memories WHERE owner_id = 'bot_a' ORDER BY id"
).fetchall()
]
async def test_worker_drains_jobs_and_emits_indexed_events(tmp_path):
"""Three jobs in -> three ``embedding_indexed`` events out, all
projected into the ``embeddings`` table."""
db = tmp_path / "t.db"
apply_migrations(db)
memory_ids = _seed_memories(db, count=3)
worker = EmbeddingWorker(
conn_factory=lambda: open_db(db),
client=None, # pseudo path — no client needed
)
await worker.start()
for mid in memory_ids:
worker.enqueue(EmbeddingJob(memory_id=mid, text=f"text-{mid}"))
await worker.stop()
with open_db(db) as conn:
# Three embedding_indexed events landed.
cur = conn.execute(
"SELECT COUNT(*) FROM event_log WHERE kind = 'embedding_indexed'"
)
assert cur.fetchone()[0] == 3
# Three rows in the embeddings table — one per memory.
cur = conn.execute(
"SELECT memory_id, model, dim FROM embeddings ORDER BY memory_id"
)
rows = cur.fetchall()
assert len(rows) == 3
for (mid, model, dim), expected_mid in zip(rows, memory_ids):
assert mid == expected_mid
assert model == DEFAULT_EMBEDDING_MODEL
assert dim > 0
async def test_worker_skips_fallback_results(tmp_path, monkeypatch):
"""A fallback EmbeddingResult must NOT produce an embedding_indexed
event backfill can retry later when a real embedding is available."""
db = tmp_path / "t.db"
apply_migrations(db)
memory_ids = _seed_memories(db, count=1)
async def _fake_generate(client, *, text, model, dim, timeout_s=30.0):
return EmbeddingResult(
vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
)
# Patch the symbol the worker resolved at import time.
import chat.services.embedding_worker as worker_mod
monkeypatch.setattr(worker_mod, "generate_embedding", _fake_generate)
worker = EmbeddingWorker(
conn_factory=lambda: open_db(db),
client=None,
)
await worker.start()
worker.enqueue(EmbeddingJob(memory_id=memory_ids[0], text="anything"))
await worker.stop()
with open_db(db) as conn:
cur = conn.execute(
"SELECT COUNT(*) FROM event_log WHERE kind = 'embedding_indexed'"
)
assert cur.fetchone()[0] == 0
cur = conn.execute("SELECT COUNT(*) FROM embeddings")
assert cur.fetchone()[0] == 0
async def test_worker_handles_concurrent_jobs_serially(tmp_path):
"""Five jobs queued back-to-back must process in FIFO order — the
single-task design respects the Featherless 2-conn cap (and keeps
event_log ordering deterministic)."""
db = tmp_path / "t.db"
apply_migrations(db)
memory_ids = _seed_memories(db, count=5)
worker = EmbeddingWorker(
conn_factory=lambda: open_db(db),
client=None,
)
await worker.start()
# Enqueue all five before yielding to the loop — exercises the queue
# rather than a one-at-a-time drain.
for mid in memory_ids:
worker.enqueue(EmbeddingJob(memory_id=mid, text=f"text-{mid}"))
await worker.stop()
with open_db(db) as conn:
# Events landed in enqueue order (FIFO).
cur = conn.execute(
"SELECT json_extract(payload_json, '$.memory_id') "
"FROM event_log WHERE kind = 'embedding_indexed' "
"ORDER BY id"
)
seen = [r[0] for r in cur.fetchall()]
assert seen == memory_ids
# All five embeddings projected.
cur = conn.execute("SELECT COUNT(*) FROM embeddings")
assert cur.fetchone()[0] == 5
+91
View File
@@ -0,0 +1,91 @@
"""Tests for the embedding generation service (T91, Phase 4).
Phase 4's first cut ships a deterministic local pseudo-embedding so the
vector retrieval pipeline can land without an external embeddings API
or a heavy local model dependency. These tests pin the contract:
* the result has the right shape (vector length, ``dim`` metadata),
* the default ``model`` string is reported back unchanged,
* output is byte-identical for the same input (deterministic),
* distinct inputs produce distinct vectors (so cosine actually
discriminates),
* empty / whitespace-only input collapses to the ``"fallback"`` sentinel
with a zero vector callers detect this and skip indexing,
* the vector is unit-normalized so cosine similarity behaves.
The pseudo path doesn't touch the LLMClient, so we pass an empty
``MockLLMClient`` any accidental call into it would raise
``IndexError`` and surface as a regression.
"""
from __future__ import annotations
import math
import pytest
from chat.llm.mock import MockLLMClient
from chat.services.embeddings import (
DEFAULT_EMBEDDING_DIM,
DEFAULT_EMBEDDING_MODEL,
FALLBACK_EMBEDDING_MODEL,
EmbeddingResult,
generate_embedding,
)
def _client() -> MockLLMClient:
# Pseudo path never calls the client — empty canned list ensures any
# accidental call raises and surfaces the regression loudly.
return MockLLMClient(canned=[])
@pytest.mark.asyncio
async def test_generate_embedding_returns_vector_of_correct_dim():
result = await generate_embedding(_client(), text="hello")
assert isinstance(result, EmbeddingResult)
assert isinstance(result.vector, list)
assert len(result.vector) == DEFAULT_EMBEDDING_DIM == 384
assert result.dim == 384
assert all(isinstance(x, float) for x in result.vector)
@pytest.mark.asyncio
async def test_generate_embedding_returns_correct_model_metadata():
result = await generate_embedding(_client(), text="hello")
assert result.model == DEFAULT_EMBEDDING_MODEL == "pseudo-sha256-384"
@pytest.mark.asyncio
async def test_generate_embedding_is_deterministic():
a = await generate_embedding(_client(), text="hello world")
b = await generate_embedding(_client(), text="hello world")
assert a.vector == b.vector
@pytest.mark.asyncio
async def test_generate_embedding_distinct_text_produces_distinct_vectors():
a = await generate_embedding(_client(), text="hello world")
b = await generate_embedding(_client(), text="totally different content")
assert a.vector != b.vector
# Sanity-check cosine similarity — both vectors are unit-normalized,
# so this reduces to a plain dot product.
cosine = sum(x * y for x, y in zip(a.vector, b.vector))
assert cosine < 0.99
@pytest.mark.asyncio
async def test_generate_embedding_empty_text_returns_fallback():
for empty in ("", " ", "\n\t"):
result = await generate_embedding(_client(), text=empty)
assert result.model == FALLBACK_EMBEDDING_MODEL == "fallback"
assert result.dim == DEFAULT_EMBEDDING_DIM
assert len(result.vector) == DEFAULT_EMBEDDING_DIM
assert all(x == 0.0 for x in result.vector)
@pytest.mark.asyncio
async def test_generate_embedding_unit_normalized():
result = await generate_embedding(_client(), text="some non-empty text")
norm_sq = sum(x * x for x in result.vector)
assert math.isclose(norm_sq, 1.0, abs_tol=1e-6)
+218
View File
@@ -0,0 +1,218 @@
from __future__ import annotations
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
import chat.state.memory # registers memory_written handler
import chat.state.embeddings # registers embedding handlers
from chat.state.embeddings import get_embedding, list_embeddings_for_owner
def _base_memory(**overrides):
payload = {
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"scene_id": 1,
"pov_summary": "She laughed at his joke about owls.",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"chat_clock_at": "2026-04-26T10:00:00",
"source": "direct",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
}
payload.update(overrides)
return payload
def _vec(n: int = 384, base: float = 0.1) -> list[float]:
"""Return a length-n float vector with predictable values for assertions."""
return [round(base + i * 0.001, 6) for i in range(n)]
def test_embedding_indexed_inserts_row(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(conn, kind="memory_written", payload=_base_memory())
project(conn)
memory_id = conn.execute("SELECT id FROM memories").fetchone()[0]
vector = _vec(384, base=0.1)
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"vector": vector,
"model": "test-model",
"dim": 384,
},
)
project(conn)
emb = get_embedding(conn, memory_id)
assert emb is not None
assert emb["memory_id"] == memory_id
assert emb["vector"] == vector
assert emb["model"] == "test-model"
assert emb["dim"] == 384
assert emb["indexed_at"] is not None
def test_embedding_deindexed_removes_row(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(conn, kind="memory_written", payload=_base_memory())
project(conn)
memory_id = conn.execute("SELECT id FROM memories").fetchone()[0]
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"vector": _vec(),
"model": "test-model",
"dim": 384,
},
)
project(conn)
assert get_embedding(conn, memory_id) is not None
append_event(
conn,
kind="embedding_deindexed",
payload={"memory_id": memory_id},
)
project(conn)
assert get_embedding(conn, memory_id) is None
def test_embedding_indexed_replaces_existing(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
append_event(conn, kind="memory_written", payload=_base_memory())
project(conn)
memory_id = conn.execute("SELECT id FROM memories").fetchone()[0]
vec_a = _vec(384, base=0.1)
vec_b = _vec(384, base=0.5)
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"vector": vec_a,
"model": "test-model",
"dim": 384,
},
)
project(conn)
first = get_embedding(conn, memory_id)
assert first is not None
assert first["vector"] == vec_a
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"vector": vec_b,
"model": "test-model",
"dim": 384,
},
)
project(conn)
second = get_embedding(conn, memory_id)
assert second is not None
assert second["vector"] == vec_b
# Still exactly one row for this memory.
count = conn.execute(
"SELECT COUNT(*) FROM embeddings WHERE memory_id = ?", (memory_id,)
).fetchone()[0]
assert count == 1
def test_list_embeddings_for_owner_returns_joined_rows(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
# Two memories for bot_a, one for bot_b.
append_event(
conn,
kind="memory_written",
payload=_base_memory(
owner_id="bot_a",
pov_summary="Alpha memory.",
significance=2,
),
)
append_event(
conn,
kind="memory_written",
payload=_base_memory(
owner_id="bot_a",
pov_summary="Beta memory.",
significance=3,
),
)
append_event(
conn,
kind="memory_written",
payload=_base_memory(
owner_id="bot_b",
pov_summary="Gamma memory.",
significance=1,
),
)
project(conn)
rows = conn.execute(
"SELECT id, owner_id FROM memories ORDER BY id"
).fetchall()
# Index every memory with a distinct vector so we can check ordering.
for i, (mid, _owner) in enumerate(rows):
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": mid,
"vector": _vec(384, base=0.1 * (i + 1)),
"model": "test-model",
"dim": 384,
},
)
project(conn)
a_rows = list_embeddings_for_owner(conn, "bot_a")
assert len(a_rows) == 2
summaries = {r["pov_summary"] for r in a_rows}
assert summaries == {"Alpha memory.", "Beta memory."}
sigs = {r["significance"] for r in a_rows}
assert sigs == {2, 3}
for r in a_rows:
assert r["model"] == "test-model"
assert r["dim"] == 384
assert isinstance(r["vector"], list)
assert len(r["vector"]) == 384
assert r["witness_you"] == 1
assert r["witness_host"] == 1
assert r["witness_guest"] == 0
b_rows = list_embeddings_for_owner(conn, "bot_b")
assert len(b_rows) == 1
assert b_rows[0]["pov_summary"] == "Gamma memory."
def test_get_embedding_returns_none_when_missing(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
assert get_embedding(conn, 999) is None
+214
View File
@@ -16,6 +16,7 @@ from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.state.memory import search_memories
import chat.state.memory # noqa: F401 (registers memory_written handler)
import chat.state.embeddings # noqa: F401 (registers embedding_indexed handler)
def _seed(db, *, memory_specs):
@@ -159,3 +160,216 @@ def test_significance_bias_is_constant_module_level():
# Must be non-negative -- a negative bias would invert the desired
# "higher significance ranks higher" semantics.
assert SIGNIFICANCE_RANK_BIAS >= 0
# ---------------------------------------------------------------------------
# T96 (Phase 4): combined FTS + vector retrieval ranking via reciprocal-rank
# fusion. The fused path activates only when ``query_vector`` is provided —
# the no-vector path (above) is unchanged.
# ---------------------------------------------------------------------------
def _one_hot(dim: int, idx: int) -> list[float]:
v = [0.0] * dim
v[idx] = 1.0
return v
def _seed_memories_with_optional_embeddings(db, *, memory_specs):
"""Like ``_seed`` but also projects ``embedding_indexed`` events for any
spec carrying a ``vector`` key.
Memory rows are assigned ids in the order their ``memory_written`` events
were appended (the ``memories.id`` column is an autoincrementing primary
key), so we predict ``memory_id = i + 1`` per spec and append both kinds
of events back-to-back BEFORE projecting. Projecting only once keeps the
INSERT-based ``memory_written`` handler from duplicating rows.
"""
apply_migrations(db)
with open_db(db) as conn:
# First pass: append every memory_written event in order. The DB
# assigns autoincrementing ids 1..N matching the order of these
# events, so we can pair vectors to memories by index.
for spec in memory_specs:
payload = {
"owner_id": spec.get("owner_id", "bot_a"),
"chat_id": spec.get("chat_id", "chat_bot_a"),
"pov_summary": spec["pov_summary"],
"witness_you": spec.get("witness_you", 1),
"witness_host": spec.get("witness_host", 1),
"witness_guest": spec.get("witness_guest", 0),
"source": "direct",
"reliability": 1.0,
"significance": spec.get("significance", 1),
"pinned": 0,
"auto_pinned": 0,
}
append_event(conn, kind="memory_written", payload=payload)
# Second pass: append embedding_indexed events for any spec that
# supplied a vector, using the predicted memory id.
for i, spec in enumerate(memory_specs, start=1):
if "vector" not in spec:
continue
vec = spec["vector"]
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": i,
"vector": list(vec),
"model": "test-model",
"dim": len(vec),
},
)
# Single projection — avoids the memory_written handler INSERTing
# the same row twice on a re-projection.
project(conn)
def test_search_memories_without_query_vector_uses_fts_only(tmp_path):
"""Regression: omitting ``query_vector`` keeps the existing FTS-only path.
Identical seed to ``test_search_higher_significance_ranks_above_lower``
but pinned to the no-vector code path explicitly (no kwarg passed).
"""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
{"pov_summary": "small promise"},
{"pov_summary": "huge promise"},
{"pov_summary": "tiny promise", "significance": 3},
],
)
with open_db(db) as conn:
out = search_memories(conn, "bot_a", "host", "promise", k=3)
assert len(out) == 3
# The composite re-rank surfaces the high-significance row first.
assert out[0]["pov_summary"] == "tiny promise"
# Sanity: the row shape still carries ``fts_rank`` + ``composite_score``
# like the FTS-only path always has.
assert "fts_rank" in out[0]
assert "composite_score" in out[0]
def test_search_memories_with_query_vector_includes_vector_hits(tmp_path):
"""RRF fuses FTS hits with vector hits — both kinds surface in the result.
Memory 1 only matches FTS (keyword "rabbit", embedding far from query).
Memory 2 only matches the vector (embedding identical to query, no
keyword overlap). Memories 3-5 are unrelated. The fused top-K must
contain BOTH memory 1 and memory 2.
"""
db = tmp_path / "t.db"
dim = 8
# Query vector = one-hot at index 0. Memory 2 mirrors it exactly. The
# FTS-only memory (memory 1) has NO embedding so it cannot leak into
# the vector ranking; the filler memories (3-5) likewise have no
# embeddings, so the vector ranking returns memory 2 alone.
query_vec = _one_hot(dim, 0)
_seed_memories_with_optional_embeddings(
db,
memory_specs=[
# Memory 1: FTS-only match. No embedding indexed.
{"pov_summary": "rabbit hopped over the fence"},
# Memory 2: vector-only match. No keyword overlap with "rabbit".
{
"pov_summary": "completely unrelated narrative line",
"vector": _one_hot(dim, 0),
},
# Memories 3-5: filler, irrelevant to both channels.
{"pov_summary": "lighthouse keeper polished the lens"},
{"pov_summary": "they discussed cartography for hours"},
{"pov_summary": "she taught him semaphore signals"},
],
)
with open_db(db) as conn:
out = search_memories(
conn,
"bot_a",
"host",
"rabbit",
k=4,
query_vector=query_vec,
)
summaries = [r["pov_summary"] for r in out]
# FTS-only candidate (memory 1) made it through.
assert "rabbit hopped over the fence" in summaries
# Vector-only candidate (memory 2) also made it through despite
# having no keyword overlap with the query string.
assert "completely unrelated narrative line" in summaries
def test_search_memories_fusion_significance_bias_still_applies(tmp_path):
"""With two RRF-tied candidates, the higher-significance one ranks first.
Two memories share the keyword "promise" AND share an identical
embedding to the query so their FTS rank and vector rank are both
ties. RRF gives them the same fusion score. The Python-side
significance + recency boost must break the tie in favour of the
higher-significance memory.
"""
db = tmp_path / "t.db"
dim = 4
shared_vec = _one_hot(dim, 0)
_seed_memories_with_optional_embeddings(
db,
memory_specs=[
{
"pov_summary": "she made a promise",
"significance": 0,
"vector": list(shared_vec),
},
{
"pov_summary": "she made a promise",
"significance": 3,
"vector": list(shared_vec),
},
],
)
with open_db(db) as conn:
out = search_memories(
conn,
"bot_a",
"host",
"promise",
k=2,
query_vector=list(shared_vec),
)
assert len(out) == 2
# Higher significance breaks the RRF tie.
assert out[0]["significance"] == 3
assert out[1]["significance"] == 0
def test_search_memories_fusion_handles_empty_vector_results(tmp_path):
"""Vector path returning [] (no embeddings indexed) must not break FTS.
No ``embedding_indexed`` events are projected, so ``vector_search``
returns an empty list. The function should still return the FTS hits
as if ``query_vector`` had not been supplied.
"""
db = tmp_path / "t.db"
_seed(
db,
memory_specs=[
{"pov_summary": "the vault held an old promise"},
{"pov_summary": "another promise was kept that night"},
],
)
with open_db(db) as conn:
out = search_memories(
conn,
"bot_a",
"host",
"promise",
k=4,
query_vector=[0.0] * 384, # No embeddings exist for this owner.
)
# Both FTS hits still come back — no error from the empty vector path.
assert len(out) == 2
summaries = {r["pov_summary"] for r in out}
assert summaries == {
"the vault held an old promise",
"another promise was kept that night",
}
+57 -3
View File
@@ -22,7 +22,7 @@ from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.llm.mock import MockLLMClient
from chat.services.memory_write import record_turn_memory, record_turn_memory_for_present
from chat.services.memory_write import record_turn_memory_for_present
import chat.state.entities # noqa: F401 - register handlers
import chat.state.memory # noqa: F401
import chat.state.world # noqa: F401
@@ -64,14 +64,19 @@ def test_record_turn_memory_writes_event_and_projects(tmp_path):
apply_migrations(db)
_seed_minimal(db)
with open_db(db) as conn:
eid, mid = record_turn_memory(
# T90.3: legacy ``record_turn_memory`` was removed; the unified
# ``record_turn_memory_for_present`` with ``guest_bot_id=None``
# produces the same single-bot witness mask [1,1,0].
result = record_turn_memory_for_present(
conn,
chat_id="chat_bot_a",
host_bot_id="bot_a",
guest_bot_id=None,
narrative_text="BotA looks up. 'You're back late.'",
scene_id=None,
chat_clock_at="2026-04-26T20:00:00+00:00",
)
eid, mid = result["bot_a"]
assert eid > 0
assert mid is not None and mid > 0
@@ -111,12 +116,15 @@ def test_record_turn_memory_omits_optional_fields(tmp_path):
_seed_minimal(db)
with open_db(db) as conn:
# Call without scene_id/chat_clock_at — should default to None.
eid, mid = record_turn_memory(
# T90.3: migrated from legacy ``record_turn_memory``.
result = record_turn_memory_for_present(
conn,
chat_id="chat_bot_a",
host_bot_id="bot_a",
guest_bot_id=None,
narrative_text="A simple memory.",
)
eid, mid = result["bot_a"]
assert eid > 0
assert mid is not None and mid > 0
@@ -532,3 +540,49 @@ def test_record_turn_memory_you_present_false_requires_guest(tmp_path):
narrative_text="invalid",
you_present=False,
)
# ---------------------------------------------------------------------------
# T97: embedding-worker enqueue hook.
# ---------------------------------------------------------------------------
def test_record_turn_memory_enqueues_embedding_job(tmp_path):
"""When ``app.state.embedding_worker`` is wired, every per-witness
write enqueues an :class:`EmbeddingJob` carrying the freshly-projected
memory id and the narrative text. Two-bot turn -> two jobs."""
from types import SimpleNamespace
from chat.services.embedding_worker import EmbeddingJob
db = tmp_path / "t.db"
apply_migrations(db)
_seed_two_bots(db)
captured: list[EmbeddingJob] = []
class _StubWorker:
def enqueue(self, job: EmbeddingJob) -> None:
captured.append(job)
fake_app = SimpleNamespace(
state=SimpleNamespace(embedding_worker=_StubWorker())
)
with open_db(db) as conn:
result = record_turn_memory_for_present(
conn,
chat_id="chat_ab",
host_bot_id="bot_a",
guest_bot_id="bot_b",
narrative_text="Both bots witness this beat.",
app=fake_app,
)
# One job per witness — host first, then guest (matches result dict
# insertion order in record_turn_memory_for_present).
assert len(captured) == 2
expected_ids = {result["bot_a"][1], result["bot_b"][1]}
assert {job.memory_id for job in captured} == expected_ids
for job in captured:
assert job.text == "Both bots witness this beat."
+891
View File
@@ -0,0 +1,891 @@
"""Phase 4 cross-feature integration tests (T97 follow-up + T101).
Cross-feature flows for the Phase 4 retrieval + branching + drawer
features. Each test drives multiple Phase 4 surfaces end-to-end and
asserts both event_log and projected-state outcomes.
Test inventory:
* ``test_post_turn_embeddings_indexed_via_worker_hook`` (T97.5)
pins the production turn route's ``app=request.app`` plumbing so
the embedding worker actually receives jobs.
T101 additions (the "Phase 4 cross-feature integration" suite):
1. ``test_vector_retrieval_feedback_loop`` write a memory, drain
the embedding worker, assert the vector path retrieves it.
2. ``test_branch_diverge_main_intact`` create a branch from a
mid-log turn, switch, append more events, switch back and assert
the original log past the branch point is still present (Phase 4
branching is metadata-only no read-side filter yet).
3. ``test_surgical_delete_truncates_log_and_writes_snapshot``
compute impact, confirm via the drawer route, assert the log was
truncated and a pre-rewind snapshot landed on disk.
4. ``test_hide_then_unhide_round_trip_through_read_recent_dialogue``
flip ``hidden`` via the drawer route both directions and assert
``read_recent_dialogue`` honours the flag in real time.
5. ``test_cross_chat_search_surfaces_memories_in_three_chats``
write memories in 3 chats, hit ``/search?q=...`` and assert all
three appear.
The T97.5 test monkeypatches ``app.state.embedding_worker.enqueue`` to
record jobs (rather than draining the worker) because the bug it pins
is "did the call site pass ``app`` at all". T101 test 1 takes the
opposite tack: it drives the worker for real to verify the entire
write -> index -> retrieve loop.
"""
from __future__ import annotations
import json
from pathlib import Path
import pytest
from fastapi.testclient import TestClient
from chat.app import app
from chat.db.connection import open_db
from chat.eventlog.log import append_and_apply, append_event
from chat.eventlog.projector import project
from chat.llm.mock import MockLLMClient
def _zero_state() -> str:
return json.dumps(
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
)
def _override_llm(canned: list[str]) -> MockLLMClient:
from chat.web.kickoff import get_llm_client
mock = MockLLMClient(canned=list(canned))
app.dependency_overrides[get_llm_client] = lambda: mock
return mock
@pytest.fixture
def app_state_setup(tmp_path, monkeypatch):
cfg = tmp_path / "config.toml"
cfg.write_text('featherless_api_key = "test"\n')
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
db = tmp_path / "test.db"
monkeypatch.setenv("CHAT_DB_PATH", str(db))
with TestClient(app) as c:
# The background worker is disabled so the canned-response queue
# is consumed only by the request path. The embedding worker
# stays "started" but its loop won't observe the captured
# enqueues — we replace ``enqueue`` on the worker instance below.
app.state.background_worker.enabled = False
yield c
app.dependency_overrides.clear()
def _seed(db_path: Path) -> None:
"""Mirror of ``tests/test_turn_flow.py::_seed`` — single bot + chat
+ edge + activities so the prompt assembler has something to render.
"""
with open_db(db_path) as conn:
append_event(
conn,
kind="bot_authored",
payload={
"id": "bot_a",
"name": "BotA",
"persona": "thoughtful, observant",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "...",
},
)
append_event(
conn,
kind="chat_created",
payload={
"id": "chat_bot_a",
"host_bot_id": "bot_a",
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
append_event(
conn,
kind="edge_update",
payload={
"source_id": "bot_a",
"target_id": "you",
"chat_id": "chat_bot_a",
"knowledge_facts": ["coworker"],
},
)
for entity_id, verb in [("you", "talking"), ("bot_a", "listening")]:
append_event(
conn,
kind="activity_change",
payload={
"entity_id": entity_id,
"posture": "sitting",
"action": {
"verb": verb,
"interruptible": True,
"required_attention": "low",
"expected_duration": "ongoing",
},
"attention": "",
"holding": [],
"status": {},
},
)
project(conn)
def test_post_turn_embeddings_indexed_via_worker_hook(
app_state_setup, tmp_path
):
"""POST a turn; the route must pass ``app=request.app`` into
``record_turn_memory_for_present`` so the per-witness write enqueues
an :class:`EmbeddingJob` on ``app.state.embedding_worker``.
Without the T97.5 wiring this test fails: the call site previously
omitted ``app=`` and the helper's ``app is None`` branch silently
skipped every enqueue. We monkeypatch ``enqueue`` on the live
embedding worker (rather than draining the queue mid-request) so the
assertion does not depend on asyncio scheduling inside the
TestClient the bug is in the wiring, and the wiring is what we
pin. The drain path is covered separately in
:mod:`tests.test_embedding_worker`.
"""
_seed(tmp_path / "test.db")
canned_parse = json.dumps(
{"segments": [{"kind": "dialogue", "text": "hello"}]}
)
_override_llm(
[canned_parse, "Hi there.", _zero_state(), _zero_state()]
)
captured: list = []
worker = app.state.embedding_worker
original_enqueue = worker.enqueue
worker.enqueue = captured.append # type: ignore[assignment]
try:
response = app_state_setup.post(
"/chats/chat_bot_a/turns", data={"prose": "hello"}
)
assert response.status_code == 204
finally:
worker.enqueue = original_enqueue # type: ignore[assignment]
app.dependency_overrides.clear()
# Single-bot turn -> one ``memory_written`` -> one EmbeddingJob.
# The job's ``memory_id`` should match the freshly-projected memory
# row, and its ``text`` should carry the assistant's narrative text.
assert len(captured) == 1
job = captured[0]
assert job.text == "Hi there."
with open_db(tmp_path / "test.db") as conn:
memory_ids = [
r[0]
for r in conn.execute(
"SELECT id FROM memories WHERE owner_id = ?",
("bot_a",),
).fetchall()
]
assert job.memory_id in memory_ids
# ---------------------------------------------------------------------------
# T101 — Phase 4 cross-feature integration suite.
# ---------------------------------------------------------------------------
#
# Helpers + the five required scenarios. Each test drives multiple Phase 4
# features so a regression in any one of them fails an integration check.
def _seed_minimal_chat(db_path: Path, chat_id: str = "chat_bot_a") -> None:
"""Seed bot_a, you, a chat, edges, and activities — same shape as
``tests/test_phase3_integration.py::_seed_single_bot_chat`` but
parameterised on chat_id so the cross-chat search test can stamp
several chats in the same database without renaming bots.
Uses ``append_and_apply`` rather than ``append_event`` + a final
``project`` so successive calls (e.g. one per chat in the
cross-chat-search test) don't try to re-project the cumulative
log and trip the ``chats.id`` UNIQUE constraint on the prior
chat's row.
"""
with open_db(db_path) as conn:
existing_bot = conn.execute(
"SELECT 1 FROM bots WHERE id = 'bot_a'"
).fetchone()
if existing_bot is None:
append_and_apply(
conn,
kind="bot_authored",
payload={
"id": "bot_a",
"name": "BotA",
"persona": "thoughtful",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "...",
},
)
append_and_apply(
conn,
kind="you_authored",
payload={
"name": "Me",
"pronouns": "they/them",
"persona": "",
},
)
append_and_apply(
conn,
kind="chat_created",
payload={
"id": chat_id,
"host_bot_id": "bot_a",
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
append_and_apply(
conn,
kind="edge_update",
payload={
"source_id": "bot_a",
"target_id": "you",
"chat_id": chat_id,
"knowledge_facts": [],
},
)
# Activities are unique per (entity_id) — only seed them on the
# first call (when the bot row is also fresh).
if existing_bot is None:
for entity_id, verb in [
("you", "talking"),
("bot_a", "listening"),
]:
append_and_apply(
conn,
kind="activity_change",
payload={
"entity_id": entity_id,
"posture": "sitting",
"action": {
"verb": verb,
"interruptible": True,
"required_attention": "low",
"expected_duration": "ongoing",
},
"attention": "",
"holding": [],
"status": {},
},
)
# ---------------------------------------------------------------------------
# 1. Vector retrieval feedback loop.
# ---------------------------------------------------------------------------
async def test_vector_retrieval_feedback_loop(tmp_path):
"""End-to-end: write a memory through
:func:`record_turn_memory_for_present` so an :class:`EmbeddingJob`
lands on a worker, drain the worker, then call
:func:`vector_search` with the SAME pseudo-embedding function and
assert the just-written memory is the top hit.
Why this test does NOT use the TestClient fixture: the live
``app.state.embedding_worker`` is created inside the FastAPI
lifespan's event loop. ``await``-ing on it from pytest-asyncio's
loop trips ``"got Future attached to a different loop"``. We
instead spin up a fresh :class:`EmbeddingWorker` in the test
loop, exactly mirroring ``tests/test_embedding_worker.py``'s
pattern. The T97.5 test above pins the wiring between the live
HTTP route and the live app worker; this test pins the
write -> index -> retrieve loop with no transport in scope.
Cross-feature gaps this test catches:
* Memory write enqueues to the worker but the worker never
drains (e.g. ``_run`` deadlock or sentinel mishandled).
* Worker uses a different embedding function than
``vector_search`` at query time, producing different vectors
and breaking cosine retrieval.
* ``embeddings`` projector handler is not registered (e.g.
import ordering bug) so the event fires but the table stays
empty.
"""
from types import SimpleNamespace
from chat.db.migrate import apply_migrations
from chat.services.embedding_worker import EmbeddingWorker
from chat.services.embeddings import generate_embedding
from chat.services.memory_write import record_turn_memory_for_present
from chat.services.vector_search import vector_search
# Trigger projector handler registration. ``record_turn_memory_for_present``
# imports memory_write which imports the worker module, but the
# projector handlers live in ``chat.state.*`` modules and are
# registered as a side effect of import.
import chat.state.embeddings # noqa: F401
import chat.state.entities # noqa: F401
import chat.state.memory # noqa: F401
import chat.state.world # noqa: F401
db = tmp_path / "test.db"
apply_migrations(db)
_seed_minimal_chat(db)
# Spin up our own worker in the test event loop. ``client=None``
# is fine for the pseudo-embedding path — the local hash function
# does not require an LLM client.
worker = EmbeddingWorker(
conn_factory=lambda: open_db(db),
client=None,
)
await worker.start()
# Stub ``app`` — only ``app.state.embedding_worker`` is read by
# ``_write_one_memory``. SimpleNamespace gives us a stand-in that
# exposes ``state.embedding_worker`` without the full FastAPI app.
fake_app = SimpleNamespace(state=SimpleNamespace(embedding_worker=worker))
distinctive_text = "Maya watched the gondola lights drift across the lagoon."
with open_db(db) as conn:
record_turn_memory_for_present(
conn,
chat_id="chat_bot_a",
host_bot_id="bot_a",
guest_bot_id=None,
narrative_text=distinctive_text,
app=fake_app,
)
# Drain the worker via the sentinel. After this returns the
# ``embedding_indexed`` event has been projected.
await worker.stop()
# Generate a query embedding using the same function the worker
# used. The pseudo-embedding is deterministic so a query equal to
# the indexed text produces the identical vector and a cosine
# similarity of 1.0.
query_result = await generate_embedding(client=None, text=distinctive_text)
with open_db(db) as conn:
emb_count = conn.execute(
"SELECT COUNT(*) FROM embeddings"
).fetchone()[0]
assert emb_count == 1, (
"embedding worker did not project an embedding_indexed event"
)
hits = vector_search(
conn,
owner_id="bot_a",
witness_role="host", # bot_a is host, witness_host=1 by default
query_vector=query_result.vector,
k=4,
)
assert len(hits) == 1
top = hits[0]
assert top["pov_summary"] == distinctive_text
# Self-match: cosine of identical vectors is 1.0.
assert top["score"] == pytest.approx(1.0, abs=1e-9)
# ---------------------------------------------------------------------------
# 2. Branch + diverge: main's post-branch tail stays intact (Phase 4
# branches are metadata-only).
# ---------------------------------------------------------------------------
def test_branch_diverge_main_intact(app_state_setup, tmp_path):
"""Append turns 1-12 on main, branch from turn 10's event_id, switch
to the new branch, append 3 more "play" turns, switch back to main,
assert the original turn 11+ events are untouched.
Phase 4's branches table is metadata-only — the read-side filter
isn't wired yet, so all events live in one log regardless of which
branch is "active". This test pins that contract: switching does
not mutate or hide existing events on either branch.
Canned LLM queue: none. ``user_turn`` / ``assistant_turn`` are
transcript-only kinds with no projector handler that needs an
LLM call, and ``branch_created`` / ``branch_switched`` are pure
state events. We use ``append_and_apply`` directly rather than
driving the HTTP turn route, which would require a 6-slot canned
queue per turn (parse + narrative + 2 state-updates + scene-close
+ memory) for 15 turns total = 90 slots of plumbing irrelevant to
the branch contract.
"""
from chat.services.branching import branch_from_event, switch_active_branch
from chat.state.branches import active_branch
db = tmp_path / "test.db"
_seed_minimal_chat(db)
# Append 12 user_turn / assistant_turn pairs on main. We collect
# the assistant_turn id at index 10 (1-based: "turn 10") so the
# branch fork point is unambiguous.
main_turn_ids: list[int] = []
with open_db(db) as conn:
for i in range(1, 13):
user_id = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": f"main turn {i}",
"segments": [],
},
)
asst_id = append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": f"main reply {i}",
"truncated": False,
"user_turn_id": user_id,
},
)
main_turn_ids.append(asst_id)
turn_10_id = main_turn_ids[9]
# Snapshot the post-turn-10 main tail (turns 11, 12 + their
# user_turn predecessors) so we can byte-compare after the
# round-trip.
main_tail_before = conn.execute(
"SELECT id, kind, payload_json, hidden, superseded_by "
"FROM event_log WHERE id > ? ORDER BY id",
(turn_10_id,),
).fetchall()
assert len(main_tail_before) == 4 # 2 user + 2 assistant past turn 10
# Branch from turn 10. Phase 4's helper validates the origin
# event id exists and emits ``branch_created``.
branch_from_event(
conn,
name="experiment",
origin_event_id=turn_10_id,
chat_id="chat_bot_a",
)
switch_active_branch(conn, name="experiment")
active = active_branch(conn)
assert active is not None and active["name"] == "experiment"
# Play 3 turns on the experiment branch.
for i in range(1, 4):
user_id = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": f"experiment turn {i}",
"segments": [],
},
)
append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": f"experiment reply {i}",
"truncated": False,
"user_turn_id": user_id,
},
)
# Switch back to main.
switch_active_branch(conn, name="main")
active2 = active_branch(conn)
assert active2 is not None and active2["name"] == "main"
# Main's original tail past turn 10 is byte-identical: the
# branching events (branch_created, branch_switched x2) and the
# 3 experiment turns sit AFTER the original tail in event_log
# order, never overwriting it.
main_tail_after = conn.execute(
"SELECT id, kind, payload_json, hidden, superseded_by "
"FROM event_log "
"WHERE id > ? AND id <= ? ORDER BY id",
(turn_10_id, main_turn_ids[-1]),
).fetchall()
assert main_tail_after == main_tail_before
# The 6 experiment events (3 user + 3 assistant) all live in
# the same log past the original main tail. Verify their
# prose payloads to disambiguate from main's content.
diverged = conn.execute(
"SELECT kind, json_extract(payload_json, '$.prose'), "
" json_extract(payload_json, '$.text') "
"FROM event_log WHERE id > ? "
" AND kind IN ('user_turn', 'assistant_turn') ORDER BY id",
(main_turn_ids[-1],),
).fetchall()
assert len(diverged) == 6
prose_or_text = [(row[1] or row[2]) for row in diverged]
# Sequence: user1, asst1, user2, asst2, user3, asst3.
assert "experiment turn 1" in prose_or_text
assert "experiment reply 1" in prose_or_text
assert "experiment turn 3" in prose_or_text
assert "experiment reply 3" in prose_or_text
# ---------------------------------------------------------------------------
# 3. Surgical delete: impact preview -> confirm -> log truncated +
# pre-rewind snapshot saved.
# ---------------------------------------------------------------------------
def test_surgical_delete_truncates_log_and_writes_snapshot(
app_state_setup, tmp_path
):
"""Compute the delete-impact for a turn (read-only preview), then
confirm via the POST drawer route. Assert:
* The preview returns 200 + cascade markup.
* The event_log is physically truncated past ``target_id - 1``.
* A snapshot file lands under ``<data_dir>/snapshots/rewind/``.
* The pre-rewind snapshot's ``last_event_id`` matches the high
water mark BEFORE the truncate (so recovery can replay back to
pre-delete state).
Snapshot location: T97.5's ``data_dir`` derives from the db's
parent directory when ``CHAT_DATA_DIR`` is unset. The fixture
sets ``CHAT_DB_PATH = tmp_path / "test.db"`` so the snapshot
parent is ``tmp_path / "snapshots" / "rewind"``.
No canned LLM queue the preview is pure SQL and the rewind path
is also pure SQL (delete + reproject). The drawer routes don't
invoke the LLM.
"""
import json as _json
db = tmp_path / "test.db"
_seed_minimal_chat(db)
# Append a small fixed turn sequence we can predict the cascade for.
with open_db(db) as conn:
first_user = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "first message",
"segments": [],
},
)
append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "first reply",
"truncated": False,
"user_turn_id": first_user,
},
)
target_user = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "this turn will be deleted",
"segments": [],
},
)
target_asst = append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "and so will this reply",
"truncated": False,
"user_turn_id": target_user,
},
)
# One trailing event past the target so we can verify the
# cascade catches >1 event.
trailing = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "trailing context",
"segments": [],
},
)
max_id_before = conn.execute(
"SELECT MAX(id) FROM event_log"
).fetchone()[0]
# ---- Preview: GET delete-preview returns 200 + the cascade list. ----
preview = app_state_setup.get(
f"/chats/chat_bot_a/drawer/turn/delete-preview/{target_user}"
)
assert preview.status_code == 200
body = preview.text
assert "delete-impact-modal" in body
assert f"Delete event {target_user}?" in body
assert "user_turn" in body
assert "assistant_turn" in body
# Confirm form points at the delete route.
assert f"/drawer/turn/delete/{target_user}" in body
# ---- Confirm: POST delete drops user, assistant, AND trailing. ----
confirm = app_state_setup.post(
f"/chats/chat_bot_a/drawer/turn/delete/{target_user}"
)
assert confirm.status_code == 200
# ---- Event log truncated past target_user - 1. ----
with open_db(db) as conn:
max_id_after = conn.execute(
"SELECT MAX(id) FROM event_log"
).fetchone()[0]
# delete_turn passes ``after_event_id = target_user - 1`` so
# everything from target_user forward is gone.
assert max_id_after == target_user - 1
for ev_id in (target_user, target_asst, trailing):
row = conn.execute(
"SELECT 1 FROM event_log WHERE id = ?", (ev_id,)
).fetchone()
assert row is None, f"event {ev_id} should have been deleted"
# ---- Pre-rewind snapshot landed on disk. ----
snapshot_dir = tmp_path / "snapshots" / "rewind"
assert snapshot_dir.exists(), (
f"snapshot dir not created: {snapshot_dir}"
)
snapshots = sorted(snapshot_dir.glob("*.json"))
assert len(snapshots) >= 1, (
f"no rewind snapshot written under {snapshot_dir}"
)
# Most-recent snapshot's last_event_id == pre-truncate high water
# mark, so a "restore" path could fully reverse the delete.
latest_snapshot = snapshots[-1]
snap_data = _json.loads(latest_snapshot.read_text())
assert snap_data["last_event_id"] == max_id_before
# ---------------------------------------------------------------------------
# 4. Hide + retrieval: drawer hide drops a turn from read_recent_dialogue,
# unhide restores it.
# ---------------------------------------------------------------------------
def test_hide_then_unhide_round_trip_through_read_recent_dialogue(
app_state_setup, tmp_path
):
"""Drive a hide -> read -> unhide -> read cycle through the drawer
HTTP route and assert ``read_recent_dialogue`` flips visibility
each step. T98.3 wires the route; T55 / turn_common owns the
``hidden = 0`` filter.
Cross-feature: the drawer HTTP handler emits a ``manual_edit``
event with branch ``turn_hidden``, the manual_edit projector
flips ``event_log.hidden``, and the prompt-window reader filters
on that column. Three layers any one breaking would fail this
test.
No canned LLM queue hide/unhide are pure SQL routes.
"""
from chat.services.turn_common import read_recent_dialogue
db = tmp_path / "test.db"
_seed_minimal_chat(db)
with open_db(db) as conn:
user_a = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "first user line",
"segments": [],
},
)
asst_a = append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "first reply",
"truncated": False,
"user_turn_id": user_a,
},
)
user_b = append_and_apply(
conn,
kind="user_turn",
payload={
"chat_id": "chat_bot_a",
"prose": "second user line",
"segments": [],
},
)
asst_b = append_and_apply(
conn,
kind="assistant_turn",
payload={
"chat_id": "chat_bot_a",
"speaker_id": "bot_a",
"text": "second reply",
"truncated": False,
"user_turn_id": user_b,
},
)
# Baseline: all 4 turns visible.
baseline = read_recent_dialogue(conn, "chat_bot_a", limit=10)
baseline_ids = {t["event_id"] for t in baseline}
assert {user_a, asst_a, user_b, asst_b} <= baseline_ids
# ---- Hide user_b via the drawer route. ----
hide_resp = app_state_setup.post(
f"/chats/chat_bot_a/drawer/turn/hide/{user_b}",
data={"hidden": "1"},
)
assert hide_resp.status_code == 200
with open_db(db) as conn:
# event_log.hidden flipped.
row = conn.execute(
"SELECT hidden FROM event_log WHERE id = ?", (user_b,)
).fetchone()
assert int(row[0]) == 1
# read_recent_dialogue drops user_b but keeps the others.
after_hide = read_recent_dialogue(conn, "chat_bot_a", limit=10)
after_hide_ids = {t["event_id"] for t in after_hide}
assert user_b not in after_hide_ids
# The other 3 turns still surface.
assert {user_a, asst_a, asst_b} <= after_hide_ids
# ---- Unhide via the SAME route with hidden=0. ----
unhide_resp = app_state_setup.post(
f"/chats/chat_bot_a/drawer/turn/hide/{user_b}",
data={"hidden": "0"},
)
assert unhide_resp.status_code == 200
with open_db(db) as conn:
row = conn.execute(
"SELECT hidden FROM event_log WHERE id = ?", (user_b,)
).fetchone()
assert int(row[0]) == 0
# read_recent_dialogue restores user_b.
after_unhide = read_recent_dialogue(conn, "chat_bot_a", limit=10)
after_unhide_ids = {t["event_id"] for t in after_unhide}
assert {user_a, asst_a, user_b, asst_b} <= after_unhide_ids
# Two manual_edit events landed (one per toggle), each with the
# turn_hidden branch tag.
edits = conn.execute(
"SELECT payload_json FROM event_log "
"WHERE kind = 'manual_edit' "
" AND json_extract(payload_json, '$.target_kind') = 'turn_hidden' "
"ORDER BY id"
).fetchall()
assert len(edits) == 2
# ---------------------------------------------------------------------------
# 5. Cross-chat search: memories across 3 chats all surface from /search.
# ---------------------------------------------------------------------------
def test_cross_chat_search_surfaces_memories_in_three_chats(
app_state_setup, tmp_path
):
"""Seed 3 chats each owned by bot_a (so the bot row exists for the
search route's display-name hydration), write a distinctive
memory in each, then GET ``/search?q=<distinctive>`` and assert
every chat appears as a result row.
Cross-feature: T93's :func:`search_all_memories` (no per-owner
filter) + T100's HTML route (display-name hydration via
``get_bot``/``get_chat``). The route's empty-query short-circuit
is incidentally exercised by the request setup but isn't the
focus.
No canned LLM queue memory_written events are projected directly
via ``append_and_apply`` and the search route is pure SQL +
template rendering.
"""
db = tmp_path / "test.db"
# Three chats, all hosted by bot_a so bot_a is the owner of all
# three memories. _seed_minimal_chat skips the bot/you bootstrap
# after the first call so the cumulative seed is consistent.
chat_ids = ["chat_bot_a", "chat_bot_a_2", "chat_bot_a_3"]
for chat_id in chat_ids:
_seed_minimal_chat(db, chat_id=chat_id)
# Distinctive token — "wisteria" appears nowhere else in the seed.
distinctive = "wisteria"
with open_db(db) as conn:
for idx, chat_id in enumerate(chat_ids):
append_and_apply(
conn,
kind="memory_written",
payload={
"owner_id": "bot_a",
"chat_id": chat_id,
"pov_summary": (
f"the {distinctive} bloomed by the gate (chat {idx})"
),
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"source": "direct",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
},
)
# ---- GET /search?q=wisteria -> all 3 chats appear as result rows. ----
response = app_state_setup.get(f"/search?q={distinctive}")
assert response.status_code == 200
body = response.text
# Each chat_id appears in a result link href, e.g.
# ``href="/chats/chat_bot_a"``. The template renders one
# ``<a class="search-result-link" href="/chats/{chat_id}">`` per
# row, so a substring match per chat is sufficient.
for chat_id in chat_ids:
assert f'href="/chats/{chat_id}"' in body, (
f"chat {chat_id} missing from /search results: {body!r}"
)
# The owner display name (BotA) renders for each row — verify >= 3
# occurrences so we know all 3 result rows hydrated, not just 1.
assert body.count("BotA") >= 3
# ---- Sanity: distractor query yields no results. ----
distractor_response = app_state_setup.get(
"/search?q=nonexistentterm12345"
)
assert distractor_response.status_code == 200
distractor_body = distractor_response.text
# The "no matches" empty-state copy fires.
assert "No matches" in distractor_body
for chat_id in chat_ids:
assert f'href="/chats/{chat_id}"' not in distractor_body
+7
View File
@@ -757,6 +757,13 @@ def test_regenerate_with_prior_lifecycle_logs_warning(tmp_path, monkeypatch, cap
# row's id.
assert str(at_id) in msg
assert str(completed_id) in msg
# T90.2: wording was tightened from "from superseded turn" to
# "at-or-after turn <id>" — when regenerating an OLDER turn, the
# listed transitions may include legitimate intervening-turn ones
# that stand on their own. The new phrasing avoids implying the
# warning's target turn directly authored every listed transition.
assert "at-or-after turn" in msg
assert "from superseded turn" not in msg
def test_regenerate_sibling_lookup_scoped_to_chat(tmp_path, monkeypatch):
+135
View File
@@ -0,0 +1,135 @@
"""T100 (Phase 4): cross-chat search UX (top-bar + results page).
Verifies the FastAPI ``/search`` route that wraps T93's
``search_all_memories`` service:
* ``/search?q=...`` returns 200 + an HTML page that lists matches drawn
from MULTIPLE chats (not just the current one) and links each result
back to ``/chats/{chat_id}``.
* ``/search`` with no query renders the page in its empty state with a
"enter a query" placeholder and no result rows (avoids hitting the
FTS index with an invalid empty MATCH).
* Result links navigate to the originating chat so users can pick up
the thread where the memory came from.
"""
from __future__ import annotations
from pathlib import Path
import pytest
from fastapi.testclient import TestClient
from chat.app import app
from chat.db.connection import open_db
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
import chat.state.memory # noqa: F401 (registers memory_written handler)
@pytest.fixture
def client(tmp_path, monkeypatch):
config_path = tmp_path / "config.toml"
config_path.write_text('featherless_api_key = "test"\n')
monkeypatch.setenv("CHAT_CONFIG_PATH", str(config_path))
monkeypatch.setenv("CHAT_DB_PATH", str(tmp_path / "test.db"))
with TestClient(app) as c:
yield c
def _seed_two_chats_with_memories(db_path: Path) -> None:
"""Seed: a ``you_entity``, two bots, two chats, and one ``rabbit``
memory per chat. Two-chat seeding lets the cross-chat assertion
actually distinguish "both chats appear" from "only the current
one does"."""
with open_db(db_path) as conn:
append_event(
conn,
kind="you_authored",
payload={"name": "Me", "pronouns": "", "persona": ""},
)
for bot_id, chat_id in (("bot_a", "chat_a"), ("bot_b", "chat_b")):
append_event(
conn,
kind="bot_authored",
payload={
"id": bot_id,
"name": bot_id.upper(),
"persona": "thoughtful",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "friend",
"kickoff_prose": "kickoff",
},
)
append_event(
conn,
kind="chat_created",
payload={
"id": chat_id,
"host_bot_id": bot_id,
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
append_event(
conn,
kind="memory_written",
payload={
"owner_id": bot_id,
"chat_id": chat_id,
"pov_summary": f"the rabbit darted across {chat_id}",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"source": "direct",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
},
)
project(conn)
def test_search_returns_results_from_multiple_chats(client, tmp_path):
"""A single ``/search?q=rabbit`` must surface matches from BOTH
chats the whole point of the cross-chat search box is that it
isn't owner-scoped."""
_seed_two_chats_with_memories(tmp_path / "test.db")
resp = client.get("/search?q=rabbit")
assert resp.status_code == 200
body = resp.text
# Both chats' memory snippets must appear in the rendered page.
assert "chat_a" in body
assert "chat_b" in body
assert "rabbit" in body.lower()
def test_empty_query_renders_placeholder_not_results(client, tmp_path):
"""``/search`` with no query renders the page in its empty state.
The placeholder copy is a contract with the user they should see
"enter a query" rather than an empty result list that looks like a
no-match. Also: the FTS short-circuit means there are no result
rows to leak into the body."""
_seed_two_chats_with_memories(tmp_path / "test.db")
resp = client.get("/search")
assert resp.status_code == 200
body = resp.text.lower()
assert "enter a query" in body
# Seeded "rabbit" memories must NOT appear: empty query => no results.
assert "the rabbit darted" not in resp.text
def test_result_links_navigate_to_chat(client, tmp_path):
"""Each result links back to its originating chat so the user can
reopen the thread where the memory was first witnessed."""
_seed_two_chats_with_memories(tmp_path / "test.db")
resp = client.get("/search?q=rabbit")
assert resp.status_code == 200
# The link target is chat-level (memories don't carry an event_id
# column today, so we don't deep-link to a specific turn).
assert 'href="/chats/chat_a"' in resp.text
+182
View File
@@ -0,0 +1,182 @@
"""Tests for Task 99 — snapshot UX (manual trigger + list + restore + preview).
Phase 4 surfaces the existing snapshot infrastructure (Phase 1 T20 / T31)
through HTML routes so the user can:
* see what snapshots exist,
* take one on demand,
* restore one with a hard confirm,
* peek at metadata before restoring.
The underlying service API lives in ``chat/services/snapshot.py`` and is
already exercised by ``test_snapshot.py``; here we only verify the web
surface wires the existing functions correctly.
"""
from __future__ import annotations
import json
from pathlib import Path
import pytest
from fastapi.testclient import TestClient
from chat.app import app
from chat.db.connection import open_db
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
def _bot_payload(bot_id: str, name: str) -> dict:
return {
"id": bot_id,
"name": name,
"persona": "fancy",
"voice_samples": ["sample"],
"traits": ["shy"],
"backstory": "",
"initial_relationship_to_you": "coworker",
"kickoff_prose": "",
}
@pytest.fixture
def client(tmp_path, monkeypatch):
"""A TestClient whose db + data_dir live under ``tmp_path``.
``load_settings`` derives ``data_dir`` from ``CHAT_DB_PATH``'s parent
when ``CHAT_DATA_DIR`` is unset (see ``chat/config.py``), so this also
isolates the ``data/snapshots/`` tree to ``tmp_path``.
"""
config_path = tmp_path / "config.toml"
config_path.write_text('featherless_api_key = "test"\n')
monkeypatch.setenv("CHAT_CONFIG_PATH", str(config_path))
monkeypatch.setenv("CHAT_DB_PATH", str(tmp_path / "test.db"))
with TestClient(app) as c:
c.tmp_path = tmp_path # type: ignore[attr-defined]
yield c
def _seed_bot(db_path: Path, bot_id: str = "bot_a", name: str = "BotA") -> None:
with open_db(db_path) as conn:
append_event(conn, kind="bot_authored", payload=_bot_payload(bot_id, name))
project(conn)
def _take_snapshot_via_service(
db_path: Path, data_dir: Path, kind: str = "periodic"
) -> Path:
from chat.services.snapshot import take_snapshot
with open_db(db_path) as conn:
return take_snapshot(conn, data_dir=data_dir, kind=kind)
def test_list_snapshots_renders_page(client, tmp_path):
_seed_bot(tmp_path / "test.db", "bot_a", "BotA")
# Take two snapshots through the service so the listing has rows.
p1 = _take_snapshot_via_service(tmp_path / "test.db", tmp_path, kind="periodic")
p2 = _take_snapshot_via_service(tmp_path / "test.db", tmp_path, kind="rewind")
response = client.get("/snapshots")
assert response.status_code == 200
body = response.text
# Both filenames should appear in the listing.
assert p1.stem in body
assert p2.stem in body
# Both kinds should be visible.
assert "periodic" in body
assert "rewind" in body
def test_take_snapshot_creates_new(client, tmp_path):
_seed_bot(tmp_path / "test.db", "bot_a", "BotA")
snapshot_dir = tmp_path / "snapshots" / "periodic"
before = (
len(list(snapshot_dir.glob("*.json"))) if snapshot_dir.exists() else 0
)
response = client.post("/snapshots/take", follow_redirects=False)
assert response.status_code == 303
assert response.headers["location"] == "/snapshots"
after = len(list(snapshot_dir.glob("*.json")))
assert after == before + 1
def test_restore_snapshot_with_correct_confirm(client, tmp_path):
db_path = tmp_path / "test.db"
_seed_bot(db_path, "bot_a", "BotA")
snapshot_path = _take_snapshot_via_service(
db_path, tmp_path, kind="periodic"
)
snapshot_id = snapshot_path.stem # filename without extension
# Mutate the DB after the snapshot was taken — restoring should erase
# the new bot.
with open_db(db_path) as conn:
append_event(
conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB")
)
project(conn)
bots_before = conn.execute(
"SELECT id FROM bots ORDER BY id"
).fetchall()
assert {r[0] for r in bots_before} == {"bot_a", "bot_b"}
response = client.post(
f"/snapshots/restore/{snapshot_id}",
data={"confirm_id": snapshot_id, "kind": "periodic"},
follow_redirects=False,
)
assert response.status_code == 303
with open_db(db_path) as conn:
bots_after = conn.execute(
"SELECT id FROM bots ORDER BY id"
).fetchall()
# The post-snapshot bot should be gone.
assert {r[0] for r in bots_after} == {"bot_a"}
def test_restore_snapshot_wrong_confirm_400(client, tmp_path):
db_path = tmp_path / "test.db"
_seed_bot(db_path, "bot_a", "BotA")
snapshot_path = _take_snapshot_via_service(
db_path, tmp_path, kind="periodic"
)
snapshot_id = snapshot_path.stem
response = client.post(
f"/snapshots/restore/{snapshot_id}",
data={"confirm_id": "not_the_right_id", "kind": "periodic"},
follow_redirects=False,
)
assert response.status_code == 400
def test_preview_renders_metadata(client, tmp_path):
db_path = tmp_path / "test.db"
_seed_bot(db_path, "bot_a", "BotA")
snapshot_path = _take_snapshot_via_service(
db_path, tmp_path, kind="periodic"
)
snapshot_id = snapshot_path.stem
# Append more events post-snapshot so the delta is non-zero.
with open_db(db_path) as conn:
append_event(
conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB")
)
project(conn)
response = client.get(
f"/snapshots/{snapshot_id}/preview", params={"kind": "periodic"}
)
assert response.status_code == 200
body = response.text
assert snapshot_id in body
# Snapshot's last_event_id and current event_log size should appear.
dump = json.loads(snapshot_path.read_text())
assert str(dump["last_event_id"]) in body
+76
View File
@@ -186,6 +186,82 @@ def test_read_recent_dialogue_filters_superseded_and_other_chats(tmp_path):
assert ut_id is not None
def test_read_recent_dialogue_limit_respects_chat_scope(tmp_path):
"""T90.1: ``read_recent_dialogue`` must push the chat_id filter into
SQL so that ``LIMIT N`` returns N rows scoped to the requested chat
not N globally-recent rows that may then be filtered down to fewer in
Python.
Setup: two chats with 60 turns each, interleaved. With the old
post-fetch filter, ``LIMIT 50`` would pull 50 globally-recent rows
(most or all from chat_b the most recent inserts) and then drop
chat_b ones via the Python check, yielding far fewer than 50 chat_a
rows. After the SQL pushdown, ``LIMIT 50`` should return exactly 50
chat_a rows.
"""
db = tmp_path / "test.db"
apply_migrations(db)
with open_db(db) as conn:
for chat_id, host_bot in (("chat_a", "bot_a"), ("chat_b", "bot_b")):
append_event(
conn,
kind="bot_authored",
payload={
"id": host_bot,
"name": host_bot,
"persona": "...",
"voice_samples": [],
"traits": [],
"backstory": "",
"initial_relationship_to_you": "",
"kickoff_prose": "",
},
)
append_event(
conn,
kind="chat_created",
payload={
"id": chat_id,
"host_bot_id": host_bot,
"initial_time": "2026-04-26T20:00:00+00:00",
"narrative_anchor": "Day 1",
"weather": "",
},
)
# Interleave 60 user_turn rows in each chat — chat_b's go in last
# so they dominate the global tail.
for i in range(60):
append_event(
conn,
kind="user_turn",
payload={
"chat_id": "chat_a",
"prose": f"a-{i}",
"segments": [],
},
)
for i in range(60):
append_event(
conn,
kind="user_turn",
payload={
"chat_id": "chat_b",
"prose": f"b-{i}",
"segments": [],
},
)
project(conn)
out = read_recent_dialogue(conn, "chat_a", limit=50)
# All returned rows should belong to chat_a (texts a-* only).
assert len(out) == 50
for entry in out:
assert entry["text"].startswith("a-"), (
f"foreign chat row leaked: {entry!r}"
)
def test_gather_prior_edges_fills_missing_with_default(tmp_path):
"""``gather_prior_edges`` returns one entry per directed pair across
``present_ids``. Missing rows fall back to the schema default
+242
View File
@@ -0,0 +1,242 @@
from __future__ import annotations
import pytest
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
import chat.state.memory # registers memory_written handler
import chat.state.embeddings # registers embedding handlers
from chat.services.vector_search import vector_search
def _base_memory(**overrides):
payload = {
"owner_id": "bot_a",
"chat_id": "chat_bot_a",
"scene_id": 1,
"pov_summary": "She laughed at his joke about owls.",
"witness_you": 1,
"witness_host": 1,
"witness_guest": 0,
"chat_clock_at": "2026-04-26T10:00:00",
"source": "direct",
"reliability": 1.0,
"significance": 1,
"pinned": 0,
"auto_pinned": 0,
}
payload.update(overrides)
return payload
def _one_hot(dim: int, idx: int) -> list[float]:
"""Return a one-hot vector of length ``dim`` with 1.0 at ``idx``."""
v = [0.0] * dim
v[idx] = 1.0
return v
def _seed_memory_with_embedding(
conn,
*,
owner_id: str,
pov_summary: str,
vector: list[float],
significance: int = 1,
witness_you: int = 1,
witness_host: int = 1,
witness_guest: int = 0,
model: str = "test-model",
) -> int:
append_event(
conn,
kind="memory_written",
payload=_base_memory(
owner_id=owner_id,
pov_summary=pov_summary,
significance=significance,
witness_you=witness_you,
witness_host=witness_host,
witness_guest=witness_guest,
),
)
project(conn)
memory_id = conn.execute(
"SELECT id FROM memories WHERE pov_summary = ?", (pov_summary,)
).fetchone()[0]
append_event(
conn,
kind="embedding_indexed",
payload={
"memory_id": memory_id,
"vector": vector,
"model": model,
"dim": len(vector),
},
)
project(conn)
return memory_id
def test_vector_search_returns_nearest_neighbors(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
dim = 8
ids = []
for i in range(5):
mid = _seed_memory_with_embedding(
conn,
owner_id="bot_a",
pov_summary=f"Memory {i}.",
vector=_one_hot(dim, i),
)
ids.append(mid)
# Query close to memory index 3 (one-hot at position 3, plus tiny noise).
query = _one_hot(dim, 3)
query[2] = 0.01
results = vector_search(
conn,
owner_id="bot_a",
witness_role="you",
query_vector=query,
k=3,
)
assert len(results) == 3
# Top-1 must be memory at index 3.
assert results[0]["memory_id"] == ids[3]
assert results[0]["pov_summary"] == "Memory 3."
# Score for the near-perfect match should be very close to 1.0.
assert results[0]["score"] > 0.99
# Results sorted by score DESC.
scores = [r["score"] for r in results]
assert scores == sorted(scores, reverse=True)
# Second place should be memory index 2 (the small noise component).
assert results[1]["memory_id"] == ids[2]
def test_vector_search_respects_witness_filter(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
dim = 4
# Memory visible to you=1, host=1, guest=0.
_seed_memory_with_embedding(
conn,
owner_id="bot_a",
pov_summary="Restricted.",
vector=_one_hot(dim, 0),
witness_you=1,
witness_host=1,
witness_guest=0,
)
# Guest sees nothing.
guest_results = vector_search(
conn,
owner_id="bot_a",
witness_role="guest",
query_vector=_one_hot(dim, 0),
k=4,
)
assert guest_results == []
# Host sees the memory.
host_results = vector_search(
conn,
owner_id="bot_a",
witness_role="host",
query_vector=_one_hot(dim, 0),
k=4,
)
assert len(host_results) == 1
assert host_results[0]["pov_summary"] == "Restricted."
# You also see it.
you_results = vector_search(
conn,
owner_id="bot_a",
witness_role="you",
query_vector=_one_hot(dim, 0),
k=4,
)
assert len(you_results) == 1
def test_vector_search_respects_owner_filter(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
dim = 4
_seed_memory_with_embedding(
conn,
owner_id="bot_a",
pov_summary="Owner A memory.",
vector=_one_hot(dim, 0),
)
_seed_memory_with_embedding(
conn,
owner_id="bot_b",
pov_summary="Owner B memory.",
vector=_one_hot(dim, 0),
)
a_results = vector_search(
conn,
owner_id="bot_a",
witness_role="you",
query_vector=_one_hot(dim, 0),
k=10,
)
assert len(a_results) == 1
assert a_results[0]["pov_summary"] == "Owner A memory."
b_results = vector_search(
conn,
owner_id="bot_b",
witness_role="you",
query_vector=_one_hot(dim, 0),
k=10,
)
assert len(b_results) == 1
assert b_results[0]["pov_summary"] == "Owner B memory."
def test_vector_search_invalid_witness_role_raises(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
with pytest.raises(ValueError, match="witness_role"):
vector_search(
conn,
owner_id="bot_a",
witness_role="invalid",
query_vector=[1.0, 0.0, 0.0],
k=4,
)
def test_vector_search_empty_when_no_embeddings_indexed(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
# Seed a memory but don't index an embedding for it.
append_event(
conn,
kind="memory_written",
payload=_base_memory(owner_id="bot_a", pov_summary="No embedding here."),
)
project(conn)
results = vector_search(
conn,
owner_id="bot_a",
witness_role="you",
query_vector=[1.0, 0.0, 0.0, 0.0],
k=4,
)
assert results == []
+2 -2
View File
@@ -324,11 +324,11 @@ def test_get_scene_returns_none_for_missing(tmp_path):
assert active_scene(conn, "chat_missing") is None
def test_schema_version_after_migration_is_11(tmp_path):
def test_schema_version_after_migration_is_13(tmp_path):
db = tmp_path / "t.db"
apply_migrations(db)
with open_db(db) as conn:
row = conn.execute(
"SELECT value FROM meta WHERE key = 'schema_version'"
).fetchone()
assert int(row[0]) == 11
assert int(row[0]) == 13