Compare commits
58 Commits
| SHA1 |
|---|
| 2d1900bc8f |
| 50ab0c8229 |
| 49be3cf4b9 |
| a11255a5e6 |
| 3a81e540a1 |
| 0f8bf94d29 |
| 6d57fe88b4 |
| f775eb7e92 |
| e535a0181e |
| f7eec707a9 |
| 3b83786b8b |
| a902d86432 |
| de7f6624f0 |
| d656ee8805 |
| fe9c497038 |
| b3d78c1603 |
| a03f664407 |
| cdfacdd0c4 |
| 5c4356e4e2 |
| 969f8963bc |
| f71613786b |
| 4afaf01de7 |
| 5bc9a94734 |
| cfc05a140c |
| 80ce891bd8 |
| 6d4ad86e33 |
| 7370f68bdf |
| f863cf0158 |
| 456f50d334 |
| 757abf24f8 |
| 9b7a6d459f |
| e0a28abbcd |
| ac6e74ab4c |
| 5f16bb575a |
| f05d1e0f21 |
| 9987da2c07 |
| fa87ab8c55 |
| fae6edef6b |
| 2ab8fcbdf0 |
| 5d5c888acf |
| a45a33534f |
| f3827706df |
| 2afbb9fefc |
| 1f8b4d2078 |
| 3f1a284acb |
| 87f93f00b5 |
| d1e2902655 |
| 54dfa8d611 |
| 5d36d3456f |
| 0e9421dcf7 |
| baffeb3a44 |
| 29b7c90b29 |
| 64c9ca634a |
| 374a76c867 |
| b65e1e1098 |
| 996a16cfb5 |
| a06f90a164 |
| df977fc985 |
@@ -5,6 +5,7 @@ data/
 
 # Python
 .venv/
+.mlx-venv/
 __pycache__/
 *.pyc
 .pytest_cache/
@@ -322,53 +322,48 @@ Phase 4 polish shipped end-to-end across 15 tasks (T88–T102). Vector retrieval
 
 ### Phase 4.5 / 5 backlog
 
-New follow-ups discovered during Phase 4 reviews and execution. None are blocking; pick up at any time.
+All items shipped or deferred to Phase 5 (see "Phase 5 backlog" below). Final schema version: 14.
 
-#### From T88 review
+## Phase 4.5 status
 
-- **`embeddings` FK lacks `ON DELETE CASCADE`**: deindex events are the only deletion path; if memories ever get deleted directly (raw SQL), embedding rows orphan. Defensible since projector model uses explicit deindex events, but worth a comment or `ON DELETE CASCADE` addition.
+Phase 4.5 cleanup shipped 13 of 14 planned tasks (T103–T117 with T115 deferred; T118 is this docs sweep). Two CLAUDE.md backlogs (Phase 3.6/4, Phase 4.5/5) are now empty; deferred follow-ups discovered during execution are tracked in a new "Phase 5 backlog" section below. Schema baseline advanced from version 13 to **14** (migration 0014: `memories.event_id`). Test count grew from ~413 (Phase 4) to ~457 (+~44 new tests across the wave).
 
-#### From T89 review
+- **Wave 1 — trivial polish (parallel)**:
+- **T103** branches polish — global-branch (`chat_id IS NULL`) leak documented in `list_branches`; branch-switch to nonexistent name now logs a warning.
+- **T104** `memory.py` DRY — `MAX(id)` helper extracted; `fts_rank=None` contract documented for vector-only rows.
+- **T105** `snapshots.py` polish — `datetime`/`timezone` imports hoisted to module level; strict `kind` validation in restore/preview (rejects missing); `created_at` from file mtime documented.
+- **T106** `search.py` polish — `k=50` extracted to module constant; N+1 `get_bot`/`get_chat`/`get_scene` lookups batched.
+- **T107** `embeddings.py` — `timeout_s` fallback-path warning when non-default model misconfigured.
+- **Wave 2 — scene-close-on-cancel (single)**:
+- **T108** strengthened the T74.3 regression test + documented rationale in `turns.py`. **Surfaced a deferred bug**: existing pin only passes because `asyncio` isn't imported in the test module (NameError caught instead of CancelledError). When CancelledError fires for real, `post_turn`'s end-of-function re-raise causes `open_db`'s dependency teardown to skip `conn.commit()`, rolling back ALL post-cancel writes. Documented and deferred to Phase 5 triage.
+- **Wave 3 — schema 0014 (single)**:
+- **T109** `memories.event_id` column (foundation for T111 deep-link). FK CASCADE on `embeddings.memory_id` deferred (memories rows are never deleted today; defensive constraint can't fire — saved for broader migration cleanup in Phase 5).
+- **Wave 4 — drawer Phase 4.5 bundle (single)**:
+- **T110** `event_id <= 0` guard in `delete_turn` + `html.escape()` on delete-impact modal + Jinja partial extraction + bulk significance re-rate per chat (one `manual_edit` event per memory).
+- **Wave 5 — search UX (single)**:
+- **T111** FTS snippet highlighting via `snippet()` + deep-link to turn via `memories.event_id`.
+- **Wave 6 — real embedding model swap (single)**:
+- **T112** `LLMClient.embed()` Protocol + Mock impl with `canned_embeddings` + `FeatherlessClient.embed()` (raises `NotImplementedError` — Featherless OAI-compat doesn't expose embeddings, gap documented) + `generate_embedding` routes non-default models through `client.embed()` with fallback + `--re-embed-all` backfill flag.
+- **Wave 7 — branching read-side filter (single)**:
+- **T113** `active_branch_event_ids(conn)` helper + applied to `read_recent_dialogue`, `scene_summarize._read_recent_dialogue`, `search_memories`, and `meanwhile._read_recent_meanwhile_dialogue`. Cross-chat search and projector queries deliberately NOT filtered (cross-chat is by design; projectors must see full log). Bootstrap "main" branch (origin=0, head=0) detected as the no-clamp sentinel.
+- **Wave 8 — regenerate lifecycle rollback (single)**:
+- **T114** `triggered_by_assistant_turn_id` payload back-reference on `event_started`/`event_completed`/`event_cancelled` + new `event_status_reverted` event kind + projector handler in `chat/state/events.py` + regenerate flow emits revert events for affected lifecycle transitions.
+- **Wave 9 — final polish + integration (parallel)**:
+- **T115** sqlite-vec swap — **DEFERRED to Phase 5**. Pre-flight failed: host Python build doesn't expose `sqlite3.Connection.enable_load_extension` (raises `AttributeError`). Requires either Python rebuild with `--enable-loadable-sqlite-extensions` or migration to `apsw`. Phase 4 pure-Python cosine remains in production.
+- **T116** structured `CannedQueue` test fixture builder + 2–3 POC test migrations (Phase 5 to migrate the rest).
+- **T117** Phase 4.5 cross-feature integration tests (5 minimum: real embedding swap, branching read-side filter, lifecycle rollback, search deep-link, bulk significance re-rate).
+- **T118** documentation (this section).
 
-- **`list_branches(chat_id=...)` filter leaks global branches** (`chat_id IS NULL`) into every chat scope. Intentional? Document.
-- **Branch-switch to nonexistent silently leaves zero active branches** — log a warning when this would happen.
+### Phase 5 backlog
 
-#### From T91 review
+New follow-ups discovered during Phase 4.5 reviews and execution, plus carry-over deferrals. None are blocking; pick up at any time.
 
-- **Real embedding model swap**: Phase 4 ships pseudo-embedding (deterministic SHA-256 hash). Phase 4.5+ should swap to a real model (Featherless `bge-small-en-v1.5` if available; or local `sentence-transformers/all-MiniLM-L6-v2`). The 384-dim is hardcoded in `0012_embeddings.sql`; if dim changes, migrate first.
-- **`timeout_s` unused on pseudo path** — fine, but log when non-default model falls through to fallback so misconfigured callers don't silently degrade.
-
-#### From T96 review
-
-- **Duplicate `MAX(id)` lookup** between `_composite_rerank` and the fused-path tail — DRY follow-up.
-- **`fts_rank=None` for vector-only rows** — document downstream contract.
-
-#### From T98 review
-
-- **`event_id <= 0` guard in `delete_turn`** — currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: 400`.
-- **`html.escape()` on `compute_delete_impact` output rendered into the modal** — defense in depth (currently model-controlled strings, but if event payload fields ever appear in descriptions, autoescape needed).
-- **Extract delete-impact modal HTML to a Jinja partial** — testability + autoescape inheritance.
-
-#### From T99 review
-
-- **Hoist `datetime`/`timezone` imports to module level** in `chat/web/snapshots.py`.
-- **`kind` defaulting in restore/preview** — reject missing `kind` rather than silent 404.
-- **`created_at` from file mtime** vs filename-encoded timestamp — small drift if files copied; document.
-
-#### From T100 review
-
-- **Hardcoded `k=50`** — extract to module constant.
-- **N+1 lookups (`get_bot`/`get_chat`/`get_scene` per row)** — fine at `k=50`, revisit if `k` grows.
-- **FTS highlighting via `snippet()`** — Phase 4 skipped this; UX nice-to-have.
-- **Result links chat-level only** — `memories` table has no `event_id` column; deep-linking to specific turn requires schema addition.
-
-#### Deferred items
-
-- **sqlite-vec swap** when host Python supports `enable_load_extension`.
-- **Real embedding model** with proper semantic similarity.
-- **Branching read-side filter**: T89 ships data-model + UI but event readers don't yet consult `is_active`. Each branch is metadata-only labeled ranges. Consult-on-read is Phase 4.5+ work.
-- **Bulk significance re-rate** in drawer (T98.2 deferred — only per-memory edit shipped).
-- **Vector index optimization** (HNSW) — only relevant if memory counts grow past pure-Python feasibility.
-- **`scene-close-on-cancel` UX revisit** (Phase 2.5 carry-over).
-- **Cross-feature canned-queue brittleness fixture builder** (Phase 3 carry-over).
-- **Full lifecycle-rollback in regenerate** — Phase 3.5 T83.4 shipped a warning log; proper rollback needs schema-level back-references (`triggered_by_assistant_turn_id` payload field).
+- **T115 sqlite-vec swap** (environmental blocker): host Python's `sqlite3.Connection` does not expose `enable_load_extension` — `python -c "import sqlite3; sqlite3.connect(':memory:').enable_load_extension(True)"` raises `AttributeError`. Fix requires either a Python rebuild with `--enable-loadable-sqlite-extensions` or migration to `apsw`. Pure-Python cosine remains in production until then.
+- **T108 follow-up: cancel-path commit bug** — `post_turn`'s re-raised `CancelledError` causes `open_db` dependency teardown to skip `conn.commit()`, rolling back all post-cancel writes. The existing T74.3 regression test passes only because `asyncio` isn't imported in the test module (NameError masks the real cancel path). Triage required — either commit before re-raise, or restructure the route to never re-raise after the close-detection branch.
+- **`embeddings` FK CASCADE on `memory_id`** — deferred from T109; do as part of a broader migration consolidation in Phase 5.
+- **`CannedQueue` fixture migration** — T116 shipped the builder + POC migrations; remaining tests still use positional canned arrays. Migrate in Phase 5.
+- **Vector index optimization (HNSW)** — currently scales to a few thousand memories on the flat-index pure-Python cosine path; revisit when counts grow past flat-index feasibility.
+- **Branch-isolated `event_log`** — each branch has its own physical `event_log` range vs the current shared id space + head filter; full branch isolation is Phase 5+.
+- **Embedding model swap migration tooling** — T112 added `--re-embed-all`; a more orchestrated swap (drain old worker, re-seed all memories, swap config) is Phase 5+.
+- **Real-time collaborative branching** (multi-user) — out of scope for v1.
+- **Avatars / portraits** (multimodality) — deferred indefinitely per design §14.
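The T115 pre-flight described in the backlog can be expressed as a small capability probe. The following is a minimal sketch, not the project's actual check: the function names and the two backend labels are hypothetical, and it only relies on the behaviour the notes describe (the method is absent from `sqlite3.Connection` on builds without loadable-extension support).

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """Return True if this Python build can load SQLite extensions.

    On builds compiled without --enable-loadable-sqlite-extensions the
    method is missing from the class entirely, which is why calling it
    raises AttributeError rather than a SQLite error.
    """
    return hasattr(sqlite3.Connection, "enable_load_extension")


def vector_backend() -> str:
    # Hypothetical labels, used here only to illustrate the decision:
    # an extension-backed index when available, the flat cosine path otherwise.
    if supports_loadable_extensions():
        return "sqlite-vec"
    return "pure-python-cosine"


if __name__ == "__main__":
    print(vector_backend())
```

Probing the class attribute avoids opening a connection just to discover that the build cannot load extensions.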
+33
-4
@@ -72,17 +72,40 @@ async def lifespan(app: FastAPI):
     # (free / lower paid tiers cap at 2). Shared across all
     # FeatherlessClient instances in the process.
     from chat.llm.featherless import FeatherlessClient
+    from chat.llm.local_mlx import LocalMLXClient
+    from chat.llm.router import RoutedLLMClient
 
     FeatherlessClient.configure_concurrency(settings.featherless_max_concurrent)
+    LocalMLXClient.configure_concurrency(settings.local_mlx_max_concurrent)
 
-    # Background worker for the async significance pass (T22). Each job
-    # constructs a fresh FeatherlessClient via the factory; tests can
-    # disable enqueue by toggling ``app.state.background_worker.enabled``.
+    # Background workers (significance scoring, embedding indexer)
+    # construct a fresh client per job via the factory. Workers route
+    # through the same RoutedLLMClient as request-time traffic so the
+    # narrative model still goes to Featherless and the classifier +
+    # embeddings hit the local MLX server.
     def _factory():
-        return FeatherlessClient(
+        narrative = FeatherlessClient(
             api_key=settings.featherless_api_key,
             base_url=settings.featherless_base_url,
         )
+        classifier = None
+        if settings.classifier_provider_order:
+            classifier = FeatherlessClient(
+                api_key=settings.featherless_api_key,
+                base_url=settings.featherless_base_url,
+                default_extra_body={
+                    "provider": {
+                        "order": list(settings.classifier_provider_order)
+                    }
+                },
+            )
+        local = LocalMLXClient(base_url=settings.local_mlx_base_url)
+        return RoutedLLMClient(
+            narrative=narrative,
+            classifier=classifier,
+            local=local,
+            narrative_model=settings.narrative_model,
+        )
 
     worker = BackgroundWorker(settings, llm_client_factory=_factory)
     await worker.start()
@@ -94,9 +117,15 @@ async def lifespan(app: FastAPI):
     # Phase 4's pseudo-embedding path is local so the worker doesn't need
     # an LLM client; we still pass one so the Phase 4.5 swap to a real
     # model is a one-line change.
+    # T112 (Phase 4.5): the embedding model is now configurable via
+    # ``Settings.embedding_model``. Default ``"pseudo-sha256-384"``
+    # keeps the local-only path; swapping to a real model routes
+    # through ``client.embed(...)`` and falls back to a zero vector
+    # plus warning if the provider doesn't support embeddings.
     embedding_worker = EmbeddingWorker(
         conn_factory=lambda: open_db(settings.db_path),
         client=_factory(),
+        model=settings.embedding_model,
     )
     await embedding_worker.start()
     app.state.embedding_worker = embedding_worker
+36
-6
@@ -23,13 +23,22 @@ class Settings(BaseModel):
     retrieval_k: int = 4
     narrative_budget_hard: int = 8000
     narrative_budget_soft: int = 6000
-    # Cap on each generated bot response. ~400 tokens ≈ 1–2 short paragraphs.
-    # Bump if you want longer scenes; drop to 200 for terse banter.
-    narrative_max_tokens: int = 400
+    # Cap on each generated bot response. The asterisk-action format
+    # (see ``_closing_instruction`` in chat/services/prompt.py) targets
+    # 2-3 short interleaved action+dialogue beats. Verbose roleplay
+    # narrators (Cydonia, Magnum) ignore the prompt's cap and keep
+    # going; ``trim_to_max_beats`` in chat/services/prompt.py handles
+    # the actual cap by trimming at a beat boundary post-stream. This
+    # max_tokens setting just gives the third beat enough room to
+    # complete naturally before max_tokens cuts mid-action: 160 fits
+    # 3 substantive beats with margin. Bump to 250 for longer scenes;
+    # drop to 80 for terse banter.
+    narrative_max_tokens: int = 160
     # Sampling temperature for narrative generation. 0.7 = grounded /
-    # consistent; 0.85 = creative-but-in-character (default); 1.0 = wide
-    # variety, can drift; >1.0 = often off-the-rails.
-    narrative_temperature: float = 0.85
+    # instruction-compliant (current — Cydonia is verbose-by-default and
+    # tighter temperature helps it respect the 2-3-beat cap);
+    # 0.85 = creative; 1.0 = wide variety; >1.0 = often off-the-rails.
+    narrative_temperature: float = 0.7
     classifier_budget_hard: int = 4000
     classifier_timeout_s: float = 30.0
     # Featherless free tier and lower paid tiers cap concurrent connections.
@@ -39,6 +48,27 @@ class Settings(BaseModel):
     data_dir: Path = REPO_ROOT / "data"
     bind_host: str = "127.0.0.1"
     bind_port: int = 8000
+    # Local MLX server (e.g. ``mlx-omni-server``) — serves any model
+    # whose id starts with one of ``local_prefixes`` (default
+    # ``"mlx-community/"``). The :class:`RoutedLLMClient` inspects the
+    # ``model`` kwarg at call time: local-prefix -> local, else -> remote.
+    # ``embed()`` always routes local.
+    local_mlx_base_url: str = "http://127.0.0.1:10240/v1"
+    local_mlx_max_concurrent: int = 1
+    # Optional OpenRouter-style provider pinning for the classifier
+    # client. Maps to the ``provider`` field on chat.completions.create
+    # via ``extra_body``; the FeatherlessClient (which is just an
+    # AsyncOpenAI wrapper) merges it into every call. Useful for forcing
+    # Llama-3.1-8B classifier traffic onto Cerebras (~423 tok/s, 10x
+    # the default Nebius). Empty list = no pin (provider is
+    # OpenRouter's choice).
+    classifier_provider_order: list[str] = Field(default_factory=list)
+    # T112 (Phase 4.5): embedding model identifier. Default is the
+    # deterministic local pseudo so fresh installs / tests don't need
+    # any external infra. Override via config.toml to a real model id
+    # (e.g. ``"mlx-community/bge-small-en-v1.5-bf16"``) once a local
+    # MLX server is running.
+    embedding_model: str = "pseudo-sha256-384"
 
 def load_settings() -> Settings:
     config_path = Path(os.environ.get("CHAT_CONFIG_PATH", DEFAULT_CONFIG))
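The routing rule stated in the settings comment (local-prefix model ids go to the local server, everything else goes remote, and `embed()` always goes local) can be illustrated on its own. This is a sketch under assumptions: `PrefixRoute` and its method names are hypothetical and stand in for the decision only; the real `RoutedLLMClient` is not shown in this diff and may differ, for instance in how it treats the separate classifier client.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PrefixRoute:
    """Hypothetical stand-in for the routing decision, not the real router."""

    local_prefixes: Sequence[str] = ("mlx-community/",)

    def target_for_generate(self, model: str) -> str:
        # str.startswith accepts a tuple, so one call covers every prefix.
        if model.startswith(tuple(self.local_prefixes)):
            return "local"
        return "remote"

    def target_for_embed(self, model: str) -> str:
        # Per the settings comment, embeddings never leave the machine,
        # regardless of the model id.
        return "local"


if __name__ == "__main__":
    route = PrefixRoute()
    print(route.target_for_generate("mlx-community/bge-small-en-v1.5-bf16"))
    print(route.target_for_generate("some-remote/narrative-model"))
```

Keying the decision on the `model` argument at call time means callers do not need to know which backend serves a given model; changing a model id in configuration is enough to move traffic.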
+14
-1
@@ -7,7 +7,20 @@ from pathlib import Path
 @contextmanager
 def open_db(path: Path, *, check_same_thread: bool = True):
     path.parent.mkdir(parents=True, exist_ok=True)
-    conn = sqlite3.connect(path, check_same_thread=check_same_thread)
+    # ``timeout`` here sets SQLite's busy_timeout, in seconds: how long
+    # ``conn.execute`` blocks when another connection holds the WAL
+    # write lock. The Python default is 5.0, which is fatal for the
+    # async chat app: ``conn.execute``'s busy-wait does NOT release the
+    # GIL, so a contending background worker (e.g. the embedding worker
+    # writing ``embedding_indexed`` while the request handler holds an
+    # open transaction) freezes the whole asyncio event loop for up to
+    # 5 seconds — silently turning every concurrent LLM call into a 5s
+    # wall-clock hit. 0.1s lets contending writers fail fast; callers
+    # that need durability should retry, and the embedding worker
+    # already logs failures so a missed embedding can be backfilled.
+    conn = sqlite3.connect(
+        path, check_same_thread=check_same_thread, timeout=0.1
+    )
     conn.execute("PRAGMA journal_mode=WAL")
     conn.execute("PRAGMA foreign_keys=ON")
     try:
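The fail-fast behaviour that the short `timeout` buys can be reproduced in isolation. The sketch below is illustrative only: the table, file name, and function name are made up, and it uses autocommit connections so the lock is taken explicitly with `BEGIN IMMEDIATE` rather than through the project's `open_db` helper.

```python
import os
import sqlite3
import tempfile


def write_under_contention(timeout_s: float = 0.1) -> str:
    """Hold the write lock on one connection and try to write from another.

    Returns "ok" if the second write succeeded, otherwise the error text.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # isolation_level=None puts the connection in autocommit mode so the
        # transaction boundaries below are exactly the ones written out.
        holder = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
        holder.execute("PRAGMA journal_mode=WAL")
        holder.execute("CREATE TABLE t (x INTEGER)")
        holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
        holder.execute("INSERT INTO t VALUES (1)")

        contender = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
        try:
            # Blocks for at most timeout_s, then raises instead of waiting.
            contender.execute("INSERT INTO t VALUES (2)")
            return "ok"
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            contender.close()
            holder.execute("ROLLBACK")
            holder.close()


if __name__ == "__main__":
    print(write_under_contention())
```

With a 0.1 s budget the contending writer gives up almost immediately with `database is locked`, which is the condition a retrying caller would then need to handle.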
@@ -0,0 +1,25 @@
+-- 0014_phase45_schema.sql — Phase 4.5 Wave 2 schema bump (T109).
+--
+-- Two schema concerns are bundled into this migration:
+--
+-- 1. ``embeddings.memory_id`` FK should ideally carry ``ON DELETE CASCADE``
+-- (T88 review nit). DEFERRED to Phase 5: ``embeddings`` rows are only ever
+-- deleted when the parent ``memories`` row is deleted, and ``memories``
+-- rows are never deleted today (memory hide is a soft flag; the surgical
+-- ``deindex_event`` path operates on ``event_log`` and does NOT cascade
+-- to projection rows). The CASCADE constraint therefore can't fire under
+-- current usage — adding the SQLite table-rebuild dance (rename, recreate,
+-- copy, drop, reindex) for a defensive constraint is unwarranted bloat
+-- in a polish wave. Revisit during the broader Phase 5 migration cleanup
+-- when other table reshapes make the rebuild worthwhile.
+--
+-- 2. Add ``memories.event_id`` (NULLABLE INTEGER, references ``event_log.id``)
+-- so cross-chat search results can deep-link back to the originating
+-- turn (foundation for T111). The column is nullable so historical
+-- memory rows projected before 0014 ran continue to round-trip cleanly;
+-- new rows are populated by the ``memory_written`` projector handler
+-- from the projecting event's id. This is a pure additive change — no
+-- backfill is performed. Older rows simply read NULL until/unless a
+-- later migration backfills them; T111 surfaces are coded to accept
+-- NULL gracefully (no deep-link rendered).
+ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id);
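The "pure additive change" claim in the migration comment can be checked against an in-memory database. This is a sketch with deliberately simplified, hypothetical table definitions (the real `memories` and `event_log` schemas are not shown in this diff); only the `ALTER TABLE` statement is taken from the migration.

```python
import sqlite3


def demo_additive_column() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys=ON")

    # Simplified stand-ins for the real tables (assumed shapes).
    conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT)")
    conn.execute("INSERT INTO memories (text) VALUES ('projected before 0014')")

    # The statement from the migration. SQLite accepts a REFERENCES column
    # in ADD COLUMN with foreign keys enabled because its default is NULL.
    conn.execute(
        "ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id)"
    )

    # A row written after the migration carries the projecting event's id.
    conn.execute("INSERT INTO event_log (id) VALUES (7)")
    conn.execute(
        "INSERT INTO memories (text, event_id) VALUES ('projected after 0014', 7)"
    )

    rows = conn.execute("SELECT text, event_id FROM memories ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for text, event_id in demo_additive_column():
        # Pre-migration rows read NULL, so a caller would skip the deep-link.
        print(text, "->", "no deep-link" if event_id is None else f"event {event_id}")
```

The pre-existing row reads `NULL` for `event_id` without any backfill, which matches the behaviour the comment says T111 has to tolerate.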
+53
-2
@@ -1,8 +1,13 @@
 from __future__ import annotations
+import asyncio
 import json
+import logging
 from dataclasses import dataclass
-from typing import Any, Iterator
-from sqlite3 import Connection
+from typing import Any, Callable, ContextManager, Iterator
+from sqlite3 import Connection, OperationalError
 
 
+_log = logging.getLogger(__name__)
+
+
 @dataclass
@@ -63,6 +68,52 @@ def append_and_apply(
     return eid
 
 
+async def append_and_apply_with_retry(
+    conn_factory: Callable[[], ContextManager[Connection]],
+    *,
+    kind: str,
+    payload: dict[str, Any],
+    branch_id: int = 1,
+    attempts: int = 30,
+    base_sleep_s: float = 0.05,
+    max_sleep_s: float = 0.5,
+) -> int | None:
+    """Append-and-apply that retries on ``database is locked``.
+
+    Background workers (embedding indexer, significance scorer) write
+    events to the same SQLite file as the request handler. The chat
+    app sets a tight ``busy_timeout=100ms`` on every connection so a
+    contending worker can't freeze the request's asyncio event loop.
+    This helper restores durability for workers: it retries up to
+    ``attempts`` times with exponential backoff (capped at
+    ``max_sleep_s``) until the lock clears.
+
+    Returns the appended event's id, or ``None`` if all retries failed
+    (logged at WARNING). Each retry opens a fresh connection via
+    ``conn_factory`` because the failed write may have left the prior
+    connection in an unusable state.
+    """
+    sleep = base_sleep_s
+    for attempt in range(attempts):
+        try:
+            with conn_factory() as conn:
+                return append_and_apply(
+                    conn, kind=kind, payload=payload, branch_id=branch_id
+                )
+        except OperationalError as exc:
+            if "database is locked" not in str(exc).lower():
+                raise
+            if attempt == attempts - 1:
+                _log.warning(
+                    "append_and_apply_with_retry: gave up after %d attempts "
+                    "(kind=%s): %s",
+                    attempts, kind, exc,
+                )
+                return None
+            await asyncio.sleep(sleep)
+            sleep = min(sleep * 2, max_sleep_s)
+
+
 def read_events(conn: Connection, branch_id: int = 1, after_id: int = 0) -> Iterator[Event]:
     cur = conn.execute(
         "SELECT id, branch_id, ts, kind, payload_json, superseded_by, hidden "
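To get a feel for how long the retry helper can wait with its default arguments, the backoff arithmetic can be replayed without touching a database. This is a sketch: `backoff_schedule` is a hypothetical helper that mirrors the loop's sleep logic only, assuming every attempt hits the lock.

```python
def backoff_schedule(
    attempts: int = 30,
    base_sleep_s: float = 0.05,
    max_sleep_s: float = 0.5,
) -> list:
    """Sleeps taken between attempts when every attempt fails."""
    sleeps = []
    sleep = base_sleep_s
    for attempt in range(attempts):
        if attempt == attempts - 1:
            break  # the final failure returns None without sleeping
        sleeps.append(sleep)
        sleep = min(sleep * 2, max_sleep_s)  # double, then clamp to the cap
    return sleeps


if __name__ == "__main__":
    schedule = backoff_schedule()
    print(schedule[:6])              # doubling phase, then the cap
    print(len(schedule), sum(schedule))
```

With the defaults the sleep doubles from 0.05 s until it reaches the 0.5 s cap on the fifth retry, giving 29 sleeps that total about 13.25 s. Time spent blocked inside each failed write (bounded by the connection's busy timeout) comes on top of that.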
+30
-2
@@ -1,11 +1,13 @@
 from __future__ import annotations
 import json
 import asyncio
+import logging
 from typing import TypeVar
 from pydantic import BaseModel, ValidationError
 from .client import LLMClient, Message
 
 T = TypeVar("T", bound=BaseModel)
+_log = logging.getLogger(__name__)
 
 REFUSAL_PATTERNS = ("i can't", "i cannot", "i'm sorry, but", "as an ai")
 
@@ -31,6 +33,7 @@ async def classify(
     schema: type[T],
     default: T | None = None,
     timeout_s: float = 10.0,
+    max_tokens: int = 512,
 ) -> T:
     schema_json = json.dumps(schema.model_json_schema(), indent=2)
     schema_block = (
@@ -41,22 +44,47 @@ async def classify(
         Message(role="system", content=system + schema_block),
         Message(role="user", content=user),
     ]
+    # Cap output length so a misbehaving model (e.g. one that ignores
+    # ``response_format=json_object`` and generates prose) can't burn
+    # several seconds on tokens we'll never use. Classifier responses
+    # are small JSON objects — 512 tokens is generous; usual completions
+    # are 50-150.
+    last_text = None
+    last_error: BaseException | None = None
     for attempt in range(3):
         try:
             text = await asyncio.wait_for(
-                client.generate(msgs, model=model, response_format={"type": "json_object"}),
+                client.generate(
+                    msgs,
+                    model=model,
+                    response_format={"type": "json_object"},
+                    max_tokens=max_tokens,
+                ),
                 timeout=timeout_s,
             )
+            last_text = text
             cleaned = _strip_json_fences(text)
             if any(p in cleaned.lower()[:80] for p in REFUSAL_PATTERNS) and not cleaned.lstrip().startswith("{"):
                 raise ValueError("refusal-shaped response")
             return schema.model_validate_json(cleaned)
-        except (ValidationError, ValueError, json.JSONDecodeError, asyncio.TimeoutError):
+        except (ValidationError, ValueError, json.JSONDecodeError, asyncio.TimeoutError) as exc:
+            last_error = exc
             msgs[0] = Message(
                 role="system",
                 content=system + schema_block + "\n\nRespond with valid JSON ONLY. No prose, no markdown fences.",
             )
             continue
+    # Log when we're falling back so flapping classifiers are
+    # diagnosable without taking down the request.
+    snippet = (last_text or "")[:200].replace("\n", " ")
+    _log.warning(
+        "classify(%s) exhausted 3 attempts; last_error=%s last_text=%r; "
+        "falling back to %s",
+        schema.__name__,
+        type(last_error).__name__ if last_error else "?",
+        snippet,
+        "default" if default is not None else "RuntimeError (no default)",
+    )
     if default is None:
         raise RuntimeError(f"classify failed for schema {schema.__name__} with no default")
     return default
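The refusal check inside `classify` is compact enough to be easy to misread, so here it is pulled out as a standalone predicate. This is a sketch: `looks_like_refusal` is a hypothetical name, while the pattern tuple and the condition are copied from the diff.

```python
REFUSAL_PATTERNS = ("i can't", "i cannot", "i'm sorry, but", "as an ai")


def looks_like_refusal(cleaned: str) -> bool:
    """True when the first 80 characters read like a refusal and not JSON."""
    head = cleaned.lower()[:80]
    mentions_refusal = any(p in head for p in REFUSAL_PATTERNS)
    # A JSON object that merely contains a refusal phrase in one of its
    # values is still a usable answer, so it is not treated as a refusal.
    starts_like_json = cleaned.lstrip().startswith("{")
    return mentions_refusal and not starts_like_json


if __name__ == "__main__":
    print(looks_like_refusal("I'm sorry, but I can't help with that."))
    print(looks_like_refusal('{"reason": "I cannot tell from the text"}'))
```

Two properties follow from the condition as written: only the first 80 characters are inspected, and the patterns use a straight apostrophe, so a reply written with a typographic apostrophe would only match through the `i cannot` or `as an ai` patterns.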
@@ -12,3 +12,11 @@ class Message:
 class LLMClient(Protocol):
     async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str: ...
     def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]: ...
+    # T112 (Phase 4.5): real-embedding seam. Implementations either call a
+    # provider's ``/v1/embeddings`` endpoint or, when the provider doesn't
+    # expose embeddings (e.g. Featherless today), raise ``NotImplementedError``
+    # so ``generate_embedding`` can catch it and degrade to the zero-vector
+    # fallback. The Protocol is structural, so this method only needs to
+    # exist on implementations; existing callers that don't use it are
+    # unaffected.
+    async def embed(self, text: str, *, model: str) -> list[float]: ...
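The degrade-to-zero-vector contract described in the Protocol comment can be sketched end to end with two toy clients. Everything here except the `embed` signature is an assumption made for illustration: `Embedder`, `NoEmbeddingsClient`, `CannedClient`, and `embed_or_zero` are hypothetical names, and the real `generate_embedding` is not shown in this diff. The 384 default follows the dimension mentioned elsewhere in the notes.

```python
import asyncio
import logging
from typing import Protocol

_log = logging.getLogger(__name__)


class Embedder(Protocol):
    async def embed(self, text: str, *, model: str) -> list: ...


class NoEmbeddingsClient:
    """Stand-in for a provider that has no embeddings endpoint."""

    async def embed(self, text: str, *, model: str) -> list:
        raise NotImplementedError("provider does not expose embeddings")


class CannedClient:
    """Stand-in for a provider that returns a fixed vector."""

    async def embed(self, text: str, *, model: str) -> list:
        return [0.5] * 384


async def embed_or_zero(client: Embedder, text: str, *, model: str, dim: int = 384) -> list:
    try:
        return await client.embed(text, model=model)
    except NotImplementedError:
        # Degrade rather than fail: log it so the gap is visible and the
        # affected rows can be re-embedded later.
        _log.warning("embed() unsupported for model %s; using zero vector", model)
        return [0.0] * dim


if __name__ == "__main__":
    print(asyncio.run(embed_or_zero(NoEmbeddingsClient(), "hello", model="demo"))[:3])
    print(asyncio.run(embed_or_zero(CannedClient(), "hello", model="demo"))[:3])
```

Because the Protocol is structural, neither toy client inherits from `Embedder`; having a matching `embed` method is enough for a type checker to accept them.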
+73
-1
@@ -29,19 +29,60 @@ class FeatherlessClient:
             cls._semaphore = asyncio.Semaphore(2)
         return cls._semaphore
 
-    def __init__(self, api_key: str, base_url: str = "https://api.featherless.ai/v1"):
+    def __init__(
+        self,
+        api_key: str,
+        base_url: str = "https://api.featherless.ai/v1",
+        *,
+        default_extra_body: dict | None = None,
+    ):
         self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)
+        # ``default_extra_body`` is merged into every chat.completions.create
+        # call's ``extra_body``. Useful with OpenRouter to pin specific
+        # upstream providers (e.g. ``{"provider": {"order": ["Cerebras"]}}``
+        # for 10x throughput on Llama-3.1-8B). Featherless ignores the
+        # field, so it's safe to leave set even when ``base_url`` points
+        # back at Featherless.
+        self._default_extra_body = default_extra_body or {}
+
+    def _merge_extra_body(self, params: dict) -> dict:
+        if not self._default_extra_body:
+            return params
+        eb = dict(self._default_extra_body)
+        eb.update(params.pop("extra_body", {}) or {})
+        params["extra_body"] = eb
+        return params
 
     async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
+        params = self._merge_extra_body(dict(params))
         async with self._sem():
             resp = await self._client.chat.completions.create(
                 model=model,
                 messages=[{"role": m.role, "content": m.content} for m in messages],
                 **params,
             )
+            # Diagnostic: stash provider+usage on a side-channel for the
+            # router timing log to pick up. OpenRouter sticks a 'provider'
+            # field on the response (not part of the OAI spec, but the
||||||
|
# SDK passes it through on its model dict).
|
||||||
|
try: # pragma: no cover — diagnostic only
|
||||||
|
import os as _os
|
||||||
|
if _os.environ.get("CHAT_LLM_TIMING") == "1":
|
||||||
|
prov = getattr(resp, "provider", None)
|
||||||
|
usage = getattr(resp, "usage", None)
|
||||||
|
ct = getattr(usage, "completion_tokens", "?") if usage else "?"
|
||||||
|
pt = getattr(usage, "prompt_tokens", "?") if usage else "?"
|
||||||
|
import logging as _logging
|
||||||
|
_logging.getLogger("chat.llm.router").info(
|
||||||
|
" ↪ provider=%s prompt_toks=%s completion_toks=%s",
|
||||||
|
prov, pt, ct,
|
||||||
|
)
|
||||||
|
except Exception: # pragma: no cover
|
||||||
|
pass
|
||||||
return resp.choices[0].message.content or ""
|
return resp.choices[0].message.content or ""
|
||||||
|
|
||||||
async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
|
async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
|
||||||
|
params = self._merge_extra_body(dict(params))
|
||||||
async with self._sem():
|
async with self._sem():
|
||||||
stream = await self._client.chat.completions.create(
|
stream = await self._client.chat.completions.create(
|
||||||
model=model,
|
model=model,
|
||||||
@@ -53,3 +94,34 @@ class FeatherlessClient:
|
|||||||
delta = chunk.choices[0].delta.content or ""
|
delta = chunk.choices[0].delta.content or ""
|
||||||
if delta:
|
if delta:
|
||||||
yield delta
|
yield delta
|
||||||
|
|
||||||
|
async def embed(self, text: str, *, model: str) -> list[float]:
|
||||||
|
"""Embeddings via Featherless — unsupported in practice.
|
||||||
|
|
||||||
|
T112 (Phase 4.5) extends the LLMClient Protocol with ``embed()``
|
||||||
|
for a future real-embedding swap. Featherless's OpenAI-compatible
|
||||||
|
surface routes ``/v1/embeddings`` (no 404), but every request
|
||||||
|
returns HTTP 500 ``{"error": {"type": "completions_error", ...}}``
|
||||||
|
— including standard names like ``text-embedding-3-small`` and
|
||||||
|
``BAAI/bge-small-en-v1.5``. ``/v1/models`` confirms it: the
|
||||||
|
catalog has no embedding-class entries, only chat/completion
|
||||||
|
classes (``llama3-*``, ``gemma3-*``, ``glm5-*``, etc.).
|
||||||
|
|
||||||
|
Rather than ship a request that always 500s, this implementation
|
||||||
|
raises ``NotImplementedError``. The
|
||||||
|
:func:`chat.services.embeddings.generate_embedding` wrapper
|
||||||
|
catches it and degrades to the existing zero-vector fallback
|
||||||
|
(with the T107 warning), so misconfigured callers fail loudly in
|
||||||
|
logs but the request path keeps working.
|
||||||
|
|
||||||
|
For real embeddings, configure a different provider (OpenAI
|
||||||
|
direct, Cohere, Voyage, Together, self-hosted Ollama /
|
||||||
|
sentence-transformers). The Mock + routing seam from T112 keeps
|
||||||
|
the swap to a one-class change in ``chat/llm/``.
|
||||||
|
"""
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Featherless /v1/embeddings always returns 500 "
|
||||||
|
'("completions_error") and the model catalog has no '
|
||||||
|
"embedding class; configure a different embedding provider "
|
||||||
|
"or stick with the default pseudo-sha256-384 model."
|
||||||
|
)
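The ``embed`` docstring above describes a contract between the client and its caller: the client raises ``NotImplementedError`` and the wrapper degrades to a zero vector. A minimal sketch of that contract, under stated assumptions: ``UnsupportedClient`` and ``embed_or_fallback`` are hypothetical stand-ins rather than the project's ``FeatherlessClient`` or ``generate_embedding``, and the default of 384 simply mirrors the dimension in the ``pseudo-sha256-384`` model name.

```python
import asyncio
import logging

_log = logging.getLogger("embedding_fallback_sketch")


class UnsupportedClient:
    """Hypothetical client whose provider has no embeddings endpoint."""

    async def embed(self, text: str, *, model: str) -> list[float]:
        raise NotImplementedError("provider exposes no embedding models")


async def embed_or_fallback(client, text: str, *, model: str, dim: int = 384):
    """Return (vector, model_label); degrade to a zero vector on any failure."""
    try:
        vector = await client.embed(text, model=model)
        return list(vector), model
    except Exception as exc:  # broad on purpose: the request path must keep working
        _log.warning(
            "embed() failed for %r (%s); using zero vector",
            model,
            type(exc).__name__,
        )
        return [0.0] * dim, "fallback"


vector, label = asyncio.run(
    embed_or_fallback(UnsupportedClient(), "hello", model="some-embedding-model")
)
```

Labelling the result ``"fallback"`` instead of the requested model keeps zero vectors distinguishable from real embeddings later on.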
|
||||||
|
|||||||
@@ -0,0 +1,95 @@
|
|||||||
|
"""Local MLX OpenAI-compatible client.
|
||||||
|
|
||||||
|
Talks to a locally-running MLX server (e.g., ``mlx-omni-server``) over
|
||||||
|
the same OpenAI surface that :class:`chat.llm.featherless.FeatherlessClient`
|
||||||
|
uses, via :class:`openai.AsyncOpenAI`. The underlying server runs MLX
|
||||||
|
models on Apple Silicon (M-series) for chat completions AND embeddings.
|
||||||
|
|
||||||
|
Use cases (Phase 4.5+):
|
||||||
|
- Classifier traffic moved off Featherless to local MLX (cost + latency).
|
||||||
|
- Embeddings via ``client.embed`` actually work — Featherless's
|
||||||
|
``/v1/embeddings`` always returns 500.
|
||||||
|
|
||||||
|
Constructor takes a ``base_url`` (e.g., ``"http://127.0.0.1:10240/v1"``)
|
||||||
|
and an optional ``api_key`` (most local MLX servers don't authenticate;
|
||||||
|
the OpenAI SDK requires *some* string, so we default to a placeholder).
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
import asyncio
|
||||||
|
from typing import AsyncIterator, Sequence
|
||||||
|
|
||||||
|
from openai import AsyncOpenAI
|
||||||
|
|
||||||
|
from .client import Message
|
||||||
|
|
||||||
|
|
||||||
|
class LocalMLXClient:
|
||||||
|
"""OpenAI-compatible client for a local MLX server.
|
||||||
|
|
||||||
|
The server is single-process by default (``mlx-omni-server`` loads
|
||||||
|
one model at a time and swaps on demand). The class-level semaphore
caps in-flight requests at ``max_concurrent`` (default 1); extra
callers wait on the semaphore, since MLX inference on a single
M-series device is sequential anyway.
|
||||||
|
"""
|
||||||
|
|
||||||
|
_semaphore: asyncio.Semaphore | None = None
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def configure_concurrency(cls, max_concurrent: int) -> None:
|
||||||
|
cls._semaphore = asyncio.Semaphore(max(1, int(max_concurrent)))
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def _sem(cls) -> asyncio.Semaphore:
|
||||||
|
if cls._semaphore is None:
|
||||||
|
cls._semaphore = asyncio.Semaphore(1)
|
||||||
|
return cls._semaphore
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
base_url: str = "http://127.0.0.1:10240/v1",
|
||||||
|
api_key: str = "not-needed",
|
||||||
|
):
|
||||||
|
self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)
|
||||||
|
|
||||||
|
async def generate(
|
||||||
|
self, messages: Sequence[Message], *, model: str, **params
|
||||||
|
) -> str:
|
||||||
|
async with self._sem():
|
||||||
|
resp = await self._client.chat.completions.create(
|
||||||
|
model=model,
|
||||||
|
messages=[{"role": m.role, "content": m.content} for m in messages],
|
||||||
|
**params,
|
||||||
|
)
|
||||||
|
return resp.choices[0].message.content or ""
|
||||||
|
|
||||||
|
async def stream(
|
||||||
|
self, messages: Sequence[Message], *, model: str, **params
|
||||||
|
) -> AsyncIterator[str]:
|
||||||
|
async with self._sem():
|
||||||
|
stream = await self._client.chat.completions.create(
|
||||||
|
model=model,
|
||||||
|
messages=[{"role": m.role, "content": m.content} for m in messages],
|
||||||
|
stream=True,
|
||||||
|
**params,
|
||||||
|
)
|
||||||
|
async for chunk in stream:
|
||||||
|
delta = chunk.choices[0].delta.content or ""
|
||||||
|
if delta:
|
||||||
|
yield delta
|
||||||
|
|
||||||
|
async def embed(self, text: str, *, model: str) -> list[float]:
|
||||||
|
"""Return an embedding vector for ``text`` using the named model.
|
||||||
|
|
||||||
|
Targets ``/v1/embeddings`` on the local MLX server; the server
|
||||||
|
loads the model on first request and caches it. The embedding
|
||||||
|
model is independent of the chat model loaded for ``generate``
|
||||||
|
/ ``stream`` (the server can serve both).
|
||||||
|
"""
|
||||||
|
async with self._sem():
|
||||||
|
resp = await self._client.embeddings.create(
|
||||||
|
model=model,
|
||||||
|
input=text,
|
||||||
|
)
|
||||||
|
return list(resp.data[0].embedding)
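The concurrency cap used by ``LocalMLXClient`` can be illustrated in isolation. The sketch below is a stand-in under stated assumptions: ``CappedClient`` and its sleep-based fake call are hypothetical, not project code. It shows a lazily created class-level ``asyncio.Semaphore`` keeping at most one call in flight even when several instances are gathered at once.

```python
from __future__ import annotations

import asyncio


class CappedClient:
    """Hypothetical stand-in: one semaphore shared by every instance."""

    _semaphore: asyncio.Semaphore | None = None
    in_flight = 0
    peak_in_flight = 0

    @classmethod
    def _sem(cls) -> asyncio.Semaphore:
        # Created lazily on first use, mirroring the pattern in the diff.
        if cls._semaphore is None:
            cls._semaphore = asyncio.Semaphore(1)
        return cls._semaphore

    async def call(self, payload: str) -> str:
        async with self._sem():
            cls = type(self)
            cls.in_flight += 1
            cls.peak_in_flight = max(cls.peak_in_flight, cls.in_flight)
            await asyncio.sleep(0.01)  # stands in for the HTTP round trip
            cls.in_flight -= 1
            return payload.upper()


async def main() -> list[str]:
    clients = [CappedClient() for _ in range(3)]
    return await asyncio.gather(
        *(client.call(f"req-{i}") for i, client in enumerate(clients))
    )


results = asyncio.run(main())
```

Because the semaphore lives on the class, separate instances pointing at the same server share one cap, which appears to be the intent when a single local process serves every request.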
|
||||||
+21
-1
@@ -4,8 +4,23 @@ from .client import Message
|
|||||||
|
|
||||||
|
|
||||||
class MockLLMClient:
|
class MockLLMClient:
|
||||||
def __init__(self, canned: list[str]):
|
"""In-memory LLMClient for tests.
|
||||||
|
|
||||||
|
``canned`` feeds ``generate``/``stream`` (one entry per call, popped
|
||||||
|
from the front). ``canned_embeddings`` (T112, Phase 4.5) feeds
|
||||||
|
``embed`` the same way — each call pops the next vector. An empty
|
||||||
|
queue raises ``IndexError`` so misconfigured tests fail loudly
|
||||||
|
rather than returning ``None`` or hanging.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
canned: list[str],
|
||||||
|
*,
|
||||||
|
canned_embeddings: list[list[float]] | None = None,
|
||||||
|
):
|
||||||
self._canned = list(canned)
|
self._canned = list(canned)
|
||||||
|
self._canned_embeddings: list[list[float]] = list(canned_embeddings or [])
|
||||||
|
|
||||||
async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
|
async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
|
||||||
return self._canned.pop(0)
|
return self._canned.pop(0)
|
||||||
@@ -14,3 +29,8 @@ class MockLLMClient:
|
|||||||
text = self._canned.pop(0)
|
text = self._canned.pop(0)
|
||||||
for ch in text:
|
for ch in text:
|
||||||
yield ch
|
yield ch
|
||||||
|
|
||||||
|
async def embed(self, text: str, *, model: str) -> list[float]:
|
||||||
|
# Mirrors the canned-queue pattern; empty queue raises so
|
||||||
|
# misconfigured tests surface clearly instead of returning None.
|
||||||
|
return self._canned_embeddings.pop(0)
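How the canned-embedding queue behaves in a test can be sketched as follows. ``QueueMock`` is a trimmed-down, hypothetical copy of the idea, not the project's ``MockLLMClient``; the vectors and the model name ``"m"`` are arbitrary.

```python
from __future__ import annotations

import asyncio


class QueueMock:
    """Hypothetical mock: each embed() call pops the next canned vector."""

    def __init__(self, canned_embeddings: list[list[float]] | None = None):
        self._canned_embeddings = list(canned_embeddings or [])

    async def embed(self, text: str, *, model: str) -> list[float]:
        # list.pop(0) on an empty list raises IndexError, which is the
        # "fail loudly" behaviour the docstring describes.
        return self._canned_embeddings.pop(0)


async def demo() -> tuple[list[float], list[float], str]:
    mock = QueueMock(canned_embeddings=[[0.1, 0.2], [0.3, 0.4]])
    first = await mock.embed("a", model="m")
    second = await mock.embed("b", model="m")
    try:
        await mock.embed("c", model="m")
        outcome = "no error"
    except IndexError:
        outcome = "IndexError"
    return first, second, outcome


first, second, outcome = asyncio.run(demo())
```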
|
||||||
|
|||||||
@@ -0,0 +1,149 @@
|
|||||||
|
"""Routed LLM client — splits traffic across multiple backends by model.
|
||||||
|
|
||||||
|
Phase 4.5+ deployment: the 24B narrative model stays on Featherless,
|
||||||
|
the 8B classifier model moves to local MLX, and embeddings run on a
|
||||||
|
local BGE/MLX model. One :class:`LLMClient` interface, two underlying
|
||||||
|
backends, dispatched by the ``model`` argument at every call site.
|
||||||
|
|
||||||
|
Routing rule: a ``model`` id that starts with one of ``local_prefixes``
goes to the local backend; the configured ``narrative_model`` goes to
the narrative backend; anything else (most importantly the classifier
model) goes to the ``classifier`` backend when one is configured,
otherwise to the narrative backend. ``embed`` always runs on the local
backend.
|
||||||
|
|
||||||
|
Set the env var ``CHAT_LLM_TIMING=1`` to log per-call timing at INFO
|
||||||
|
level. Useful for finding the slow link in a turn.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
from typing import AsyncIterator, Sequence
|
||||||
|
|
||||||
|
from .client import LLMClient, Message
|
||||||
|
|
||||||
|
|
||||||
|
_log = logging.getLogger(__name__)
|
||||||
|
_TIMING = os.environ.get("CHAT_LLM_TIMING") == "1"
|
||||||
|
if _TIMING and not _log.handlers:
|
||||||
|
# Wire a stderr handler when timing is enabled so the per-call
|
||||||
|
# logs show up under uvicorn (which doesn't configure non-uvicorn
|
||||||
|
# loggers by default).
|
||||||
|
_h = logging.StreamHandler()
|
||||||
|
_h.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
|
||||||
|
_log.addHandler(_h)
|
||||||
|
_log.setLevel(logging.INFO)
|
||||||
|
_log.propagate = False
|
||||||
|
|
||||||
|
|
||||||
|
class RoutedLLMClient:
|
||||||
|
"""Delegates to one of two underlying clients based on ``model``.
|
||||||
|
|
||||||
|
Routing rule: any model id starting with one of ``local_prefixes``
|
||||||
|
goes to the local backend (e.g. ``"mlx-community/"`` for models
|
||||||
|
served by ``mlx-omni-server``). Everything else — narrative model,
|
||||||
|
remote classifiers, anything on a hosted provider — routes to the
|
||||||
|
remote backend.
|
||||||
|
|
||||||
|
``embed`` always routes locally (the remote provider doesn't
|
||||||
|
expose a working ``/v1/embeddings``; see
|
||||||
|
:class:`chat.llm.featherless.FeatherlessClient.embed`).
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
narrative: LLMClient,
|
||||||
|
local: LLMClient,
|
||||||
|
narrative_model: str,
|
||||||
|
classifier: LLMClient | None = None,
|
||||||
|
local_prefixes: tuple[str, ...] = ("mlx-community/",),
|
||||||
|
) -> None:
|
||||||
|
# ``classifier`` is an optional separate backend for the
|
||||||
|
# classifier model. Useful when classifier and narrative both
|
||||||
|
# live on a remote OpenRouter-style provider but need different
|
||||||
|
# provider-pinning (e.g. Cerebras for the 8B classifier,
|
||||||
|
# default Friendli/etc. for the narrative). When ``classifier``
|
||||||
|
# is None, classifier traffic falls through to ``narrative``
|
||||||
|
# (the remote client) so old wiring keeps working.
|
||||||
|
self._narrative = narrative
|
||||||
|
self._classifier = classifier
|
||||||
|
self._local = local
|
||||||
|
self._narrative_model = narrative_model
|
||||||
|
self._local_prefixes = local_prefixes
|
||||||
|
|
||||||
|
def _pick(self, model: str) -> LLMClient:
|
||||||
|
if any(model.startswith(p) for p in self._local_prefixes):
|
||||||
|
return self._local
|
||||||
|
if model == self._narrative_model:
|
||||||
|
return self._narrative
|
||||||
|
# Anything else (most importantly, the classifier model) goes
|
||||||
|
# to the classifier client when configured, otherwise to the
|
||||||
|
# narrative remote client.
|
||||||
|
return self._classifier or self._narrative
|
||||||
|
|
||||||
|
async def generate(
|
||||||
|
self, messages: Sequence[Message], *, model: str, **params
|
||||||
|
) -> str:
|
||||||
|
client = self._pick(model)
|
||||||
|
backend = (
|
||||||
|
"narrative" if client is self._narrative else
|
||||||
|
"classifier" if client is self._classifier else
|
||||||
|
"local"
|
||||||
|
)
|
||||||
|
if not _TIMING:
|
||||||
|
return await client.generate(messages, model=model, **params)
|
||||||
|
in_chars = sum(len(m.content) for m in messages)
|
||||||
|
_log.info("LLM generate START [%s] %s in_chars=%d", backend, model, in_chars)
|
||||||
|
t0 = time.perf_counter()
|
||||||
|
try:
|
||||||
|
return await client.generate(messages, model=model, **params)
|
||||||
|
finally:
|
||||||
|
_log.info(
|
||||||
|
"LLM generate END [%s] %s in_chars=%d %.2fs",
|
||||||
|
backend, model, in_chars, time.perf_counter() - t0,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def stream(
|
||||||
|
self, messages: Sequence[Message], *, model: str, **params
|
||||||
|
) -> AsyncIterator[str]:
|
||||||
|
client = self._pick(model)
|
||||||
|
backend = (
|
||||||
|
"narrative" if client is self._narrative else
|
||||||
|
"classifier" if client is self._classifier else
|
||||||
|
"local"
|
||||||
|
)
|
||||||
|
if not _TIMING:
|
||||||
|
async for chunk in client.stream(messages, model=model, **params):
|
||||||
|
yield chunk
|
||||||
|
return
|
||||||
|
t0 = time.perf_counter()
|
||||||
|
ttft = None
|
||||||
|
chars_out = 0
|
||||||
|
try:
|
||||||
|
async for chunk in client.stream(messages, model=model, **params):
|
||||||
|
if ttft is None:
|
||||||
|
ttft = time.perf_counter() - t0
|
||||||
|
chars_out += len(chunk)
|
||||||
|
yield chunk
|
||||||
|
finally:
|
||||||
|
dt = time.perf_counter() - t0
|
||||||
|
in_chars = sum(len(m.content) for m in messages)
|
||||||
|
_log.info(
|
||||||
|
"LLM stream [%s] %s in_chars=%d out_chars=%d ttft=%.2fs total=%.2fs",
|
||||||
|
backend, model, in_chars, chars_out, ttft or 0.0, dt,
|
||||||
|
)
|
||||||
|
|
||||||
|
async def embed(self, text: str, *, model: str) -> list[float]:
|
||||||
|
# Embeddings always run on the local backend — the remote
|
||||||
|
# provider doesn't expose a working ``/v1/embeddings`` endpoint.
|
||||||
|
if not _TIMING:
|
||||||
|
return await self._local.embed(text, model=model)
|
||||||
|
t0 = time.perf_counter()
|
||||||
|
try:
|
||||||
|
return await self._local.embed(text, model=model)
|
||||||
|
finally:
|
||||||
|
_log.info(
|
||||||
|
"LLM embed [local] %s in_chars=%d %.2fs",
|
||||||
|
model, len(text), time.perf_counter() - t0,
|
||||||
|
)
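The precedence in ``_pick`` can be restated as a pure function, which makes the order of the checks easy to test. This is a sketch, not the class itself: ``pick_backend`` and the ``has_classifier`` flag are hypothetical, and every model id below is a made-up placeholder.

```python
from __future__ import annotations


def pick_backend(
    model: str,
    *,
    narrative_model: str,
    has_classifier: bool,
    local_prefixes: tuple[str, ...] = ("mlx-community/",),
) -> str:
    """Return the label of the backend the routing rule would select."""
    # 1. The local prefix check runs first, so it wins over everything else.
    if any(model.startswith(prefix) for prefix in local_prefixes):
        return "local"
    # 2. An exact match on the narrative model id.
    if model == narrative_model:
        return "narrative"
    # 3. Everything else: the classifier backend if configured, else narrative.
    return "classifier" if has_classifier else "narrative"


NARRATIVE = "example/narrative-24b"  # hypothetical ids throughout
local_hit = pick_backend(
    "mlx-community/example-8b", narrative_model=NARRATIVE, has_classifier=True
)
narrative_hit = pick_backend(NARRATIVE, narrative_model=NARRATIVE, has_classifier=True)
classifier_hit = pick_backend(
    "example/classifier-8b", narrative_model=NARRATIVE, has_classifier=True
)
fallthrough = pick_backend(
    "example/classifier-8b", narrative_model=NARRATIVE, has_classifier=False
)
```

One consequence of this ordering: a ``narrative_model`` whose id begins with a local prefix would be served locally, because the prefix check comes before the equality check.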
|
||||||
+17
-11
@@ -30,7 +30,7 @@ from typing import Callable
|
|||||||
|
|
||||||
from chat.config import Settings
|
from chat.config import Settings
|
||||||
from chat.db.connection import open_db
|
from chat.db.connection import open_db
|
||||||
from chat.eventlog.log import append_and_apply
|
from chat.eventlog.log import append_and_apply, append_and_apply_with_retry
|
||||||
from chat.llm.client import LLMClient
|
from chat.llm.client import LLMClient
|
||||||
from chat.services.backup import (
|
from chat.services.backup import (
|
||||||
prune_backups,
|
prune_backups,
|
||||||
@@ -169,16 +169,22 @@ class BackgroundWorker:
|
|||||||
             narrative_text=job.narrative_text,
             prior_dialogue=job.prior_dialogue,
         )
-        with open_db(self._settings.db_path) as conn:
-            append_and_apply(
-                conn,
-                kind="memory_significance_set",
-                payload={
-                    "memory_id": job.memory_id,
-                    "significance": score,
-                },
-            )
-            if score >= 3:
+        # Retry-on-lock: see chat/eventlog/log.py's
+        # ``append_and_apply_with_retry`` docstring for why workers
+        # need to retry while the request handler's open transaction
+        # holds the WAL write lock briefly.
+        appended_id = await append_and_apply_with_retry(
+            lambda: open_db(self._settings.db_path),
+            kind="memory_significance_set",
+            payload={
+                "memory_id": job.memory_id,
+                "significance": score,
+            },
+        )
+        # Auto-pin requires a separate connection because retry-helper
+        # closed its own. Skip if the significance event itself failed.
+        if appended_id is not None and score >= 3:
+            with open_db(self._settings.db_path) as conn:
                 _auto_pin_with_cap(
                     conn,
                     owner_id=job.host_bot_id,
|
||||||
|
|||||||
@@ -26,13 +26,28 @@ def search_all_memories(
|
|||||||
"""Search FTS5 across all owners and chats.
|
"""Search FTS5 across all owners and chats.
|
||||||
|
|
||||||
Returns rows with ``{memory_id, owner_id, chat_id, scene_id,
|
Returns rows with ``{memory_id, owner_id, chat_id, scene_id,
|
||||||
pov_summary, significance, ts, fts_rank}``, sorted by FTS5 BM25
|
event_id, pov_summary, snippet, significance, ts, fts_rank}``,
|
||||||
rank ascending (lower rank = stronger match, surfaced first).
|
sorted by FTS5 BM25 rank ascending (lower rank = stronger match,
|
||||||
|
surfaced first).
|
||||||
|
|
||||||
|
``event_id`` (T111.2 / T109) is the id of the ``event_log`` row that
|
||||||
|
drove the projecting ``memory_written`` event. May be ``None`` for
|
||||||
|
memory rows projected before the 0014 schema migration ran (the
|
||||||
|
column is nullable on purpose; T109 did not backfill historical
|
||||||
|
rows). The search-results UI uses it to deep-link to the originating
|
||||||
|
turn anchor (Phase 3.5 T86 stamps ``id="turn-{event_id}"`` on each
|
||||||
|
turn DOM node) and falls back to a chat-level link when ``None``.
|
||||||
|
|
||||||
The ``memories`` table has no ``ts`` column; we expose ``created_at``
|
The ``memories`` table has no ``ts`` column; we expose ``created_at``
|
||||||
(the projector-side row insertion timestamp) under that key so the
|
(the projector-side row insertion timestamp) under that key so the
|
||||||
UI does not have to know the storage name.
|
UI does not have to know the storage name.
|
||||||
|
|
||||||
|
``snippet`` (T111.1) is the FTS5 ``snippet()`` output for the
|
||||||
|
matched ``pov_summary`` column: a windowed excerpt with each match
|
||||||
|
token wrapped in ``<mark>...</mark>`` for the search-results UI to
|
||||||
|
render verbatim. The full ``pov_summary`` is also returned so
|
||||||
|
non-highlighted callers (or fallbacks) keep the original string.
|
||||||
|
|
||||||
An empty / whitespace-only ``query`` short-circuits to ``[]`` to
|
An empty / whitespace-only ``query`` short-circuits to ``[]`` to
|
||||||
avoid an FTS5 ``MATCH ''`` syntax error and to keep the top-bar
|
avoid an FTS5 ``MATCH ''`` syntax error and to keep the top-bar
|
||||||
"no input yet" state from triggering a full-table scan.
|
"no input yet" state from triggering a full-table scan.
|
||||||
@@ -45,9 +60,20 @@ def search_all_memories(
|
|||||||
# from the content table because the FTS index only stores
|
# from the content table because the FTS index only stores
|
||||||
# ``pov_summary``. ORDER BY rank ASC because BM25 in FTS5 returns
|
# ``pov_summary``. ORDER BY rank ASC because BM25 in FTS5 returns
|
||||||
# negative scores where lower is better.
|
# negative scores where lower is better.
|
||||||
|
#
|
||||||
|
# ``snippet(memories_fts, 0, ...)`` (T111.1) targets column 0 of the
|
||||||
|
# FTS virtual table, which is ``pov_summary`` (the only column
|
||||||
|
# indexed by ``CREATE VIRTUAL TABLE memories_fts USING fts5(
|
||||||
|
# pov_summary, ...)`` in migration 0006). SQLite passes the raw
# column text through verbatim aside from inserting the configured
# before/after match markers, so any markup already present in
# ``pov_summary`` reaches the output too. Rendering with ``|safe`` is
# only safe while ``pov_summary`` itself is known to be free of HTML.
|
||||||
rows = conn.execute(
|
rows = conn.execute(
|
||||||
"SELECT m.id, m.owner_id, m.chat_id, m.scene_id, "
|
"SELECT m.id, m.owner_id, m.chat_id, m.scene_id, m.event_id, "
|
||||||
" m.pov_summary, m.significance, m.created_at, "
|
" m.pov_summary, "
|
||||||
|
" snippet(memories_fts, 0, '<mark>', '</mark>', '…', 32) "
|
||||||
|
" AS snippet, "
|
||||||
|
" m.significance, m.created_at, "
|
||||||
" memories_fts.rank "
|
" memories_fts.rank "
|
||||||
"FROM memories_fts "
|
"FROM memories_fts "
|
||||||
"JOIN memories m ON m.id = memories_fts.rowid "
|
"JOIN memories m ON m.id = memories_fts.rowid "
|
||||||
@@ -63,10 +89,12 @@ def search_all_memories(
|
|||||||
"owner_id": r[1],
|
"owner_id": r[1],
|
||||||
"chat_id": r[2],
|
"chat_id": r[2],
|
||||||
"scene_id": r[3],
|
"scene_id": r[3],
|
||||||
"pov_summary": r[4],
|
"event_id": r[4],
|
||||||
"significance": r[5],
|
"pov_summary": r[5],
|
||||||
"ts": r[6],
|
"snippet": r[6],
|
||||||
"fts_rank": r[7],
|
"significance": r[7],
|
||||||
|
"ts": r[8],
|
||||||
|
"fts_rank": r[9],
|
||||||
}
|
}
|
||||||
for r in rows
|
for r in rows
|
||||||
]
|
]
|
||||||
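Because ``snippet()`` passes column text through verbatim, any markup stored in ``pov_summary`` would be rendered along with the injected ``<mark>`` tags. If that is a concern, one defensive option is to have ``snippet()`` emit non-HTML sentinels, escape the whole string, and only then swap the sentinels for real tags. The sketch below shows just the post-processing step; the sentinel characters and ``render_snippet`` are assumptions for illustration, not part of this change.

```python
import html

# Hypothetical sentinels that would be passed to snippet() in place of
# '<mark>' and '</mark>'; control characters are assumed absent from summaries.
OPEN, CLOSE = "\x02", "\x03"


def render_snippet(raw: str) -> str:
    """Escape everything, then turn the sentinels into real <mark> tags."""
    escaped = html.escape(raw)
    return escaped.replace(OPEN, "<mark>").replace(CLOSE, "</mark>")


# Simulated snippet() output for a summary that itself contains markup.
raw = f"<b>bold</b> claim about the {OPEN}lighthouse{CLOSE}"
rendered = render_snippet(raw)
```

With this ordering the only live tags in ``rendered`` are the ``<mark>`` pair, so marking the result as safe in a template no longer depends on what the summary text contains.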
|
|||||||
@@ -26,7 +26,7 @@ from dataclasses import dataclass
|
|||||||
from sqlite3 import Connection
|
from sqlite3 import Connection
|
||||||
from typing import Callable
|
from typing import Callable
|
||||||
|
|
||||||
from chat.eventlog.log import append_and_apply
|
from chat.eventlog.log import append_and_apply_with_retry
|
||||||
from chat.services.embeddings import (
|
from chat.services.embeddings import (
|
||||||
DEFAULT_EMBEDDING_DIM,
|
DEFAULT_EMBEDDING_DIM,
|
||||||
DEFAULT_EMBEDDING_MODEL,
|
DEFAULT_EMBEDDING_MODEL,
|
||||||
@@ -121,17 +121,22 @@ class EmbeddingWorker:
|
|||||||
                 job.memory_id,
             )
             return
-        with self._conn_factory() as conn:
-            append_and_apply(
-                conn,
-                kind="embedding_indexed",
-                payload={
-                    "memory_id": job.memory_id,
-                    "model": result.model,
-                    "dim": result.dim,
-                    "vector": result.vector,
-                },
-            )
+        # Retry-on-lock: the request handler holds an open transaction
+        # for the duration of post_turn (a few seconds), so any worker
+        # write started during that window blocks. open_db's
+        # busy_timeout is 100ms (so the request path itself can't get
+        # stuck on a worker), so retry here with backoff. Each retry
+        # opens a fresh connection via ``conn_factory``.
+        await append_and_apply_with_retry(
+            self._conn_factory,
+            kind="embedding_indexed",
+            payload={
+                "memory_id": job.memory_id,
+                "model": result.model,
+                "dim": result.dim,
+                "vector": result.vector,
+            },
+        )


 __all__ = ["EmbeddingJob", "EmbeddingWorker"]
|
||||||
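The body of ``append_and_apply_with_retry`` is not part of this diff; only its call sites are. As a rough illustration of the retry-on-lock idea those call sites rely on, here is a hypothetical helper. ``retry_on_lock``, its parameters and ``flaky_write`` are all assumptions. The ``None`` return on exhaustion mirrors the ``appended_id is not None`` check in the background worker.

```python
import asyncio
import sqlite3


async def retry_on_lock(operation, *, attempts: int = 5, base_delay_s: float = 0.01):
    """Run ``operation``; on a 'database is locked' error, back off and retry.

    Returns the operation's result, or None once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc):
                raise  # unrelated failure: do not mask it
            # Exponential backoff: 1x, 2x, 4x, ... the base delay.
            await asyncio.sleep(base_delay_s * (2 ** attempt))
    return None


calls = {"n": 0}


def flaky_write() -> int:
    """Fails twice with a lock error, then 'appends' event id 42."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise sqlite3.OperationalError("database is locked")
    return 42


event_id = asyncio.run(retry_on_lock(flaky_write))
```

Sleeping with ``asyncio.sleep`` rather than ``time.sleep`` keeps the event loop free for the request handler while the worker waits for the lock.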
|
|||||||
@@ -10,6 +10,7 @@ EmbeddingResult shape stays the same, only the generator changes.
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import hashlib
|
import hashlib
|
||||||
|
import logging
|
||||||
import math
|
import math
|
||||||
import struct
|
import struct
|
||||||
|
|
||||||
@@ -18,6 +19,8 @@ from pydantic import BaseModel
|
|||||||
from chat.llm.client import LLMClient
|
from chat.llm.client import LLMClient
|
||||||
|
|
||||||
|
|
||||||
|
_log = logging.getLogger(__name__)
|
||||||
|
|
||||||
DEFAULT_EMBEDDING_DIM = 384
|
DEFAULT_EMBEDDING_DIM = 384
|
||||||
DEFAULT_EMBEDDING_MODEL = "pseudo-sha256-384"
|
DEFAULT_EMBEDDING_MODEL = "pseudo-sha256-384"
|
||||||
FALLBACK_EMBEDDING_MODEL = "fallback"
|
FALLBACK_EMBEDDING_MODEL = "fallback"
|
||||||
@@ -92,11 +95,27 @@ async def generate_embedding(
|
|||||||
         # Pure-local pseudo path — no LLMClient call.
         return EmbeddingResult(vector=_pseudo_embed(text, dim), model=model, dim=dim)

-    # Future: real embedding via client.embed(...). Phase 4.5 work.
-    # For Phase 4, any non-default model falls through to fallback.
-    return EmbeddingResult(
-        vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
-    )
+    # T112 (Phase 4.5): non-default model — route through the client's
+    # ``embed()`` method. On any failure (including ``NotImplementedError``
+    # from providers that don't expose embeddings, e.g. Featherless today),
+    # fall back to the zero vector and re-fire the T107 warning so
+    # misconfigured callers see the issue in logs rather than silently
+    # producing useless cosine results.
+    try:
+        vector = await client.embed(text, model=model)
+        return EmbeddingResult(vector=list(vector), model=model, dim=len(vector))
+    except Exception as exc:  # noqa: BLE001 — any failure must degrade gracefully
+        _log.warning(
+            "generate_embedding: non-default model %r returned fallback "
+            "(client.embed() raised %s: %s); "
+            "downstream search will degrade silently. Configure a supported model.",
+            model,
+            type(exc).__name__,
+            exc,
+        )
+        return EmbeddingResult(
+            vector=[0.0] * dim, model=FALLBACK_EMBEDDING_MODEL, dim=dim
+        )


 __all__ = [
|
||||||
|
|||||||
@@ -4,13 +4,15 @@ Wraps single-pair compute_state_update to run state updates for ALL
|
|||||||
directed pairs of present entities. With 3 present entities (you, host,
|
directed pairs of present entities. With 3 present entities (you, host,
|
||||||
guest) that's 6 directed pairs. With 2 present (you, host) it's 2 pairs.
|
guest) that's 6 directed pairs. With 2 present (you, host) it's 2 pairs.
|
||||||
|
|
||||||
-Calls run sequentially to respect Featherless's 2-connection cap (the
-client-level semaphore would serialize them anyway, but doing it here
-keeps the failure surface clean — a hung pair doesn't queue behind
-itself).
+Pairs run concurrently via :func:`asyncio.gather`; the underlying
+client should impose its own concurrency cap if the upstream provider
+needs it (e.g., Featherless's 2-conn semaphore). Returning order is
+preserved (natural iteration over ``present_ids x present_ids``,
+src != tgt) so downstream event-append order stays deterministic.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
import asyncio
|
||||||
|
|
||||||
from chat.llm.client import LLMClient
|
from chat.llm.client import LLMClient
|
||||||
from chat.services.state_update import StateUpdate, compute_state_update
|
from chat.services.state_update import StateUpdate, compute_state_update
|
||||||
@@ -28,35 +30,44 @@ async def compute_state_updates_for_present(
|
|||||||
     timeout_s: float = 30.0,
 ) -> list[tuple[str, str, StateUpdate]]:
     """Run compute_state_update for every directed pair (src != tgt) over
-    ``present_ids``. Returns list of ``(source_id, target_id, update)``
-    tuples in the natural iteration order over ``present_ids x present_ids``.
+    ``present_ids``, concurrently. Returns list of
+    ``(source_id, target_id, update)`` tuples in the natural iteration
+    order over ``present_ids x present_ids`` — concurrent dispatch does
+    not change the returned order.

     A single failing pair falls back to the schema-default StateUpdate
-    (zero deltas, empty facts) inside ``compute_state_update``; the batch
-    keeps going.
+    (zero deltas, empty facts) inside ``compute_state_update``; sibling
+    pairs continue independently because each call is wrapped in its
+    own try/except inside ``compute_state_update``.
     """
-    out: list[tuple[str, str, StateUpdate]] = []
-    for src in present_ids:
-        for tgt in present_ids:
-            if src == tgt:
-                continue
-            edge = prior_edges.get((src, tgt), {})
-            update = await compute_state_update(
-                client,
-                model=classifier_model,
-                source_id=src,
-                target_id=tgt,
-                source_name=present_names.get(src, src),
-                source_persona=personas.get(src, "") or "",
-                target_name=present_names.get(tgt, tgt),
-                prior_affinity=int(edge.get("affinity", 50)),
-                prior_trust=int(edge.get("trust", 50)),
-                prior_summary=edge.get("summary", "") or "",
-                recent_dialogue=recent_dialogue,
-                timeout_s=timeout_s,
-            )
-            out.append((src, tgt, update))
-    return out
+    pair_keys: list[tuple[str, str]] = [
+        (src, tgt)
+        for src in present_ids
+        for tgt in present_ids
+        if src != tgt
+    ]
+    if not pair_keys:
+        return []
+
+    async def _one(src: str, tgt: str) -> StateUpdate:
+        edge = prior_edges.get((src, tgt), {})
+        return await compute_state_update(
+            client,
+            model=classifier_model,
+            source_id=src,
+            target_id=tgt,
+            source_name=present_names.get(src, src),
+            source_persona=personas.get(src, "") or "",
+            target_name=present_names.get(tgt, tgt),
+            prior_affinity=int(edge.get("affinity", 50)),
+            prior_trust=int(edge.get("trust", 50)),
+            prior_summary=edge.get("summary", "") or "",
+            recent_dialogue=recent_dialogue,
+            timeout_s=timeout_s,
+        )
+
+    updates = await asyncio.gather(*(_one(src, tgt) for src, tgt in pair_keys))
+    return [(src, tgt, upd) for (src, tgt), upd in zip(pair_keys, updates)]


 __all__ = ["compute_state_updates_for_present"]
|
||||||
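The rewritten `compute_state_updates_for_present` relies on `asyncio.gather` returning results in argument order rather than completion order. The sketch below illustrates that property in isolation. `fake_update` and `gather_pairs` are hypothetical stand-ins, not code from this repository, and the artificial delays exist only to make earlier pairs finish last.

```python
import asyncio


async def fake_update(src: str, tgt: str, delay: float) -> str:
    """Hypothetical stand-in for one per-pair classifier call."""
    await asyncio.sleep(delay)
    return f"{src}->{tgt}"


async def gather_pairs(ids: list[str]) -> list[tuple[str, str, str]]:
    # Ordered pairs in natural nested-loop order, skipping self-pairs.
    pair_keys = [(s, t) for s in ids for t in ids if s != t]
    if not pair_keys:
        return []
    # Earlier pairs get longer delays, so they complete last.
    delays = [0.01 * (len(pair_keys) - i) for i in range(len(pair_keys))]
    results = await asyncio.gather(
        *(fake_update(s, t, d) for (s, t), d in zip(pair_keys, delays))
    )
    # gather yields results in argument order, so zipping with pair_keys is safe.
    return [(s, t, r) for (s, t), r in zip(pair_keys, results)]
```

Note that `asyncio.gather` without `return_exceptions=True` propagates the first exception to the caller; the diff avoids this by having `compute_state_update` catch failures internally and return a default.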
|
|||||||
+51
-6
@@ -325,14 +325,59 @@ def _build_open_threads_block(threads: list[dict]) -> str | None:
|
|||||||
return "\n".join(lines)
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def trim_to_max_beats(text: str, max_beats: int = 3) -> str:
|
||||||
|
"""Truncate ``text`` to at most ``max_beats`` asterisk-action beats.
|
||||||
|
|
||||||
|
A "beat" is one ``*action*`` markdown-italic block plus the dialogue
|
||||||
|
that follows it; counting ``*`` characters works as a deterministic
|
||||||
|
boundary detector since each complete beat contributes exactly two
|
||||||
|
asterisks (open + close). The (2*max_beats + 1)th asterisk is the
|
||||||
|
opening of an over-the-cap beat; we trim immediately before it and
|
||||||
|
strip trailing whitespace.
|
||||||
|
|
||||||
|
Belt-and-suspenders for verbose roleplay-tuned narrators (Cydonia,
|
||||||
|
Magnum, etc.) that reliably ignore "HARD CAP: 2-3 beats" prompt
|
||||||
|
instructions and keep going. A physical max_tokens cap helps but
|
||||||
|
truncates mid-word; this trims at a beat boundary instead.
|
||||||
|
|
||||||
|
Idempotent and safe on outputs with fewer beats than the cap (just
|
||||||
|
returns the text unchanged after a single pass).
|
||||||
|
"""
|
||||||
|
if max_beats <= 0:
|
||||||
|
return ""
|
||||||
|
target = max_beats * 2
|
||||||
|
count = 0
|
||||||
|
for i, ch in enumerate(text):
|
||||||
|
if ch == "*":
|
||||||
|
count += 1
|
||||||
|
if count > target:
|
||||||
|
return text[:i].rstrip()
|
||||||
|
return text
|
||||||
|
|
||||||
|
|
||||||
def _closing_instruction(speaker_name: str, addressee_name: str) -> str:
|
def _closing_instruction(speaker_name: str, addressee_name: str) -> str:
|
||||||
return (
|
return (
|
||||||
f"Continue the scene as {speaker_name}, in their voice, responding "
|
f"Continue as {speaker_name}. Format strictly:\n"
|
||||||
"naturally. Use *asterisks* for actions and quotes for dialogue. "
|
f"- Wrap actions and gestures in *asterisks*, third person "
|
||||||
f"Stay in character. Do not narrate {addressee_name}'s actions or "
|
f"({speaker_name}/she/he/they) — never first person, never inner "
|
||||||
"thoughts. "
|
"thoughts inside asterisks.\n"
|
||||||
"Keep your response to a single beat — one or two short paragraphs "
|
"- Speak dialogue as plain text between action beats, no quote "
|
||||||
"at most. Don't monologue; leave room for the other person to react."
|
"marks. Keep speech fragmented, not paragraphs.\n"
|
||||||
|
"- HARD CAP: 2-3 beats per response. A beat is one *asterisk "
|
||||||
|
"action* paired with a short dialogue fragment. After the "
|
||||||
|
"third beat, STOP — do not add a fourth, do not summarize, do "
|
||||||
|
f"not narrate {addressee_name}'s reaction. Long responses break "
|
||||||
|
"the scene's rhythm.\n"
|
||||||
|
"- Each beat is one concrete gesture or sensory image. No "
|
||||||
|
"explanation, no inner monologue, no stage-direction adverbs.\n"
|
||||||
|
"- Trailing ellipses (...) are fine for emotional weight.\n"
|
||||||
|
"EXAMPLE (3 beats, stops cleanly):\n"
|
||||||
|
"*She turns with soapy hands to cup your face* That's how I know "
|
||||||
|
"it's real... *She kisses you softly* You love me when I'm messy... "
|
||||||
|
"*She smiles tearfully* ...and every moment in between.\n"
|
||||||
|
f"Show only what {addressee_name} could externally observe of "
|
||||||
|
f"{speaker_name}; never narrate {addressee_name}'s actions, "
|
||||||
|
"thoughts, or speech. One response — leave room to react."
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
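To make the asterisk-counting rule in `trim_to_max_beats` concrete, here is the helper's logic reproduced as a standalone snippet with a four-beat input. The sample text is invented for illustration; the function body mirrors the one added in this diff.

```python
def trim_to_max_beats(text: str, max_beats: int = 3) -> str:
    """Standalone copy of the helper from the diff, so the demo runs alone."""
    if max_beats <= 0:
        return ""
    target = max_beats * 2  # each complete beat contributes two asterisks
    count = 0
    for i, ch in enumerate(text):
        if ch == "*":
            count += 1
            if count > target:
                # This asterisk opens an over-the-cap beat: cut before it.
                return text[:i].rstrip()
    return text


# Invented sample: four beats, one more than the default cap of three.
four_beats = (
    "*She nods* Fine. *She sits* Go on. *She waits* Well? "
    "*She sighs* Never mind."
)
trimmed = trim_to_max_beats(four_beats)
```

Because the detector only counts `*` characters, a stray or unbalanced asterisk in the model output (for example one used as a multiplication sign) would shift the boundary; the approach assumes asterisks appear only as action delimiters.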
|
|||||||
+130
-36
@@ -95,6 +95,27 @@ from chat.web.render import render_turn_html
|
|||||||
_log = logging.getLogger(__name__)
|
_log = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# T114.3: map a lifecycle-transition event kind to the events-table
|
||||||
|
# status it implicitly transitioned *from*. Regenerate uses this to pick
|
||||||
|
# the ``prior_status`` value for the ``event_status_reverted`` rollback
|
||||||
|
# event so the projector sets the row back to where it was before the
|
||||||
|
# superseded turn fired the transition.
|
||||||
|
#
|
||||||
|
# - ``event_started`` was emitted when the row was 'planned' → revert to
|
||||||
|
# 'planned'.
|
||||||
|
# - ``event_completed`` was emitted when the row was 'active' → revert
|
||||||
|
# to 'active'.
|
||||||
|
# - ``event_cancelled`` could have fired from either 'planned' or
|
||||||
|
# 'active'. Best-effort default: 'active'. The forward transitions
|
||||||
|
# below only fire detect_event_transitions for currently-active rows,
|
||||||
|
# so 'active' is the realistic prior in practice.
|
||||||
|
_PRIOR_STATUS_MAP: dict[str, str] = {
|
||||||
|
"event_started": "planned",
|
||||||
|
"event_completed": "active",
|
||||||
|
"event_cancelled": "active",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
async def regenerate_assistant_turn(
|
async def regenerate_assistant_turn(
|
||||||
conn: Connection,
|
conn: Connection,
|
||||||
client,
|
client,
|
||||||
@@ -115,17 +136,18 @@ async def regenerate_assistant_turn(
|
|||||||
cannot be found — the FastAPI route translates this to 404.
|
cannot be found — the FastAPI route translates this to 404.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
**Lifecycle-rollback limitation (T83.4, Phase 4 follow-up).**
|
**Lifecycle rollback (T114, Phase 4.5).**
|
||||||
When the superseded turn already produced lifecycle transitions
|
When the superseded turn already produced lifecycle transitions
|
||||||
(``event_started`` / ``event_completed`` / ``event_cancelled``),
|
(``event_started`` / ``event_completed`` / ``event_cancelled``),
|
||||||
this function does NOT roll those rows back before re-running
|
this function emits an ``event_status_reverted`` event for each
|
||||||
``detect_event_transitions`` against the regenerated text. A
|
so the events row's status returns to its prior value before the
|
||||||
regenerate-after-completion can therefore double-emit promotion
|
regenerated narrative is reclassified. Backward compatibility:
|
||||||
artifacts if the new text re-completes the same event. Phase 3.5
|
lifecycle events authored before T114.1 lack the
|
||||||
only documents the gap and emits a WARNING log naming the
|
``triggered_by_assistant_turn_id`` payload field; rollback skips
|
||||||
affected event_log ids; the actual undo pass is invasive
|
those (logged at DEBUG) so historic rows are not retroactively
|
||||||
(re-projection / inverse-handler dispatch) and is deferred to
|
reverted. A WARNING about un-rolled-back transitions is still
|
||||||
Phase 4. See the ``# T83.4`` block below for the warning emit.
|
emitted when stragglers are found — the rollback handles the
|
||||||
|
common case while older logs continue to need manual review.
|
||||||
"""
|
"""
|
||||||
chat = get_chat(conn, chat_id)
|
chat = get_chat(conn, chat_id)
|
||||||
if chat is None:
|
if chat is None:
|
||||||
@@ -158,20 +180,21 @@ async def regenerate_assistant_turn(
|
|||||||
original_assistant_payload = json.loads(row[0])
|
original_assistant_payload = json.loads(row[0])
|
||||||
original_user_turn_id = original_assistant_payload.get("user_turn_id")
|
original_user_turn_id = original_assistant_payload.get("user_turn_id")
|
||||||
|
|
||||||
# T83.4: scan for downstream lifecycle transitions emitted by the
|
# T114.3: roll back lifecycle transitions emitted by the superseded
|
||||||
# superseded turn — they're not being rolled back (see method
|
# turn. The scan uses the same id-greater-than-superseded-turn
|
||||||
# docstring). Heuristic: any ``event_started`` / ``event_completed``
|
# heuristic as the legacy T83.4 warning, joined to ``events`` for
|
||||||
# / ``event_cancelled`` event_log row with id strictly greater than
|
# chat scoping (lifecycle events don't carry chat_id in their
|
||||||
# the original assistant_turn's id was emitted as part of (or after)
|
# payload — they reference an ``event_id`` FK to the ``events``
|
||||||
# that turn's processing. Lifecycle events don't carry ``chat_id``
|
# table, which holds chat_id). For each row whose payload carries
|
||||||
# in their payload (their payload references an ``event_id`` FK to
|
# ``triggered_by_assistant_turn_id == original_assistant_event_id``
|
||||||
# the ``events`` table, which holds chat_id), so we join through
|
# (T114.1 back-reference), emit an ``event_status_reverted`` event
|
||||||
# ``events`` to scope to this chat.
|
# so the events-row status returns to the pre-transition value.
|
||||||
#
|
# Lifecycle rows authored before T114.1 lack the back-reference;
|
||||||
# A WARNING log surfaces the affected event ids so operators can
|
# those are skipped (DEBUG log) and a WARNING tracks their count so
|
||||||
# spot double-emit cases until the Phase 4 rollback pass lands.
|
# operators still see legacy stragglers — preserves the T83.4
|
||||||
|
# observability contract for un-rolled-back transitions.
|
||||||
unrolled_lifecycle = conn.execute(
|
unrolled_lifecycle = conn.execute(
|
||||||
"SELECT el.id, el.kind FROM event_log AS el "
|
"SELECT el.id, el.kind, el.payload_json FROM event_log AS el "
|
||||||
"JOIN events AS ev "
|
"JOIN events AS ev "
|
||||||
" ON ev.event_id = json_extract(el.payload_json, '$.event_id') "
|
" ON ev.event_id = json_extract(el.payload_json, '$.event_id') "
|
||||||
"WHERE el.kind IN ("
|
"WHERE el.kind IN ("
|
||||||
@@ -182,18 +205,73 @@ async def regenerate_assistant_turn(
|
|||||||
"ORDER BY el.id ASC",
|
"ORDER BY el.id ASC",
|
||||||
(chat_id, original_assistant_event_id),
|
(chat_id, original_assistant_event_id),
|
||||||
).fetchall()
|
).fetchall()
|
||||||
if unrolled_lifecycle:
|
rolled_back_ids: list[int] = []
|
||||||
# T90.2: phrased as "at-or-after turn <id>" rather than "from
|
skipped_no_backref: list[int] = []
|
||||||
# superseded turn" because regenerating an OLDER turn lists
|
for el_id, el_kind, el_payload_json in unrolled_lifecycle:
|
||||||
# intervening-turn transitions that legitimately stand on their
|
try:
|
||||||
# own — those weren't authored by the superseded turn itself.
|
lifecycle_payload = json.loads(el_payload_json)
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
skipped_no_backref.append(el_id)
|
||||||
|
continue
|
||||||
|
triggered_by = lifecycle_payload.get("triggered_by_assistant_turn_id")
|
||||||
|
if triggered_by != original_assistant_event_id:
|
||||||
|
# Either a legacy row (no field) or a transition triggered
|
||||||
|
# by a *different* turn — leave it alone. DEBUG so the
|
||||||
|
# message is available under verbose logging without
|
||||||
|
# spamming the default WARNING channel.
|
||||||
|
_log.debug(
|
||||||
|
"regenerate_assistant_turn: skipping rollback for "
|
||||||
|
"lifecycle event_log id=%d (kind=%s) — no back-reference "
|
||||||
|
"or different turn (triggered_by=%r vs superseded=%d)",
|
||||||
|
el_id,
|
||||||
|
el_kind,
|
||||||
|
triggered_by,
|
||||||
|
original_assistant_event_id,
|
||||||
|
)
|
||||||
|
if triggered_by is None:
|
||||||
|
skipped_no_backref.append(el_id)
|
||||||
|
continue
|
||||||
|
prior_status = _PRIOR_STATUS_MAP.get(el_kind)
|
||||||
|
if prior_status is None:
|
||||||
|
# Defensive: the SQL filter already restricts to the three
|
||||||
|
# known kinds, but a future schema addition shouldn't crash
|
||||||
|
# the rollback path.
|
||||||
|
continue
|
||||||
|
target_event_id = lifecycle_payload.get("event_id")
|
||||||
|
if target_event_id is None:
|
||||||
|
continue
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_status_reverted",
|
||||||
|
payload={
|
||||||
|
"event_id": target_event_id,
|
||||||
|
"prior_status": prior_status,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
rolled_back_ids.append(el_id)
|
||||||
|
if rolled_back_ids:
|
||||||
|
_log.info(
|
||||||
|
"regenerate_assistant_turn: rolled back %d lifecycle "
|
||||||
|
"transition(s) triggered by superseded turn %s "
|
||||||
|
"(event_log ids: %s)",
|
||||||
|
len(rolled_back_ids),
|
||||||
|
original_assistant_event_id,
|
||||||
|
rolled_back_ids,
|
||||||
|
)
|
||||||
|
if skipped_no_backref:
|
||||||
|
# T83.4 (legacy) compatibility: still warn about stragglers
|
||||||
|
# without the back-reference so operators can spot pre-T114
|
||||||
|
# double-emit risks. Phrased as "at-or-after turn <id>" per
|
||||||
|
# T90.2 — older transitions may legitimately belong to other
|
||||||
|
# turns.
|
||||||
_log.warning(
|
_log.warning(
|
||||||
"regenerate_assistant_turn: %d lifecycle transition(s) "
|
"regenerate_assistant_turn: %d lifecycle transition(s) "
|
||||||
"at-or-after turn %s are NOT being rolled back (Phase 4 "
|
"at-or-after turn %s are NOT being rolled back (no "
|
||||||
"follow-up). Affected event ids: %s",
|
"triggered_by_assistant_turn_id back-reference). "
|
||||||
len(unrolled_lifecycle),
|
"Affected event ids: %s",
|
||||||
|
len(skipped_no_backref),
|
||||||
original_assistant_event_id,
|
original_assistant_event_id,
|
||||||
[r[0] for r in unrolled_lifecycle],
|
skipped_no_backref,
|
||||||
)
|
)
|
||||||
|
|
||||||
# 1a. Look up any sibling interjection beat in the same turn group
|
# 1a. Look up any sibling interjection beat in the same turn group
|
||||||
@@ -716,11 +794,13 @@ async def regenerate_assistant_turn(
|
|||||||
# runs inline after a completion so promotion artifacts land in the
|
# runs inline after a completion so promotion artifacts land in the
|
||||||
# same regenerate path.
|
# same regenerate path.
|
||||||
#
|
#
|
||||||
# T83.4 follow-up: when a regenerate replaces a turn that had
|
# T114.3: original-turn transitions emitted before this regenerate
|
||||||
# already produced event transitions, those original transitions
|
# ran were rolled back at the top of the function (see the
|
||||||
# are NOT undone here (Phase 4 work). A WARNING log earlier in this
|
# ``# T114.3`` block) by appending ``event_status_reverted`` for
|
||||||
# function names the affected event_log ids — see the T83.4 block
|
# each. The classify-and-emit pass below now operates against an
|
||||||
# near the function entry.
|
# ``events`` projection that has already been reverted, so it can
|
||||||
|
# safely re-fire transitions for the regenerated narrative without
|
||||||
|
# double-emitting promotion artifacts.
|
||||||
new_active_events = list_active_events(conn, chat_id)
|
new_active_events = list_active_events(conn, chat_id)
|
||||||
if new_active_events:
|
if new_active_events:
|
||||||
lifecycle_decision = await detect_event_transitions(
|
lifecycle_decision = await detect_event_transitions(
|
||||||
@@ -738,6 +818,12 @@ async def regenerate_assistant_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"started_at": chat.get("time"),
|
"started_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference to the assistant_turn
|
||||||
|
# that triggered this transition (see turns.py
|
||||||
|
# for rationale).
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
new_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
elif transition.new_status == "completed":
|
elif transition.new_status == "completed":
|
||||||
@@ -747,6 +833,10 @@ async def regenerate_assistant_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"completed_at": chat.get("time"),
|
"completed_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference (see above).
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
new_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
promote_completed_event(
|
promote_completed_event(
|
||||||
@@ -762,6 +852,10 @@ async def regenerate_assistant_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"completed_at": chat.get("time"),
|
"completed_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference (see above).
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
new_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
|
|
||||||
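The rollback decision in the T114.3 block can be read as a pure function over the scanned lifecycle rows. The following is a minimal sketch under that assumption: `plan_rollback` is a hypothetical name, and it returns the payloads that would be passed to `append_and_apply` instead of writing them, so it runs without a database.

```python
import json

# Mirrors _PRIOR_STATUS_MAP from the diff.
PRIOR_STATUS = {
    "event_started": "planned",
    "event_completed": "active",
    "event_cancelled": "active",  # best-effort default, as in the diff
}


def plan_rollback(rows, superseded_turn_id):
    """Split lifecycle rows into revert payloads and legacy stragglers.

    ``rows`` is an iterable of ``(event_log_id, kind, payload_json)``.
    """
    reverts: list[dict] = []
    skipped_no_backref: list[int] = []
    for el_id, kind, payload_json in rows:
        try:
            payload = json.loads(payload_json)
        except (TypeError, ValueError):
            # Unparseable payloads are treated like legacy rows.
            skipped_no_backref.append(el_id)
            continue
        triggered_by = payload.get("triggered_by_assistant_turn_id")
        if triggered_by != superseded_turn_id:
            # Only rows with no back-reference count as stragglers;
            # rows triggered by a different turn are simply left alone.
            if triggered_by is None:
                skipped_no_backref.append(el_id)
            continue
        prior = PRIOR_STATUS.get(kind)
        event_id = payload.get("event_id")
        if prior is None or event_id is None:
            continue
        reverts.append({"event_id": event_id, "prior_status": prior})
    return reverts, skipped_no_backref
```

Keeping the classification separate from the write makes the three outcomes (revert, skip silently, skip and warn) easy to unit-test; the diff performs the same checks inline.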
|
|||||||
@@ -144,23 +144,36 @@ def _read_recent_dialogue(
|
|||||||
``id >= since_event_id`` so callers needing a scene-scoped view (e.g.
|
``id >= since_event_id`` so callers needing a scene-scoped view (e.g.
|
||||||
thread detection on close) don't pull turns that landed before the
|
thread detection on close) don't pull turns that landed before the
|
||||||
closing scene's ``scene_opened`` event.
|
closing scene's ``scene_opened`` event.
|
||||||
|
|
||||||
|
T113: also clamps by the active branch's ``[origin, head]`` event-id
|
||||||
|
range so scene-summary inputs respect the user's current branch.
|
||||||
|
Bootstrap-main and "no active branch" fall through to ``(0, _NO_HEAD_CLAMP)``
|
||||||
|
so existing flows are unchanged.
|
||||||
"""
|
"""
|
||||||
|
from chat.state.branches import active_branch_event_ids
|
||||||
|
|
||||||
|
origin, head = active_branch_event_ids(conn)
|
||||||
if since_event_id is None:
|
if since_event_id is None:
|
||||||
cur = conn.execute(
|
cur = conn.execute(
|
||||||
"SELECT kind, payload_json FROM event_log "
|
"SELECT kind, payload_json FROM event_log "
|
||||||
"WHERE kind IN ('user_turn', 'assistant_turn') "
|
"WHERE kind IN ('user_turn', 'assistant_turn') "
|
||||||
" AND superseded_by IS NULL AND hidden = 0 "
|
" AND superseded_by IS NULL AND hidden = 0 "
|
||||||
|
" AND id BETWEEN ? AND ? "
|
||||||
"ORDER BY id DESC LIMIT ?",
|
"ORDER BY id DESC LIMIT ?",
|
||||||
(limit,),
|
(origin, head, limit),
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
|
# Compose ``since_event_id`` with the branch lower bound — readers
|
||||||
|
# want the tightest ``id >= max(since, origin)`` clamp without an
|
||||||
|
# extra Python pass.
|
||||||
|
lower = max(origin, since_event_id)
|
||||||
cur = conn.execute(
|
cur = conn.execute(
|
||||||
"SELECT kind, payload_json FROM event_log "
|
"SELECT kind, payload_json FROM event_log "
|
||||||
"WHERE kind IN ('user_turn', 'assistant_turn') "
|
"WHERE kind IN ('user_turn', 'assistant_turn') "
|
||||||
" AND superseded_by IS NULL AND hidden = 0 "
|
" AND superseded_by IS NULL AND hidden = 0 "
|
||||||
" AND id >= ? "
|
" AND id BETWEEN ? AND ? "
|
||||||
"ORDER BY id DESC LIMIT ?",
|
"ORDER BY id DESC LIMIT ?",
|
||||||
(since_event_id, limit),
|
(lower, head, limit),
|
||||||
)
|
)
|
||||||
rows = list(reversed(cur.fetchall()))
|
rows = list(reversed(cur.fetchall()))
|
||||||
out: list[dict] = []
|
out: list[dict] = []
|
||||||
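The branch clamp added to `_read_recent_dialogue` composes the optional `since_event_id` with the branch origin by taking the larger of the two as the lower bound. A small in-memory sketch of that query shape follows; the table is reduced to two columns and `recent_ids` is a hypothetical helper, not the repository's function.

```python
import sqlite3


def recent_ids(conn, origin, head, since=None, limit=10):
    """Return event ids inside the branch range, oldest first."""
    # The tightest lower bound is whichever of origin / since is larger.
    lower = origin if since is None else max(origin, since)
    cur = conn.execute(
        "SELECT id FROM event_log WHERE id BETWEEN ? AND ? "
        "ORDER BY id DESC LIMIT ?",
        (lower, head, limit),
    )
    # Fetch newest-first for LIMIT, then flip to chronological order.
    return [row[0] for row in reversed(cur.fetchall())]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO event_log (id, kind) VALUES (?, ?)",
    [(i, "user_turn") for i in range(1, 11)],
)
```

With ten rows, `recent_ids(conn, 3, 7, since=5)` keeps only ids 5 through 7, while a `since` below the origin leaves the branch origin as the effective bound.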
|
|||||||
@@ -30,6 +30,7 @@ from __future__ import annotations
|
|||||||
import json
|
import json
|
||||||
from sqlite3 import Connection
|
from sqlite3 import Connection
|
||||||
|
|
||||||
|
from chat.state.branches import active_branch_event_ids
|
||||||
from chat.state.edges import get_edge
|
from chat.state.edges import get_edge
|
||||||
|
|
||||||
|
|
||||||
@@ -60,15 +61,22 @@ def read_recent_dialogue(
|
|||||||
previous implementation filtered chat_id post-fetch in Python, which
|
previous implementation filtered chat_id post-fetch in Python, which
|
||||||
let foreign-chat rows fill the LIMIT and yield fewer than N relevant
|
let foreign-chat rows fill the LIMIT and yield fewer than N relevant
|
||||||
rows in busy multi-chat databases.
|
rows in busy multi-chat databases.
|
||||||
|
|
||||||
|
T113: clamp by the active branch's ``[origin, head]`` event-id range so
|
||||||
|
switching branches actually changes what dialogue this read sees.
|
||||||
|
Bootstrap-main and "no active branch" both fall through to
|
||||||
|
``(0, _NO_HEAD_CLAMP)``, with no functional change for the metadata-only Phase 4 era.
|
||||||
"""
|
"""
|
||||||
|
origin, head = active_branch_event_ids(conn)
|
||||||
if exclude_event_id is None:
|
if exclude_event_id is None:
|
||||||
cur = conn.execute(
|
cur = conn.execute(
|
||||||
"SELECT id, kind, payload_json FROM event_log "
|
"SELECT id, kind, payload_json FROM event_log "
|
||||||
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
||||||
" AND superseded_by IS NULL AND hidden = 0 "
|
" AND superseded_by IS NULL AND hidden = 0 "
|
||||||
|
" AND id BETWEEN ? AND ? "
|
||||||
" AND json_extract(payload_json, '$.chat_id') = ? "
|
" AND json_extract(payload_json, '$.chat_id') = ? "
|
||||||
"ORDER BY id DESC LIMIT ?",
|
"ORDER BY id DESC LIMIT ?",
|
||||||
(chat_id, limit),
|
(origin, head, chat_id, limit),
|
||||||
)
|
)
|
||||||
else:
|
else:
|
||||||
cur = conn.execute(
|
cur = conn.execute(
|
||||||
@@ -76,9 +84,10 @@ def read_recent_dialogue(
|
|||||||
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
||||||
" AND id != ? "
|
" AND id != ? "
|
||||||
" AND superseded_by IS NULL AND hidden = 0 "
|
" AND superseded_by IS NULL AND hidden = 0 "
|
||||||
|
" AND id BETWEEN ? AND ? "
|
||||||
" AND json_extract(payload_json, '$.chat_id') = ? "
|
" AND json_extract(payload_json, '$.chat_id') = ? "
|
||||||
"ORDER BY id DESC LIMIT ?",
|
"ORDER BY id DESC LIMIT ?",
|
||||||
(exclude_event_id, chat_id, limit),
|
(exclude_event_id, origin, head, chat_id, limit),
|
||||||
)
|
)
|
||||||
rows = list(reversed(cur.fetchall()))
|
rows = list(reversed(cur.fetchall()))
|
||||||
out: list[dict] = []
|
out: list[dict] = []
|
||||||
|
|||||||
@@ -107,13 +107,23 @@ async def parse_turn(
|
|||||||
without an LLM call (the classifier would error on empty input
|
without an LLM call (the classifier would error on empty input
|
||||||
anyway, and the result is unambiguous).
|
anyway, and the result is unambiguous).
|
||||||
|
|
||||||
Raises ``RuntimeError`` if the classifier fails twice — no default
|
Falls back to a single dialogue-shaped segment containing the
|
||||||
is supplied, since the caller (T19's turn flow) is responsible for
|
whole prose if the classifier still fails after retries; the turn flow
|
||||||
surfacing the error to the user.
|
can keep moving (the narrative will still fire on the prose) at
|
||||||
|
the cost of finer-grained segment classification. The original
|
||||||
|
code raised ``RuntimeError`` here, which 500'd the whole request
|
||||||
|
and was particularly painful in multi-bot scenes where every
|
||||||
|
user turn paid the classifier round-trip.
|
||||||
"""
|
"""
|
||||||
if not prose.strip():
|
if not prose.strip():
|
||||||
return ParsedTurn(segments=[])
|
return ParsedTurn(segments=[])
|
||||||
|
|
||||||
|
fallback = ParsedTurn(
|
||||||
|
segments=[TurnSegment(kind="dialogue", text=prose)],
|
||||||
|
intent="narrative",
|
||||||
|
landing_state_hint="",
|
||||||
|
)
|
||||||
|
|
||||||
user_prompt = f"INPUT:\n{prose}"
|
user_prompt = f"INPUT:\n{prose}"
|
||||||
return await classify(
|
return await classify(
|
||||||
client,
|
client,
|
||||||
@@ -121,5 +131,6 @@ async def parse_turn(
|
|||||||
system=_SYSTEM_PROMPT,
|
system=_SYSTEM_PROMPT,
|
||||||
user=user_prompt,
|
user=user_prompt,
|
||||||
schema=ParsedTurn,
|
schema=ParsedTurn,
|
||||||
|
default=fallback,
|
||||||
timeout_s=timeout_s,
|
timeout_s=timeout_s,
|
||||||
)
|
)
|
||||||
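The `parse_turn` change swaps a raised `RuntimeError` for a prebuilt fallback value handed to `classify` via `default=`. The real `classify` signature is not shown in this diff, so the sketch below uses a hypothetical `classify_or_default` wrapper and simplified dataclasses purely to illustrate the retry-then-fallback pattern.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Segment:
    kind: str
    text: str


@dataclass
class Parsed:
    segments: list[Segment] = field(default_factory=list)
    intent: str = "narrative"


async def classify_or_default(call, default, retries: int = 2):
    """Hypothetical wrapper: try ``call`` up to ``retries`` times, then fall back."""
    for _ in range(retries):
        try:
            return await call()
        except Exception:
            continue
    return default


async def parse(prose: str, call) -> Parsed:
    # Empty input short-circuits without any classifier call.
    if not prose.strip():
        return Parsed()
    # Fallback: treat the whole prose as a single dialogue segment.
    fallback = Parsed(segments=[Segment(kind="dialogue", text=prose)])
    return await classify_or_default(call, fallback)


async def always_fails():
    """Simulates a classifier that errors on every attempt."""
    raise RuntimeError("classifier unavailable")
```

The trade-off matches the docstring in the diff: the turn keeps moving, at the cost of coarser segment classification when the classifier is down.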
|
|||||||
@@ -9,11 +9,15 @@ existing event readers remain branch-agnostic.
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
from sqlite3 import Connection
|
from sqlite3 import Connection
|
||||||
|
|
||||||
from chat.eventlog.projector import on
|
from chat.eventlog.projector import on
|
||||||
from chat.eventlog.log import Event
|
from chat.eventlog.log import Event
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
@on("branch_created")
|
@on("branch_created")
|
||||||
def _apply_branch_created(conn: Connection, e: Event) -> None:
|
def _apply_branch_created(conn: Connection, e: Event) -> None:
|
||||||
@@ -37,9 +41,26 @@ def _apply_branch_switched(conn: Connection, e: Event) -> None:
|
|||||||
"""Set is_active=1 on the named branch and is_active=0 on all others.
|
"""Set is_active=1 on the named branch and is_active=0 on all others.
|
||||||
|
|
||||||
Atomic via two UPDATEs ordered to avoid the unique-active-index race.
|
Atomic via two UPDATEs ordered to avoid the unique-active-index race.
|
||||||
|
|
||||||
|
If the named branch does not exist, a warning is emitted and the
|
||||||
|
is_active flags are still cleared (preserving prior behavior — the
|
||||||
|
second UPDATE simply matches no rows). Callers should validate the
|
||||||
|
name upstream; this guard surfaces accidental mismatches in the log.
|
||||||
"""
|
"""
|
||||||
p = e.payload
|
p = e.payload
|
||||||
name = p["name"]
|
name = p["name"]
|
||||||
|
# Warn (don't raise) if the target branch is missing. The existing
|
||||||
|
# outcome — zero active branches — is preserved; this just makes the
|
||||||
|
# condition observable instead of silent.
|
||||||
|
exists = conn.execute(
|
||||||
|
"SELECT 1 FROM branches WHERE name = ? LIMIT 1",
|
||||||
|
(name,),
|
||||||
|
).fetchone()
|
||||||
|
if exists is None:
|
||||||
|
logger.warning(
|
||||||
|
"branch_switched to unknown branch name %r; no branch will be active",
|
||||||
|
name,
|
||||||
|
)
|
||||||
# Clear ALL is_active flags first (avoids the unique-index trip).
|
# Clear ALL is_active flags first (avoids the unique-index trip).
|
||||||
conn.execute("UPDATE branches SET is_active = 0 WHERE is_active = 1")
|
conn.execute("UPDATE branches SET is_active = 0 WHERE is_active = 1")
|
||||||
conn.execute(
|
conn.execute(
|
||||||
@@ -79,6 +100,16 @@ def get_branch(conn: Connection, name: str) -> dict | None:
|
|||||||
|
|
||||||
|
|
||||||
def list_branches(conn: Connection, chat_id: str | None = None) -> list[dict]:
|
def list_branches(conn: Connection, chat_id: str | None = None) -> list[dict]:
|
||||||
|
"""Return branch rows, optionally scoped to a chat.
|
||||||
|
|
||||||
|
When ``chat_id`` is provided the filter is ``chat_id = ? OR chat_id IS NULL``,
|
||||||
|
so global (null-chat) branches are returned in *every* per-chat scope. This
|
||||||
|
is intentional: the bootstrapped ``"main"`` branch (and any future
|
||||||
|
null-chat branches) are global by design — they belong to no single chat
|
||||||
|
and should appear alongside per-chat branches in any chat-scoped listing.
|
||||||
|
Callers that want only per-chat branches should filter the result on
|
||||||
|
``chat_id is not None``.
|
||||||
|
"""
|
||||||
if chat_id is None:
|
if chat_id is None:
|
||||||
rows = conn.execute(
|
rows = conn.execute(
|
||||||
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
|
"SELECT id, name, origin_event_id, head_event_id, chat_id, "
|
||||||
@@ -126,8 +157,58 @@ def active_branch(conn: Connection) -> dict | None:
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
# T113: sentinel "no upper bound" used by ``active_branch_event_ids`` when the
|
||||||
|
# active branch's head is unset (the bootstrap "main" branch with origin=0 +
|
||||||
|
# head=0). Readers compose ``id BETWEEN origin AND head`` so a value larger
|
||||||
|
# than any possible row id behaves as "no clamp" without needing a separate
|
||||||
|
# code path. ``2**63 - 1`` is the largest SQLite INTEGER, so no rowid exceeds it.
|
||||||
|
_NO_HEAD_CLAMP = 2**63 - 1
|
||||||
|
|
||||||
|
|
||||||
|
def active_branch_event_ids(conn: Connection) -> tuple[int, int]:
|
||||||
|
"""Return ``(origin_event_id, head_event_id)`` for the currently active
|
||||||
|
branch, suitable as bounds for an ``event_log.id BETWEEN ? AND ?`` clamp
|
||||||
|
on user-facing reads (T113).
|
||||||
|
|
||||||
|
Defensive defaults:
|
||||||
|
|
||||||
|
* **No active branch row** (``active_branch`` returns ``None``) — return
|
||||||
|
``(0, _NO_HEAD_CLAMP)`` so readers see all events. This preserves the
|
||||||
|
Phase 4 "branches are metadata-only" contract for any code path that
|
||||||
|
somehow runs without the migration-0013 bootstrap.
|
||||||
|
* **Bootstrap "main"** — the canonical ``name="main", origin=0, head=0``
|
||||||
|
row inserted by migration 0013. Production today never emits
|
||||||
|
``branch_head_updated`` for main, so head stays at 0 even as events
|
||||||
|
accumulate. We treat this exact bootstrap state as "no clamp" and
|
||||||
|
return ``(0, _NO_HEAD_CLAMP)`` so all events remain visible. This is
|
||||||
|
what every existing test (which never configures branches) relies on.
|
||||||
|
* **Any other branch** — return the literal ``(origin, head)`` from the
|
||||||
|
branch row. A branch created at origin=N has head=N initially (per
|
||||||
|
``branch_from_event``), so ``BETWEEN N AND N`` returns just that one
|
||||||
|
seed event until the head is bumped via ``branch_head_updated``.
|
||||||
|
|
||||||
|
Note on the schema mismatch with the T113 spec: the spec describes
|
||||||
|
``head_event_id`` as nullable, but migration 0013 declared it
|
||||||
|
``NOT NULL DEFAULT 0``. We read head=0 on bootstrap main as the
|
||||||
|
"unset" sentinel; non-main branches never reach head=0 in normal
|
||||||
|
flow (creation sets head=origin, and origin=0 only for main).
|
||||||
|
"""
|
||||||
|
branch = active_branch(conn)
|
||||||
|
if branch is None:
|
||||||
|
return (0, _NO_HEAD_CLAMP)
|
||||||
|
origin = int(branch.get("origin_event_id") or 0)
|
||||||
|
head = int(branch.get("head_event_id") or 0)
|
||||||
|
# Bootstrap "main" sentinel — see docstring above. Detect by name +
|
||||||
|
# both ids being 0 to avoid mis-firing on a hypothetical future
|
||||||
|
# branch that legitimately starts at origin=0.
|
||||||
|
if branch.get("name") == "main" and origin == 0 and head == 0:
|
||||||
|
return (0, _NO_HEAD_CLAMP)
|
||||||
|
return (origin, head)
|
||||||
|
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
"get_branch",
|
"get_branch",
|
||||||
"list_branches",
|
"list_branches",
|
||||||
"active_branch",
|
"active_branch",
|
||||||
|
"active_branch_event_ids",
|
||||||
]
|
]
|
||||||
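The three cases in the `active_branch_event_ids` docstring can be exercised without a database by passing the branch row in directly. This is a sketch of the same decision logic; `branch_bounds` is a hypothetical name, and the dict keys follow the ones read in the diff.

```python
from typing import Optional

NO_HEAD_CLAMP = 2**63 - 1  # largest 64-bit signed integer


def branch_bounds(branch: Optional[dict]) -> tuple[int, int]:
    """Resolve (origin, head) bounds for an ``id BETWEEN ? AND ?`` clamp."""
    if branch is None:
        # No active branch row: show every event.
        return (0, NO_HEAD_CLAMP)
    origin = int(branch.get("origin_event_id") or 0)
    head = int(branch.get("head_event_id") or 0)
    # Bootstrap "main" keeps head at 0; treat that exact state as unclamped.
    if branch.get("name") == "main" and origin == 0 and head == 0:
        return (0, NO_HEAD_CLAMP)
    # Any other branch uses its literal range.
    return (origin, head)
```

A freshly created branch with `origin == head == 42` therefore sees exactly one event until its head is advanced, which is the behavior the docstring describes.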
|
|||||||
@@ -67,6 +67,29 @@ def _apply_event_expired(conn: Connection, e: Event) -> None:
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@on("event_status_reverted")
|
||||||
|
def _apply_event_status_reverted(conn: Connection, e: Event) -> None:
|
||||||
|
"""T114.2: Revert an event row's status to ``prior_status``.
|
||||||
|
|
||||||
|
Emitted by ``regenerate_assistant_turn`` when a superseded turn had
|
||||||
|
triggered a lifecycle transition (event_started / event_completed /
|
||||||
|
event_cancelled). The rollback step needs an inverse projection that
|
||||||
|
sets the row's status back to whatever it was *before* the now-
|
||||||
|
superseded transition fired.
|
||||||
|
|
||||||
|
Unlike the forward transitions (which guard against terminal-status
|
||||||
|
overwrites) this handler is unconditional — the entire purpose is to
|
||||||
|
reverse a transition, including reverting from a terminal status
|
||||||
|
(completed/cancelled) back to a non-terminal one.
|
||||||
|
"""
|
||||||
|
p = e.payload
|
||||||
|
conn.execute(
|
||||||
|
"UPDATE events SET status = ?, updated_at = datetime('now') "
|
||||||
|
"WHERE event_id = ?",
|
||||||
|
(p["prior_status"], p["event_id"]),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def get_event(conn: Connection, event_id: str) -> dict | None:
|
def get_event(conn: Connection, event_id: str) -> dict | None:
|
||||||
row = conn.execute(
|
row = conn.execute(
|
||||||
"SELECT event_id, chat_id, kind, status, props_json, planned_for, "
|
"SELECT event_id, chat_id, kind, status, props_json, planned_for, "
|
||||||
|
|||||||
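Reviewer sketch of the forward/inverse asymmetry described in the `event_status_reverted` docstring. It assumes a reduced `events` table with only three columns, and `apply_completed_guarded` is a hypothetical forward handler written for contrast; the real forward projections are not part of this hunk.

```python
import sqlite3


def apply_completed_guarded(conn, event_id):
    """Hypothetical forward transition: never overwrites a terminal status."""
    conn.execute(
        "UPDATE events SET status = 'completed', updated_at = datetime('now') "
        "WHERE event_id = ? AND status NOT IN ('completed', 'cancelled')",
        (event_id,),
    )


def apply_status_reverted(conn, payload):
    """Inverse projection: unconditionally restore the prior status."""
    conn.execute(
        "UPDATE events SET status = ?, updated_at = datetime('now') "
        "WHERE event_id = ?",
        (payload["prior_status"], payload["event_id"]),
    )


def status_of(conn, event_id):
    row = conn.execute(
        "SELECT status FROM events WHERE event_id = ?", (event_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, status TEXT, updated_at TEXT)"
)
conn.execute("INSERT INTO events VALUES ('ev1', 'started', NULL)")

apply_completed_guarded(conn, "ev1")
status_after_forward = status_of(conn, "ev1")

# The turn that completed the event was superseded, so roll the status back.
apply_status_reverted(conn, {"event_id": "ev1", "prior_status": "started"})
status_after_revert = status_of(conn, "ev1")
```

The revert has no status guard in its `WHERE` clause, which is what lets it move a row out of a terminal state.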
+71
-11
@@ -13,13 +13,18 @@ def _row_to_dict(conn: Connection, row: tuple) -> dict:
|
|||||||
|
|
||||||
@on("memory_written")
|
@on("memory_written")
|
||||||
def _apply_memory_written(conn: Connection, e: Event) -> None:
|
def _apply_memory_written(conn: Connection, e: Event) -> None:
|
||||||
|
# T109 (schema 0014): persist the projecting event's id on the memory
|
||||||
|
# row so cross-chat search results can deep-link back to the
|
||||||
|
# originating turn (T111). Older memory rows projected before 0014
|
||||||
|
# ran read NULL here — the column is nullable for that reason.
|
||||||
p = e.payload
|
p = e.payload
|
||||||
conn.execute(
|
conn.execute(
|
||||||
"INSERT INTO memories ("
|
"INSERT INTO memories ("
|
||||||
"owner_id, chat_id, scene_id, pov_summary, "
|
"owner_id, chat_id, scene_id, pov_summary, "
|
||||||
"witness_you, witness_host, witness_guest, "
|
"witness_you, witness_host, witness_guest, "
|
||||||
"chat_clock_at, source, reliability, significance, pinned, auto_pinned"
|
"chat_clock_at, source, reliability, significance, pinned, auto_pinned, "
|
||||||
") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
|
"event_id"
|
||||||
|
") VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
|
||||||
(
|
(
|
||||||
p["owner_id"],
|
p["owner_id"],
|
||||||
p["chat_id"],
|
p["chat_id"],
|
||||||
@@ -34,6 +39,7 @@ def _apply_memory_written(conn: Connection, e: Event) -> None:
|
|||||||
int(p.get("significance", 1)),
|
int(p.get("significance", 1)),
|
||||||
int(p.get("pinned", 0)),
|
int(p.get("pinned", 0)),
|
||||||
int(p.get("auto_pinned", 0)),
|
int(p.get("auto_pinned", 0)),
|
||||||
|
e.id,
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -112,6 +118,25 @@ SIGNIFICANCE_RANK_BIAS = 0.5
|
|||||||
RRF_CONST = 60
|
RRF_CONST = 60
|
||||||
|
|
||||||
|
|
||||||
|
def _max_event_id(conn: Connection, owner_id: str) -> int:
|
||||||
|
"""Return the largest ``memories.id`` for ``owner_id`` (1 if none exist).
|
||||||
|
|
||||||
|
Used as the recency-boost denominator by both ``_composite_rerank`` and
|
||||||
|
``_rrf_fuse_and_rerank`` (T104). The row id is a monotonic recency proxy
|
||||||
|
— newer memories have larger ids — so dividing by the per-owner max keeps
|
||||||
|
the boost in [0, 1] regardless of how many memories the owner has.
|
||||||
|
|
||||||
|
Returns 1 (not 0) when the owner has no rows so callers can divide by
|
||||||
|
the result without a guard. The "no memories" case never actually hits
|
||||||
|
this helper because the FTS query above would have returned no rows,
|
||||||
|
but the safe default keeps the helper trivially reusable.
|
||||||
|
"""
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
|
||||||
|
).fetchone()
|
||||||
|
return row[0] if row and row[0] else 1
|
||||||
|
|
||||||
|
|
||||||
def search_memories(
|
def search_memories(
|
||||||
conn: Connection,
|
conn: Connection,
|
||||||
owner_id: str,
|
owner_id: str,
|
||||||
@@ -163,6 +188,14 @@ def search_memories(
|
|||||||
|
|
||||||
When ``query_vector`` is None: FTS-only behaviour unchanged — all
|
When ``query_vector`` is None: FTS-only behaviour unchanged — all
|
||||||
Phase 1-3.5 callers see the same row shape and ordering as before.
|
Phase 1-3.5 callers see the same row shape and ordering as before.
|
||||||
|
|
||||||
|
**Row-shape contract (T104):** every returned dict carries an
|
||||||
|
``fts_rank`` key. For FTS hits this is the BM25 score (a negative float,
|
||||||
|
lower-is-better). For *vector-only* hits surfaced by the fused path —
|
||||||
|
rows that matched the query embedding but did NOT match FTS — the
|
||||||
|
``fts_rank`` value is ``None``. Downstream consumers must accept
|
||||||
|
``None`` here; do not assume ``fts_rank`` is always numeric. The
|
||||||
|
``composite_score`` is always a float on every returned row.
|
||||||
"""
|
"""
|
||||||
if witness_role not in _VALID_WITNESS_ROLES:
|
if witness_role not in _VALID_WITNESS_ROLES:
|
||||||
raise ValueError(
|
raise ValueError(
|
||||||
@@ -180,12 +213,20 @@ def search_memories(
|
|||||||
# channel) so memories that are weak in FTS but strong in vector — and
|
# channel) so memories that are weak in FTS but strong in vector — and
|
||||||
# vice versa — make it into the merge pool.
|
# vice versa — make it into the merge pool.
|
||||||
over_fetch = max(k * 2, 20) if query_vector is not None else max(k * 4, 20)
|
over_fetch = max(k * 2, 20) if query_vector is not None else max(k * 4, 20)
|
||||||
|
# T113: branch-scope filter on ``m.event_id`` (T109's column). Memories
|
||||||
|
# whose ``event_id`` is NULL — projected before the 0014 schema migration
|
||||||
|
# ran — are *included* unconditionally so the branch filter never breaks
|
||||||
|
# legacy retrieval. Newer rows respect the active branch's bounds.
|
||||||
|
from chat.state.branches import active_branch_event_ids
|
||||||
|
|
||||||
|
origin, head = active_branch_event_ids(conn)
|
||||||
sql = (
|
sql = (
|
||||||
f"SELECT {select_list}, memories_fts.rank AS fts_rank "
|
f"SELECT {select_list}, memories_fts.rank AS fts_rank "
|
||||||
"FROM memories_fts "
|
"FROM memories_fts "
|
||||||
"JOIN memories m ON m.id = memories_fts.rowid "
|
"JOIN memories m ON m.id = memories_fts.rowid "
|
||||||
f"WHERE m.owner_id = ? AND m.{witness_col} = 1 "
|
f"WHERE m.owner_id = ? AND m.{witness_col} = 1 "
|
||||||
"AND memories_fts MATCH ? "
|
"AND memories_fts MATCH ? "
|
||||||
|
"AND (m.event_id IS NULL OR m.event_id BETWEEN ? AND ?) "
|
||||||
# T57: significance multiplier biases the FTS over-fetch order. BM25
|
# T57: significance multiplier biases the FTS over-fetch order. BM25
|
||||||
# ``rank`` is lower-is-better, so subtracting ``significance * BIAS``
|
# ``rank`` is lower-is-better, so subtracting ``significance * BIAS``
|
||||||
# surfaces higher-significance rows above lower-significance rows with
|
# surfaces higher-significance rows above lower-significance rows with
|
||||||
@@ -194,7 +235,10 @@ def search_memories(
|
|||||||
"ORDER BY (memories_fts.rank - m.significance * ?) ASC "
|
"ORDER BY (memories_fts.rank - m.significance * ?) ASC "
|
||||||
"LIMIT ?"
|
"LIMIT ?"
|
||||||
)
|
)
|
||||||
cur = conn.execute(sql, (owner_id, query, SIGNIFICANCE_RANK_BIAS, over_fetch))
|
cur = conn.execute(
|
||||||
|
sql,
|
||||||
|
(owner_id, query, origin, head, SIGNIFICANCE_RANK_BIAS, over_fetch),
|
||||||
|
)
|
||||||
rows = cur.fetchall()
|
rows = cur.fetchall()
|
||||||
|
|
||||||
# FTS-only path: preserve pre-T96 behaviour exactly.
|
# FTS-only path: preserve pre-T96 behaviour exactly.
|
||||||
@@ -227,10 +271,7 @@ def _composite_rerank(
|
|||||||
Extracted from ``search_memories`` so the no-vector path stays a single
|
Extracted from ``search_memories`` so the no-vector path stays a single
|
||||||
call and the fused path can re-use the same boost formulae after RRF.
|
call and the fused path can re-use the same boost formulae after RRF.
|
||||||
"""
|
"""
|
||||||
max_id_row = conn.execute(
|
max_id = _max_event_id(conn, owner_id)
|
||||||
"SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
|
|
||||||
).fetchone()
|
|
||||||
max_id = max_id_row[0] if max_id_row and max_id_row[0] else 1
|
|
||||||
|
|
||||||
result_cols = cols + ["fts_rank"]
|
result_cols = cols + ["fts_rank"]
|
||||||
enriched: list[dict] = []
|
enriched: list[dict] = []
|
||||||
@@ -301,6 +342,28 @@ def _rrf_fuse_and_rerank(
|
|||||||
query_vector=query_vector,
|
query_vector=query_vector,
|
||||||
k=vec_over_fetch,
|
k=vec_over_fetch,
|
||||||
)
|
)
|
||||||
|
# T113: drop vector hits that fall outside the active branch's event-id
|
||||||
|
# range. ``vector_search`` is a generic service used elsewhere; the
|
||||||
|
# branch filter applied to the FTS leg also has to apply here so the
|
||||||
|
# fused result respects the same scope. Memories with NULL event_id
|
||||||
|
# (legacy rows projected before T109's 0014 schema migration) are
|
||||||
|
# included unconditionally — same policy as the FTS leg.
|
||||||
|
from chat.state.branches import _NO_HEAD_CLAMP, active_branch_event_ids
|
||||||
|
|
||||||
|
vec_origin, vec_head = active_branch_event_ids(conn)
|
||||||
|
if vec_hits and (vec_origin > 0 or vec_head < _NO_HEAD_CLAMP):
|
||||||
|
vec_ids = [h["memory_id"] for h in vec_hits]
|
||||||
|
placeholders_v = ",".join("?" * len(vec_ids))
|
||||||
|
in_range = {
|
||||||
|
row[0]
|
||||||
|
for row in conn.execute(
|
||||||
|
f"SELECT id FROM memories "
|
||||||
|
f"WHERE id IN ({placeholders_v}) "
|
||||||
|
f" AND (event_id IS NULL OR event_id BETWEEN ? AND ?)",
|
||||||
|
(*vec_ids, vec_origin, vec_head),
|
||||||
|
).fetchall()
|
||||||
|
}
|
||||||
|
vec_hits = [h for h in vec_hits if h["memory_id"] in in_range]
|
||||||
vec_rank_by_id: dict[int, int] = {
|
vec_rank_by_id: dict[int, int] = {
|
||||||
hit["memory_id"]: rank for rank, hit in enumerate(vec_hits)
|
hit["memory_id"]: rank for rank, hit in enumerate(vec_hits)
|
||||||
}
|
}
|
||||||
@@ -343,10 +406,7 @@ def _rrf_fuse_and_rerank(
|
|||||||
|
|
||||||
# Final composite re-rank: significance + recency boosts on top of the
|
# Final composite re-rank: significance + recency boosts on top of the
|
||||||
# negated fusion score so the sort direction matches the FTS-only path.
|
# negated fusion score so the sort direction matches the FTS-only path.
|
||||||
max_id_row = conn.execute(
|
max_id = _max_event_id(conn, owner_id)
|
||||||
"SELECT MAX(id) FROM memories WHERE owner_id = ?", (owner_id,)
|
|
||||||
).fetchone()
|
|
||||||
max_id = max_id_row[0] if max_id_row and max_id_row[0] else 1
|
|
||||||
|
|
||||||
result_cols = cols + ["fts_rank"]
|
result_cols = cols + ["fts_rank"]
|
||||||
enriched: list[dict] = []
|
enriched: list[dict] = []
|
||||||
|
|||||||
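Reviewer sketch of the two ingredients this file's hunks refer to: reciprocal rank fusion with `RRF_CONST = 60`, and the id-based recency ratio that `_max_event_id` serves as a denominator for. The function names and the 1-based rank offset are assumptions; the diff does not show how the project combines these terms into `composite_score`, so the sketch keeps them separate.

```python
RRF_CONST = 60  # same value as the constant visible in the hunk context


def rrf_fuse(fts_ids, vec_ids, k=RRF_CONST):
    """Fuse two best-first id rankings; each list adds 1 / (k + rank)."""
    scores = {}
    for ranking in (fts_ids, vec_ids):
        for rank, memory_id in enumerate(ranking):
            # enumerate() is 0-based; the +1 makes the top rank equal to 1.
            scores[memory_id] = scores.get(memory_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


def recency_boost(memory_id, max_id):
    """Ratio in (0, 1]: newer rows (larger ids) score closer to 1."""
    return memory_id / max_id


fused = rrf_fuse([1, 2, 3], [3, 1, 4])
```

An id that appears in both rankings accumulates two terms, so it outranks ids that appear in only one list even when those sit slightly higher in their own channel.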
+291
-7
@@ -5,7 +5,12 @@ body {
|
|||||||
color: #1c1c1c;
|
color: #1c1c1c;
|
||||||
background: #fafafa;
|
background: #fafafa;
|
||||||
display: flex;
|
display: flex;
|
||||||
min-height: 100vh;
|
/* Locked to viewport (was ``min-height: 100vh``) so flex children
|
||||||
|
like the chat ``.timeline`` get a bounded height and can use
|
||||||
|
``overflow-y: auto`` to scroll independently. The other pages
|
||||||
|
have ``.content`` with ``overflow: auto`` so their own
|
||||||
|
overflow still scrolls inside the right pane. */
|
||||||
|
height: 100vh;
|
||||||
}
|
}
|
||||||
.rail {
|
.rail {
|
||||||
width: 200px;
|
width: 200px;
|
||||||
@@ -101,12 +106,291 @@ code { font-family: ui-monospace, "SF Mono", Menlo, monospace; }
|
|||||||
}
|
}
|
||||||
.turn-input { display: flex; flex-direction: column; gap: 8px; padding-top: 12px; border-top: 1px solid #e5e5e5; }
|
.turn-input { display: flex; flex-direction: column; gap: 8px; padding-top: 12px; border-top: 1px solid #e5e5e5; }
|
||||||
.turn-input textarea { padding: 8px; font: inherit; border: 1px solid #ccc; border-radius: 3px; resize: vertical; }
|
.turn-input textarea { padding: 8px; font: inherit; border: 1px solid #ccc; border-radius: 3px; resize: vertical; }
|
||||||
.drawer { position: fixed; top: 0; right: 0; width: 360px; height: 100vh; background: #fff; border-left: 1px solid #e5e5e5; padding: 16px; overflow-y: auto; z-index: 10; }
|
/* ===========================================================
|
||||||
.drawer[hidden] { display: none; }
|
Drawer — director's notebook overlay
|
||||||
.drawer-content { display: flex; flex-direction: column; gap: 16px; }
|
===========================================================
|
||||||
.drawer-header { display: flex; align-items: center; justify-content: space-between; padding-bottom: 8px; border-bottom: 1px solid #e5e5e5; }
|
Editorial popup design: a warm-paper panel floats over an inky
|
||||||
.drawer-close { border: none; background: transparent; color: #1c1c1c; font-size: 24px; padding: 0 4px; cursor: pointer; }
|
blurred backdrop. Single accent serif (Newsreader) at the title,
|
||||||
.drawer-section h3 { margin: 0 0 8px; font-size: 14px; text-transform: uppercase; letter-spacing: 0.5px; color: #666; }
|
single muted-amber accent for primary interactives, generous
|
||||||
|
spacing, controlled motion.
|
||||||
|
|
||||||
|
Design tokens (scoped to the drawer so the rest of the app stays
|
||||||
|
on its existing palette).
|
||||||
|
*/
|
||||||
|
.drawer-modal {
|
||||||
|
--paper: #f6f1e8; /* warm off-white panel */
|
||||||
|
--paper-edge: #e7dfce;
|
||||||
|
--ink: #1a1d29; /* deep ink-blue */
|
||||||
|
--ink-soft: #38405a;
|
||||||
|
--ink-faint: #6c7390;
|
||||||
|
--accent: #b97e30; /* muted amber */
|
||||||
|
--accent-soft: #efd9b1;
|
||||||
|
--rule: rgba(26, 29, 41, 0.10);
|
||||||
|
--shadow-near: 0 1px 2px rgba(26, 29, 41, 0.08);
|
||||||
|
--shadow-far: 0 32px 64px -24px rgba(26, 29, 41, 0.45),
|
||||||
|
0 12px 24px -12px rgba(26, 29, 41, 0.25);
|
||||||
|
--serif: "Newsreader", "Iowan Old Style", Georgia, serif;
|
||||||
|
--duration: 180ms;
|
||||||
|
--ease: cubic-bezier(0.22, 0.61, 0.36, 1);
|
||||||
|
|
||||||
|
position: fixed;
|
||||||
|
inset: 0;
|
||||||
|
z-index: 100;
|
||||||
|
display: flex;
|
||||||
|
align-items: center;
|
||||||
|
justify-content: center;
|
||||||
|
padding: clamp(16px, 4vw, 48px);
|
||||||
|
/* Open/close transitions live here so the backdrop and panel
|
||||||
|
can fade together; .is-open promotes both to their visible
|
||||||
|
end-states. */
|
||||||
|
opacity: 0;
|
||||||
|
transition: opacity var(--duration) var(--ease);
|
||||||
|
}
|
||||||
|
.drawer-modal[hidden] { display: none; }
|
||||||
|
.drawer-modal.is-open { opacity: 1; }
|
||||||
|
|
||||||
|
.drawer-modal-backdrop {
|
||||||
|
position: absolute;
|
||||||
|
inset: 0;
|
||||||
|
background:
|
||||||
|
radial-gradient(circle at 30% 25%, rgba(26, 29, 41, 0.55), rgba(26, 29, 41, 0.85) 75%);
|
||||||
|
backdrop-filter: blur(6px) saturate(1.05);
|
||||||
|
-webkit-backdrop-filter: blur(6px) saturate(1.05);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* The chat behind the modal stops scrolling and loses focus
|
||||||
|
entirely. body class set by the JS; resets on close. */
|
||||||
|
body.drawer-modal-open { overflow: hidden; }
|
||||||
|
|
||||||
|
.drawer-panel {
|
||||||
|
position: relative;
|
||||||
|
width: 100%;
|
||||||
|
max-width: 720px;
|
||||||
|
max-height: min(82vh, 760px);
|
||||||
|
display: flex;
|
||||||
|
flex-direction: column;
|
||||||
|
background: var(--paper);
|
||||||
|
border-radius: 6px;
|
||||||
|
box-shadow: var(--shadow-far);
|
||||||
|
/* Subtle warm-paper texture: a single soft inner highlight at the
|
||||||
|
top edge plus a faint vignette toward the bottom. Cheap, no
|
||||||
|
external image. */
|
||||||
|
background-image:
|
||||||
|
linear-gradient(180deg,
|
||||||
|
rgba(255, 255, 255, 0.50) 0%,
|
||||||
|
rgba(255, 255, 255, 0.00) 18%,
|
||||||
|
rgba(0, 0, 0, 0.00) 80%,
|
||||||
|
rgba(120, 100, 70, 0.06) 100%);
|
||||||
|
/* A 2px accent rule at the very top, set INSIDE the radius so the
|
||||||
|
corners stay clean. ::before serves as a hairline accent. */
|
||||||
|
overflow: hidden;
|
||||||
|
/* Open/close: the backdrop fades; the panel additionally lifts
|
||||||
|
slightly and scales from 98% to 100%. Controlled, no bounce. */
|
||||||
|
transform: translateY(8px) scale(0.98);
|
||||||
|
transition:
|
||||||
|
transform var(--duration) var(--ease),
|
||||||
|
opacity var(--duration) var(--ease);
|
||||||
|
opacity: 0.98;
|
||||||
|
}
|
||||||
|
.drawer-modal.is-open .drawer-panel {
|
||||||
|
transform: translateY(0) scale(1);
|
||||||
|
opacity: 1;
|
||||||
|
}
|
||||||
|
.drawer-panel::before {
|
||||||
|
content: "";
|
||||||
|
position: absolute;
|
||||||
|
top: 0; left: 0; right: 0;
|
||||||
|
height: 2px;
|
||||||
|
background: linear-gradient(90deg,
|
||||||
|
transparent 0%, var(--accent) 14%, var(--accent) 86%, transparent 100%);
|
||||||
|
opacity: 0.85;
|
||||||
|
}
|
||||||
|
|
||||||
|
.drawer-panel-header {
|
||||||
|
display: flex;
|
||||||
|
align-items: baseline;
|
||||||
|
justify-content: space-between;
|
||||||
|
gap: 16px;
|
||||||
|
padding: 22px 28px 14px;
|
||||||
|
border-bottom: 1px solid var(--rule);
|
||||||
|
flex-shrink: 0;
|
||||||
|
}
|
||||||
|
.drawer-panel-header h2 {
|
||||||
|
margin: 0;
|
||||||
|
font-family: var(--serif);
|
||||||
|
font-weight: 500;
|
||||||
|
font-size: clamp(22px, 2.4vw, 28px);
|
||||||
|
letter-spacing: -0.01em;
|
||||||
|
color: var(--ink);
|
||||||
|
/* Tiny editorial flourish: lowercase the title so it reads like
|
||||||
|
a column header in a printed broadside. */
|
||||||
|
text-transform: lowercase;
|
||||||
|
}
|
||||||
|
.drawer-panel-header h2::after {
|
||||||
|
content: "";
|
||||||
|
display: inline-block;
|
||||||
|
width: 6px;
|
||||||
|
height: 6px;
|
||||||
|
margin-left: 10px;
|
||||||
|
border-radius: 50%;
|
||||||
|
background: var(--accent);
|
||||||
|
vertical-align: middle;
|
||||||
|
transform: translateY(-2px);
|
||||||
|
}
|
||||||
|
|
||||||
|
.drawer-panel-close {
|
||||||
|
appearance: none;
|
||||||
|
background: transparent;
|
||||||
|
border: none;
|
||||||
|
border-radius: 4px;
|
||||||
|
color: var(--ink-soft);
|
||||||
|
font-family: var(--serif);
|
||||||
|
font-size: 28px;
|
||||||
|
line-height: 1;
|
||||||
|
width: 36px;
|
||||||
|
height: 36px;
|
||||||
|
cursor: pointer;
|
||||||
|
transition:
|
||||||
|
background-color var(--duration) var(--ease),
|
||||||
|
color var(--duration) var(--ease),
|
||||||
|
transform var(--duration) var(--ease);
|
||||||
|
}
|
||||||
|
.drawer-panel-close:hover {
|
||||||
|
background: rgba(26, 29, 41, 0.06);
|
||||||
|
color: var(--ink);
|
||||||
|
transform: rotate(90deg);
|
||||||
|
}
|
||||||
|
.drawer-panel-close:focus-visible {
|
||||||
|
outline: 2px solid var(--accent);
|
||||||
|
outline-offset: 2px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.drawer-panel-body {
|
||||||
|
flex: 1 1 auto;
|
||||||
|
min-height: 0;
|
||||||
|
overflow-y: auto;
|
||||||
|
padding: 18px 28px 28px;
|
||||||
|
/* Restrict typography inside the body to the existing app font
|
||||||
|
so the existing drawer markup (forms, lists, buttons rendered
|
||||||
|
by /chats/<id>/drawer) keeps its current density and read-flow.
|
||||||
|
We only re-color a few items so they sit on the warm paper. */
|
||||||
|
color: var(--ink);
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-panel-loading {
|
||||||
|
font-family: var(--serif);
|
||||||
|
font-style: italic;
|
||||||
|
color: var(--ink-faint);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Scoped overrides for the drawer-content the server renders into
|
||||||
|
.drawer-panel-body. Keeps the existing class names working but
|
||||||
|
re-tunes them for the warm-paper context. */
|
||||||
|
/* Tabs nav — sits at the top of .drawer-content and lets the user
|
||||||
|
pivot between Scene / Cast / Story / Turns groups. Underline-style
|
||||||
|
active indicator (a single muted-amber rule) keeps the editorial
|
||||||
|
feel — no pills, no boxes, no hover-fills. */
|
||||||
|
.drawer-panel-body .drawer-tabs {
|
||||||
|
display: flex;
|
||||||
|
gap: 6px;
|
||||||
|
margin: 0 -8px 14px; /* bleed the divider rule slightly past the body padding */
|
||||||
|
padding: 0 8px 0;
|
||||||
|
border-bottom: 1px solid var(--rule);
|
||||||
|
flex-wrap: wrap;
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab {
|
||||||
|
appearance: none;
|
||||||
|
background: transparent;
|
||||||
|
border: none;
|
||||||
|
padding: 10px 14px 12px;
|
||||||
|
margin-bottom: -1px; /* sit on top of the parent's border-bottom */
|
||||||
|
font-family: var(--serif);
|
||||||
|
font-size: 15px;
|
||||||
|
font-weight: 400;
|
||||||
|
letter-spacing: 0.02em;
|
||||||
|
color: var(--ink-faint);
|
||||||
|
border-bottom: 2px solid transparent;
|
||||||
|
cursor: pointer;
|
||||||
|
transition:
|
||||||
|
color var(--duration) var(--ease),
|
||||||
|
border-color var(--duration) var(--ease);
|
||||||
|
border-radius: 0; /* strip the global button radius */
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab:hover {
|
||||||
|
color: var(--ink);
|
||||||
|
background: transparent;
|
||||||
|
border-color: transparent;
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab.is-active {
|
||||||
|
color: var(--ink);
|
||||||
|
border-bottom-color: var(--accent);
|
||||||
|
background: transparent;
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab.is-active:hover {
|
||||||
|
background: transparent;
|
||||||
|
color: var(--ink);
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab:focus-visible {
|
||||||
|
outline: 2px solid var(--accent);
|
||||||
|
outline-offset: 2px;
|
||||||
|
border-radius: 2px;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Panes — only one visible at a time. Uses [hidden] so the JS can
|
||||||
|
toggle attribute-driven instead of class-driven. */
|
||||||
|
.drawer-panel-body .drawer-tab-pane[hidden] { display: none; }
|
||||||
|
|
||||||
|
/* Sections inside a pane: drop the section-level rules since the
|
||||||
|
tabs already segment the content. Keep the section h3 as a sub-
|
||||||
|
heading inside its pane — useful when a tab groups multiple
|
||||||
|
sections (e.g. Cast = Guest + Group + Edges). */
|
||||||
|
.drawer-panel-body .drawer-section {
|
||||||
|
padding: 14px 0 18px;
|
||||||
|
border-bottom: 1px solid var(--rule);
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-tab-pane > .drawer-section:first-child { padding-top: 6px; }
|
||||||
|
.drawer-panel-body .drawer-tab-pane > .drawer-section:last-child { border-bottom: none; padding-bottom: 4px; }
|
||||||
|
/* When a pane has only one section, suppress the redundant h3 since
|
||||||
|
the tab label is the same name. */
|
||||||
|
.drawer-panel-body .drawer-tab-pane:has(> .drawer-section:only-child) > .drawer-section > h3 {
|
||||||
|
display: none;
|
||||||
|
}
|
||||||
|
.drawer-panel-body .drawer-section h3 {
|
||||||
|
margin: 0 0 10px;
|
||||||
|
font-family: var(--serif);
|
||||||
|
font-weight: 500;
|
||||||
|
font-size: 12px;
|
||||||
|
letter-spacing: 0.16em;
|
||||||
|
text-transform: uppercase;
|
||||||
|
color: var(--accent);
|
||||||
|
}
|
||||||
|
.drawer-panel-body .activity-row,
|
||||||
|
.drawer-panel-body .edge-row { margin-bottom: 12px; }
|
||||||
|
.drawer-panel-body .activity-row strong,
|
||||||
|
.drawer-panel-body .edge-row strong { display: block; color: var(--ink); }
|
||||||
|
.drawer-panel-body .muted { color: var(--ink-faint); }
|
||||||
|
.drawer-panel-body button,
|
||||||
|
.drawer-panel-body .btn {
|
||||||
|
background: var(--ink);
|
||||||
|
border: 1px solid var(--ink);
|
||||||
|
color: var(--paper);
|
||||||
|
border-radius: 3px;
|
||||||
|
}
|
||||||
|
.drawer-panel-body button:hover,
|
||||||
|
.drawer-panel-body .btn:hover {
|
||||||
|
background: var(--accent);
|
||||||
|
border-color: var(--accent);
|
||||||
|
color: var(--ink);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Respect reduced-motion preference: no scale, no rotate, and
|
||||||
|
transitions collapse to 0ms, so the modal shows and hides without animation. */
|
||||||
|
@media (prefers-reduced-motion: reduce) {
|
||||||
|
.drawer-modal,
|
||||||
|
.drawer-panel,
|
||||||
|
.drawer-panel-close { transition-duration: 0ms; }
|
||||||
|
.drawer-panel { transform: none; }
|
||||||
|
.drawer-panel-close:hover { transform: none; }
|
||||||
|
}
|
||||||
.activity-row, .edge-row { margin-bottom: 12px; }
|
.activity-row, .edge-row { margin-bottom: 12px; }
|
||||||
.activity-row strong, .edge-row strong { display: block; }
|
.activity-row strong, .edge-row strong { display: block; }
|
||||||
.memory-list { list-style: none; padding: 0; margin: 0; }
|
.memory-list { list-style: none; padding: 0; margin: 0; }
|
||||||
|
|||||||
@@ -0,0 +1,34 @@
|
|||||||
|
{# T110.3: delete-impact modal partial.
|
||||||
|
|
||||||
|
Rendered from :func:`chat.web.drawer.delete_preview` via a Jinja2
|
||||||
|
TemplateResponse so HTML autoescape covers user-controllable fields
|
||||||
|
(item.kind, item.description, notes) automatically — the prior
|
||||||
|
f-string assembly required explicit html.escape() calls (T110.2)
|
||||||
|
which become redundant under autoescape.
|
||||||
|
|
||||||
|
Inputs:
|
||||||
|
``chat_id`` — the URL chat id (used to build the confirm form action).
|
||||||
|
``impact`` — an :class:`~chat.services.delete_impact.ImpactReport`.
|
||||||
|
#}
|
||||||
|
<div class="delete-impact-modal">
|
||||||
|
<h3>Delete event {{ impact.target_event_id }}?</h3>
|
||||||
|
<p>This will discard {{ impact.cascading|length }} events. Cascade:</p>
|
||||||
|
<ul class="delete-impact-cascade">
|
||||||
|
{% if impact.cascading %}
|
||||||
|
{% for item in impact.cascading %}
|
||||||
|
<li><strong>{{ item.kind }}</strong>: {{ item.description }}</li>
|
||||||
|
{% endfor %}
|
||||||
|
{% else %}
|
||||||
|
<li>none</li>
|
||||||
|
{% endif %}
|
||||||
|
</ul>
|
||||||
|
<ul class="delete-impact-notes">
|
||||||
|
{% for note in impact.notes %}
|
||||||
|
<li>{{ note }}</li>
|
||||||
|
{% endfor %}
|
||||||
|
</ul>
|
||||||
|
<form hx-post="/chats/{{ chat_id }}/drawer/turn/delete/{{ impact.target_event_id }}"
|
||||||
|
hx-target="#drawer" hx-swap="innerHTML">
|
||||||
|
<button type="submit">Confirm delete</button>
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
@@ -547,6 +547,25 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</details>
|
</details>
|
||||||
{% endif %}
|
{% endif %}
|
||||||
|
{# T110.4: bulk significance re-rate. Move every memory in this chat
|
||||||
|
at level_from to level_to with one manual_edit event per row, so
|
||||||
|
the audit trail stays per-memory. #}
|
||||||
|
<details class="bulk-significance">
|
||||||
|
<summary>Bulk re-rate significance</summary>
|
||||||
|
<form class="inline-edit"
|
||||||
|
hx-post="/chats/{{ chat.id }}/drawer/memory/significance/bulk"
|
||||||
|
hx-target="#drawer" hx-swap="innerHTML">
|
||||||
|
<label>
|
||||||
|
From:
|
||||||
|
<input type="number" name="level_from" min="0" max="3" value="0" required>
|
||||||
|
</label>
|
||||||
|
<label>
|
||||||
|
To:
|
||||||
|
<input type="number" name="level_to" min="0" max="3" value="1" required>
|
||||||
|
</label>
|
||||||
|
<button type="submit">Re-rate all</button>
|
||||||
|
</form>
|
||||||
|
</details>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section class="drawer-section">
|
<section class="drawer-section">
|
||||||
|
|||||||
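Reviewer sketch of the "one manual_edit event per row" idea behind the bulk re-rate form. The event dictionary shape, the `field`/`old`/`new` payload keys, and the `bulk_rerate_events` helper are hypothetical; only the 0 to 3 level range and the `manual_edit` kind come from the template above.

```python
def bulk_rerate_events(memories, level_from, level_to):
    """Build one manual_edit event per memory currently at level_from."""
    for level in (level_from, level_to):
        if not 0 <= level <= 3:
            raise ValueError(f"significance level out of range: {level}")
    if level_from == level_to:
        return []  # nothing to change, so emit no events
    return [
        {
            "kind": "manual_edit",
            "payload": {
                "memory_id": memory["id"],
                "field": "significance",
                "old": level_from,
                "new": level_to,
            },
        }
        for memory in memories
        if memory["significance"] == level_from
    ]


sample = [
    {"id": 1, "significance": 0},
    {"id": 2, "significance": 2},
    {"id": 3, "significance": 0},
]
events = bulk_rerate_events(sample, level_from=0, level_to=1)
```

Emitting a separate event per memory keeps each row's history independently auditable and revertible, at the cost of a longer event log for large chats.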
@@ -5,7 +5,18 @@
|
|||||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
<title>{% block title %}chat{% endblock %}</title>
|
<title>{% block title %}chat{% endblock %}</title>
|
||||||
<link rel="stylesheet" href="/static/app.css">
|
<link rel="stylesheet" href="/static/app.css">
|
||||||
|
<!-- Newsreader: refined editorial serif for accent typography
|
||||||
|
(drawer modal title, etc.). Body stays system-ui for read-
|
||||||
|
flow legibility. Subset to the two weights we use to keep the
|
||||||
|
payload tiny. -->
|
||||||
|
<link rel="preconnect" href="https://fonts.googleapis.com">
|
||||||
|
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
||||||
|
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Newsreader:opsz,wght@6..72,400;6..72,500&display=swap">
|
||||||
<script src="https://unpkg.com/htmx.org@1.9.12" defer></script>
|
<script src="https://unpkg.com/htmx.org@1.9.12" defer></script>
|
||||||
|
<!-- htmx 1.x bundles its SSE extension at /dist/ext/sse.js. The
|
||||||
|
standalone htmx-ext-sse@2.x package is for htmx 2.x and is
|
||||||
|
not compatible with the 1.x ext API. -->
|
||||||
|
<script src="https://unpkg.com/htmx.org@1.9.12/dist/ext/sse.js" defer></script>
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
{% block body %}{% endblock %}
|
{% block body %}{% endblock %}
|
||||||
|
|||||||
+358
-20
@@ -7,7 +7,9 @@
|
|||||||
<header class="chat-header">
|
<header class="chat-header">
|
||||||
<h1>{{ host_bot.name }}</h1>
|
<h1>{{ host_bot.name }}</h1>
|
||||||
<div class="chat-meta muted">{{ chat.time }}</div>
|
<div class="chat-meta muted">{{ chat.time }}</div>
|
||||||
<button class="drawer-toggle" type="button" aria-controls="drawer" aria-expanded="false">Drawer</button>
|
<button class="drawer-toggle" type="button"
|
||||||
|
aria-controls="drawer-modal" aria-expanded="false"
|
||||||
|
aria-haspopup="dialog">Drawer</button>
|
||||||
</header>
|
</header>
|
||||||
|
|
||||||
<section class="timeline" id="timeline"
|
<section class="timeline" id="timeline"
|
||||||
@@ -30,21 +32,251 @@
|
|||||||
<button type="submit">Send</button>
|
<button type="submit">Send</button>
|
||||||
</form>
|
</form>
|
||||||
|
|
||||||
<aside class="drawer" id="drawer" hidden
|
|
||||||
hx-get="/chats/{{ chat.id }}/drawer"
|
|
||||||
hx-trigger="revealed"
|
|
||||||
hx-swap="innerHTML">
|
|
||||||
<p class="muted">Loading drawer…</p>
|
|
||||||
</aside>
|
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
<!-- Drawer modal — director's notebook overlay.
|
||||||
|
Sits outside .chat-shell so its position:fixed backdrop covers the
|
||||||
|
whole viewport. The panel still pulls its inner HTML from
|
||||||
|
/chats/<id>/drawer via HTMX; trigger is a custom 'drawer-open'
|
||||||
|
event that the open/close script dispatches each time the modal
|
||||||
|
opens, so the content refreshes on every open. -->
|
||||||
|
<div class="drawer-modal" id="drawer-modal" hidden
|
||||||
|
role="dialog"
|
||||||
|
aria-modal="true"
|
||||||
|
aria-labelledby="drawer-modal-title">
|
||||||
|
<div class="drawer-modal-backdrop" data-drawer-close></div>
|
||||||
|
<article class="drawer-panel">
|
||||||
|
<header class="drawer-panel-header">
|
||||||
|
<h2 id="drawer-modal-title">Drawer</h2>
|
||||||
|
<button class="drawer-panel-close" type="button"
|
||||||
|
data-drawer-close
|
||||||
|
aria-label="Close drawer">×</button>
|
||||||
|
</header>
|
||||||
|
<div class="drawer-panel-body" id="drawer"
|
||||||
|
hx-get="/chats/{{ chat.id }}/drawer"
|
||||||
|
hx-trigger="drawer-open from:body"
|
||||||
|
hx-swap="innerHTML">
|
||||||
|
<p class="muted drawer-panel-loading">Loading…</p>
|
||||||
|
</div>
|
||||||
|
</article>
|
||||||
|
</div>
|
||||||
|
|
||||||
<script>
|
<script>
|
||||||
document.querySelector('.drawer-toggle')?.addEventListener('click', (e) => {
|
// Drawer modal — open/close, focus management, and post-swap
|
||||||
const drawer = document.getElementById('drawer');
|
// tab-grouping. The server's /chats/<id>/drawer response is left
|
||||||
const isHidden = drawer.hasAttribute('hidden');
|
// unchanged; this script post-processes the swapped HTML to:
|
||||||
if (isHidden) drawer.removeAttribute('hidden');
|
// 1. Pull the bot name from the legacy <header><h2> and use it as
|
||||||
else drawer.setAttribute('hidden', '');
|
// the modal title.
|
||||||
e.target.setAttribute('aria-expanded', String(isHidden));
|
// 2. Remove the legacy header (it has its own onclick="hidden"
|
||||||
});
|
// close that targets the OLD drawer semantics — broken now).
|
||||||
|
// 3. Walk .drawer-section blocks and group them into 4 tabs by
|
||||||
|
// their <h3> title:
|
||||||
|
// Scene : Scene, Activity
|
||||||
|
// Cast : Guest, Group, Edges
|
||||||
|
// Story : Events, Threads, Branches
|
||||||
|
// Turns : Recent turns, Significance review
|
||||||
|
// A tab nav is rendered above the sections; clicking switches
|
||||||
|
// which group is visible. Empty tabs (no matching sections) are
|
||||||
|
// hidden.
|
||||||
|
(function () {
|
||||||
|
const modal = document.getElementById('drawer-modal');
|
||||||
|
const toggle = document.querySelector('.drawer-toggle');
|
||||||
|
if (!modal || !toggle) return;
|
||||||
|
const titleEl = modal.querySelector('#drawer-modal-title');
|
||||||
|
const body = modal.querySelector('.drawer-panel-body');
|
||||||
|
const panel = modal.querySelector('.drawer-panel');
|
||||||
|
|
||||||
|
let lastFocus = null;
|
||||||
|
|
||||||
|
function open() {
|
||||||
|
if (!modal.hasAttribute('hidden')) return;
|
||||||
|
lastFocus = document.activeElement;
|
||||||
|
modal.removeAttribute('hidden');
|
||||||
|
// Force reflow so the .is-open class triggers the transition.
|
||||||
|
void modal.offsetWidth;
|
||||||
|
modal.classList.add('is-open');
|
||||||
|
toggle.setAttribute('aria-expanded', 'true');
|
||||||
|
document.body.classList.add('drawer-modal-open');
|
||||||
|
// Re-fetch drawer content via the panel's hx-trigger.
|
||||||
|
document.body.dispatchEvent(new CustomEvent('drawer-open'));
|
||||||
|
// Focus the close button so Escape / Enter both work
|
||||||
|
// immediately and screen readers announce the dialog.
|
||||||
|
requestAnimationFrame(() => {
|
||||||
|
const closeBtn = modal.querySelector('.drawer-panel-close');
|
||||||
|
if (closeBtn) closeBtn.focus();
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function close() {
|
||||||
|
if (modal.hasAttribute('hidden')) return;
|
||||||
|
modal.classList.remove('is-open');
|
||||||
|
toggle.setAttribute('aria-expanded', 'false');
|
||||||
|
document.body.classList.remove('drawer-modal-open');
|
||||||
|
// Wait for the fade-out before fully hiding so the transition
|
||||||
|
// can play. Match the CSS duration.
|
||||||
|
setTimeout(() => {
|
||||||
|
modal.setAttribute('hidden', '');
|
||||||
|
if (lastFocus && typeof lastFocus.focus === 'function') {
|
||||||
|
lastFocus.focus();
|
||||||
|
}
|
||||||
|
}, 180);
|
||||||
|
}
|
||||||
|
|
||||||
|
toggle.addEventListener('click', open);
|
||||||
|
|
||||||
|
// Bind close DIRECTLY to every element flagged data-drawer-close.
|
||||||
|
// Event delegation through .stopPropagation() previously swallowed
|
||||||
|
// the close button's click (it sits inside .drawer-panel, which
|
||||||
|
// stops propagation to keep backdrop clicks from leaking through
|
||||||
|
// the panel itself). Direct binding sidesteps that and keeps the
|
||||||
|
// panel-stops-propagation rule for everything else.
|
||||||
|
function bindCloseTargets(root) {
|
||||||
|
root.querySelectorAll('[data-drawer-close]').forEach((el) => {
|
||||||
|
// Idempotent: only bind once per element.
|
||||||
|
if (el.dataset.drawerCloseBound === '1') return;
|
||||||
|
el.dataset.drawerCloseBound = '1';
|
||||||
|
el.addEventListener('click', (e) => {
|
||||||
|
e.preventDefault();
|
||||||
|
e.stopPropagation();
|
||||||
|
close();
|
||||||
|
});
|
||||||
|
});
|
||||||
|
}
|
||||||
|
bindCloseTargets(modal);
|
||||||
|
// Clicks inside the panel that AREN'T close targets must not
|
||||||
|
// reach the backdrop click handler. (We don't have one currently
|
||||||
|
// — backdrop close is via data-drawer-close on the backdrop div —
|
||||||
|
// but stopPropagation here is defensive against future handlers.)
|
||||||
|
panel.addEventListener('click', (e) => e.stopPropagation());
|
||||||
|
|
||||||
|
// Escape closes only when the modal is open.
|
||||||
|
document.addEventListener('keydown', (e) => {
|
||||||
|
if (e.key === 'Escape' && !modal.hasAttribute('hidden')) {
|
||||||
|
e.preventDefault();
|
||||||
|
close();
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// ---- Tabs: group server-rendered .drawer-section blocks ----
|
||||||
|
|
||||||
|
const TAB_GROUPS = [
|
||||||
|
{ id: 'scene', label: 'Scene', sections: ['Scene', 'Activity'] },
|
||||||
|
{ id: 'cast', label: 'Cast', sections: ['Guest', 'Group', 'Edges'] },
|
||||||
|
{ id: 'story', label: 'Story', sections: ['Events', 'Threads', 'Branches'] },
|
||||||
|
{ id: 'turns', label: 'Turns', sections: ['Recent turns', 'Significance review'] },
|
||||||
|
];
|
||||||
|
|
||||||
|
function tabIdForSection(h3Text) {
|
||||||
|
const t = (h3Text || '').trim();
|
||||||
|
for (const g of TAB_GROUPS) {
|
||||||
|
if (g.sections.includes(t)) return g.id;
|
||||||
|
}
|
||||||
|
return 'scene'; // unknown sections fall into the first tab
|
||||||
|
}
|
||||||
|
|
||||||
|
function buildTabs() {
|
||||||
|
// Clean up the legacy server-rendered header inside the body
|
||||||
|
// (duplicate close + duplicate title).
|
||||||
|
const legacyHeader = body.querySelector(':scope > .drawer-content > .drawer-header');
|
||||||
|
if (legacyHeader) {
|
||||||
|
// Promote the bot name to the modal title before discarding.
|
||||||
|
const h2 = legacyHeader.querySelector('h2');
|
||||||
|
if (h2 && h2.textContent.trim()) {
|
||||||
|
titleEl.textContent = h2.textContent.trim();
|
||||||
|
}
|
||||||
|
legacyHeader.remove();
|
||||||
|
}
|
||||||
|
|
||||||
|
// The drawer-content wrapper holds all the sections. Group them.
|
||||||
|
const content = body.querySelector('.drawer-content');
|
||||||
|
if (!content) return;
|
||||||
|
|
||||||
|
const sections = Array.from(content.querySelectorAll(':scope > .drawer-section'));
|
||||||
|
if (sections.length === 0) return;
|
||||||
|
|
||||||
|
// Bucket sections by tab id.
|
||||||
|
const buckets = new Map(TAB_GROUPS.map((g) => [g.id, []]));
|
||||||
|
for (const sec of sections) {
|
||||||
|
const h3 = sec.querySelector(':scope > h3');
|
||||||
|
const tabId = tabIdForSection(h3 ? h3.textContent : '');
|
||||||
|
buckets.get(tabId).push(sec);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Build the tab nav. Skip empty buckets so the nav reflects
|
||||||
|
// what the chat actually has (e.g. no Guest tab when 1:1).
|
||||||
|
const nav = document.createElement('nav');
|
||||||
|
nav.className = 'drawer-tabs';
|
||||||
|
nav.setAttribute('role', 'tablist');
|
||||||
|
|
||||||
|
const panes = document.createElement('div');
|
||||||
|
panes.className = 'drawer-tab-panes';
|
||||||
|
|
||||||
|
let firstActive = null;
|
||||||
|
for (const group of TAB_GROUPS) {
|
||||||
|
const items = buckets.get(group.id);
|
||||||
|
if (!items.length) continue;
|
||||||
|
|
||||||
|
const btn = document.createElement('button');
|
||||||
|
btn.type = 'button';
|
||||||
|
btn.className = 'drawer-tab';
|
||||||
|
btn.setAttribute('role', 'tab');
|
||||||
|
btn.id = `drawer-tab-${group.id}`;
|
||||||
|
btn.dataset.tabTarget = group.id;
|
||||||
|
btn.textContent = group.label;
|
||||||
|
btn.setAttribute('aria-controls', `drawer-pane-${group.id}`);
|
||||||
|
nav.appendChild(btn);
|
||||||
|
|
||||||
|
const pane = document.createElement('section');
|
||||||
|
pane.className = 'drawer-tab-pane';
|
||||||
|
pane.id = `drawer-pane-${group.id}`;
|
||||||
|
pane.setAttribute('role', 'tabpanel');
|
||||||
|
pane.setAttribute('aria-labelledby', `drawer-tab-${group.id}`);
|
||||||
|
// Move the section nodes into the pane (preserves any HTMX
|
||||||
|
// event listeners and the sections' interactive forms).
|
||||||
|
for (const sec of items) pane.appendChild(sec);
|
||||||
|
panes.appendChild(pane);
|
||||||
|
|
||||||
|
if (!firstActive) firstActive = group.id;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Replace the existing content with [nav][panes].
|
||||||
|
content.innerHTML = '';
|
||||||
|
content.appendChild(nav);
|
||||||
|
content.appendChild(panes);
|
||||||
|
|
||||||
|
// Tab click handler.
|
||||||
|
nav.addEventListener('click', (e) => {
|
||||||
|
const target = e.target;
|
||||||
|
if (!(target instanceof HTMLElement)) return;
|
||||||
|
const tabId = target.dataset.tabTarget;
|
||||||
|
if (!tabId) return;
|
||||||
|
activateTab(content, tabId);
|
||||||
|
});
|
||||||
|
|
||||||
|
if (firstActive) activateTab(content, firstActive);
|
||||||
|
}
|
||||||
|
|
||||||
|
function activateTab(content, tabId) {
|
||||||
|
content.querySelectorAll('.drawer-tab').forEach((btn) => {
|
||||||
|
const isActive = btn.dataset.tabTarget === tabId;
|
||||||
|
btn.classList.toggle('is-active', isActive);
|
||||||
|
btn.setAttribute('aria-selected', String(isActive));
|
||||||
|
btn.setAttribute('tabindex', isActive ? '0' : '-1');
|
||||||
|
});
|
||||||
|
content.querySelectorAll('.drawer-tab-pane').forEach((pane) => {
|
||||||
|
const isActive = pane.id === `drawer-pane-${tabId}`;
|
||||||
|
pane.toggleAttribute('hidden', !isActive);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run after every HTMX swap into the panel body. Covers the
|
||||||
|
// initial open AND any subsequent server-driven re-render
|
||||||
|
// (e.g. an in-drawer form submit that returns refreshed HTML).
|
||||||
|
body.addEventListener('htmx:afterSwap', () => {
|
||||||
|
buildTabs();
|
||||||
|
bindCloseTargets(modal);
|
||||||
|
});
|
||||||
|
})();
|
||||||
</script>
|
</script>
|
||||||
<script>
|
<script>
|
||||||
// Streaming UX (T34): typing indicator, Stop button, send-lock,
|
// Streaming UX (T34): typing indicator, Stop button, send-lock,
|
||||||
@@ -66,6 +298,44 @@ document.querySelector('.drawer-toggle')?.addEventListener('click', (e) => {
|
|||||||
let isStreaming = false;
|
let isStreaming = false;
|
||||||
let typingEl = null;
|
let typingEl = null;
|
||||||
|
|
||||||
|
// Sticky-bottom autoscroll: scroll the timeline to the latest
|
||||||
|
// message when new content arrives, but ONLY if the user is
|
||||||
|
// already pinned to the bottom. Once they scroll up to read older
|
||||||
|
// turns, we leave their position alone until they manually scroll
|
||||||
|
// back down.
|
||||||
|
//
|
||||||
|
// ``isPinnedToBottom`` flips on every scroll event based on
|
||||||
|
// distance-from-bottom (with a small tolerance so a few pixels of
|
||||||
|
// overshoot from a layout shift doesn't unpin). A MutationObserver
|
||||||
|
// catches every node added to the timeline — covers the SSE-
|
||||||
|
// injected ``turn_html`` swap, the optimistic ``appendUserTurn``
|
||||||
|
// render, and the streaming typing-indicator updates.
|
||||||
|
const STICK_TOLERANCE_PX = 64;
|
||||||
|
let isPinnedToBottom = true;
|
||||||
|
|
||||||
|
function distanceFromBottom() {
|
||||||
|
return timeline.scrollHeight - timeline.scrollTop - timeline.clientHeight;
|
||||||
|
}
|
||||||
|
function scrollToBottom() {
|
||||||
|
timeline.scrollTop = timeline.scrollHeight;
|
||||||
|
}
|
||||||
|
// Initial state: stick to the bottom on page load so the latest
|
||||||
|
// turn is visible without manual scrolling.
|
||||||
|
requestAnimationFrame(scrollToBottom);
|
||||||
|
|
||||||
|
timeline.addEventListener('scroll', () => {
|
||||||
|
isPinnedToBottom = distanceFromBottom() <= STICK_TOLERANCE_PX;
|
||||||
|
}, { passive: true });
|
||||||
|
|
||||||
|
const timelineObserver = new MutationObserver(() => {
|
||||||
|
if (isPinnedToBottom) scrollToBottom();
|
||||||
|
});
|
||||||
|
timelineObserver.observe(timeline, {
|
||||||
|
childList: true,
|
||||||
|
subtree: true,
|
||||||
|
characterData: true, // streaming token-by-token edits
|
||||||
|
});
|
||||||
|
|
||||||
function ensureTypingEl() {
|
function ensureTypingEl() {
|
||||||
if (typingEl) return typingEl;
|
if (typingEl) return typingEl;
|
||||||
typingEl = document.createElement('div');
|
typingEl = document.createElement('div');
|
||||||
@@ -162,13 +432,62 @@ document.querySelector('.drawer-toggle')?.addEventListener('click', (e) => {
|
|||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
form.addEventListener('submit', () => {
|
// Enter-to-send (Shift+Enter for newline). Submits via the form's
|
||||||
isStreaming = true;
|
// own submit event so all the optimistic-render + fetch logic
|
||||||
|
// below applies uniformly to keyboard and click submissions.
|
||||||
|
if (textarea) {
|
||||||
|
textarea.addEventListener('keydown', (e) => {
|
||||||
|
if (e.key === 'Enter' && !e.shiftKey && !e.isComposing) {
|
||||||
|
e.preventDefault();
|
||||||
|
if (typeof form.requestSubmit === 'function') {
|
||||||
|
form.requestSubmit();
|
||||||
|
} else {
|
||||||
|
form.dispatchEvent(new Event('submit', { cancelable: true }));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// Render the user's prose optimistically as a turn-you DOM node.
|
||||||
|
// Without this the user can't see what they just sent until the page
|
||||||
|
// reloads — the server persists ``user_turn`` events but doesn't
|
||||||
|
// publish a turn_html for them (the SSE channel is bot-output-only).
|
||||||
|
function appendUserTurn(prose) {
|
||||||
|
const div = document.createElement('div');
|
||||||
|
div.className = 'turn turn-you';
|
||||||
|
const strong = document.createElement('strong');
|
||||||
|
strong.textContent = 'you';
|
||||||
|
const p = document.createElement('p');
|
||||||
|
p.textContent = prose;
|
||||||
|
div.appendChild(strong);
|
||||||
|
div.appendChild(p);
|
||||||
|
// Sending a message means the user wants to see it land — force
|
||||||
|
// sticky-bottom even if they were scrolled up reading older
|
||||||
|
// turns. The MutationObserver handles the actual scroll.
|
||||||
|
isPinnedToBottom = true;
|
||||||
|
timeline.appendChild(div);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Intercept the form submit and POST via fetch so we can:
|
||||||
|
// 1. Render the user's prose immediately (optimistic).
|
||||||
|
// 2. Clear the textarea immediately.
|
||||||
|
// 3. Keep the page state intact while the bot streams its
|
||||||
|
// response over SSE — vanilla form POST + 204 leaves the
|
||||||
|
// browser in a half-loaded state with the textarea unflushed.
|
||||||
|
form.addEventListener('submit', async (e) => {
|
||||||
|
e.preventDefault();
|
||||||
|
if (isStreaming) return;
|
||||||
|
const prose = textarea ? (textarea.value || '').trim() : '';
|
||||||
|
if (!prose) return;
|
||||||
|
|
||||||
|
appendUserTurn(prose);
|
||||||
|
if (textarea) {
|
||||||
|
textarea.value = '';
|
||||||
|
textarea.readOnly = true;
|
||||||
|
}
|
||||||
if (sendBtn) sendBtn.disabled = true;
|
if (sendBtn) sendBtn.disabled = true;
|
||||||
// readOnly (not disabled) — disabled fields are excluded from the
|
isStreaming = true;
|
||||||
// form submission, which would send prose="" and trigger the
|
|
||||||
// server's empty-prose 400.
|
|
||||||
if (textarea) textarea.readOnly = true;
|
|
||||||
if (!shell.querySelector('.stop-streaming')) {
|
if (!shell.querySelector('.stop-streaming')) {
|
||||||
const stopBtn = document.createElement('button');
|
const stopBtn = document.createElement('button');
|
||||||
stopBtn.type = 'button';
|
stopBtn.type = 'button';
|
||||||
@@ -186,6 +505,25 @@ document.querySelector('.drawer-toggle')?.addEventListener('click', (e) => {
|
|||||||
});
|
});
|
||||||
form.parentElement.insertBefore(stopBtn, form);
|
form.parentElement.insertBefore(stopBtn, form);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Fire the actual POST. The bot's response arrives via SSE
|
||||||
|
// (``turn_html`` event swaps into the timeline; ``unlock()`` runs
|
||||||
|
// on receipt to clear streaming state and re-enable the form).
|
||||||
|
try {
|
||||||
|
const body = new URLSearchParams({ prose }).toString();
|
||||||
|
const resp = await fetch(form.action, {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
|
||||||
|
body,
|
||||||
|
});
|
||||||
|
if (!resp.ok && resp.status !== 204) {
|
||||||
|
showBanner('send failed (HTTP ' + resp.status + ') — try again');
|
||||||
|
unlock();
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
showBanner('send failed — check your connection');
|
||||||
|
unlock();
|
||||||
|
}
|
||||||
});
|
});
|
||||||
})();
|
})();
|
||||||
</script>
|
</script>
|
||||||
|
|||||||
@@ -21,14 +21,29 @@
|
|||||||
<ul class="search-results">
|
<ul class="search-results">
|
||||||
{% for r in results %}
|
{% for r in results %}
|
||||||
<li class="search-result">
|
<li class="search-result">
|
||||||
<a class="search-result-link" href="/chats/{{ r.chat_id }}">
|
{# T111.2: deep-link to the originating turn via the
|
||||||
|
``id="turn-{event_id}"`` anchor stamped by Phase 3.5 T86.
|
||||||
|
``event_id`` may be NULL for memory rows projected before the
|
||||||
|
0014 migration ran (T109 did not backfill historical rows); in
|
||||||
|
that case fall back to a chat-level link with no anchor so we
|
||||||
|
never emit ``#turn-None``. #}
|
||||||
|
<a class="search-result-link"
|
||||||
|
href="/chats/{{ r.chat_id }}{% if r.event_id %}#turn-{{ r.event_id }}{% endif %}">
|
||||||
<div class="search-result-meta muted">
|
<div class="search-result-meta muted">
|
||||||
<strong>{{ r.owner_name }}</strong>
|
<strong>{{ r.owner_name }}</strong>
|
||||||
<span>· {{ r.chat_id }}</span>
|
<span>· {{ r.chat_id }}</span>
|
||||||
{% if r.chat_name %}<span>· {{ r.chat_name }}</span>{% endif %}
|
{% if r.chat_name %}<span>· {{ r.chat_name }}</span>{% endif %}
|
||||||
{% if r.scene_label %}<span>· scene {{ r.scene_label }}</span>{% endif %}
|
{% if r.scene_label %}<span>· scene {{ r.scene_label }}</span>{% endif %}
|
||||||
</div>
|
</div>
|
||||||
<div class="search-result-summary">{{ r.pov_summary }}</div>
|
{# T111.1: ``r.snippet`` is the FTS5 ``snippet()`` excerpt with
each match wrapped in ``<mark>...</mark>``. ``|safe`` is
required so the marker tags survive Jinja's auto-escape.
Caveat: ``snippet()`` inserts the markers but does not
HTML-escape the indexed text, so any markup in the source
content reaches the page verbatim under ``|safe``. That is only
sound if the text is escaped before indexing, or if the snippet
is escaped server-side and the markers restored afterwards.
This is the only ``|safe`` filter on the page; chat_id,
owner_name, etc. remain auto-escaped. #}
|
||||||
|
<div class="search-result-summary">{{ r.snippet|safe }}</div>
|
||||||
</a>
|
</a>
|
||||||
</li>
|
</li>
|
||||||
{% endfor %}
|
{% endfor %}
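One way to keep the ``<mark>`` highlighting while still escaping the indexed text is to have ``snippet()`` emit sentinel markers and convert them after escaping. The sketch below is a minimal illustration, not the project's implementation: the sentinel characters and the ``render_snippet`` helper are assumptions introduced here.

```python
from html import escape

# Hypothetical sentinel markers: these would be passed to FTS5 snippet()
# in place of "<mark>" / "</mark>" so they cannot collide with real HTML.
OPEN_SENTINEL = "\x02"
CLOSE_SENTINEL = "\x03"


def render_snippet(raw: str) -> str:
    """Escape the raw snippet, then swap the sentinels for <mark> tags."""
    # html.escape neutralises <, >, & and quotes from the source text;
    # the control-character sentinels pass through untouched.
    safe = escape(raw)
    return safe.replace(OPEN_SENTINEL, "<mark>").replace(CLOSE_SENTINEL, "</mark>")
```

With this shape the template can keep ``|safe``, because the only markup left in the string is the pair of ``<mark>`` tags added after escaping.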
|
||||||
|
|||||||
+27
-5
@@ -5,9 +5,9 @@ from fastapi.responses import RedirectResponse, HTMLResponse
|
|||||||
from fastapi.templating import Jinja2Templates
|
from fastapi.templating import Jinja2Templates
|
||||||
|
|
||||||
from chat.db.connection import open_db
|
from chat.db.connection import open_db
|
||||||
from chat.eventlog.log import append_event
|
from chat.eventlog.log import append_and_apply
|
||||||
from chat.eventlog.projector import project
|
from chat.state.entities import get_bot, list_bots
|
||||||
from chat.state.entities import list_bots
|
from chat.state.world import get_chat
|
||||||
|
|
||||||
TEMPLATES = Jinja2Templates(directory=str(Path(__file__).resolve().parent.parent / "templates"))
|
TEMPLATES = Jinja2Templates(directory=str(Path(__file__).resolve().parent.parent / "templates"))
|
||||||
|
|
||||||
@@ -108,11 +108,33 @@ async def bot_create(
|
|||||||
"initial_relationship_to_you": initial_relationship_to_you.strip(),
|
"initial_relationship_to_you": initial_relationship_to_you.strip(),
|
||||||
"kickoff_prose": kickoff_prose.strip(),
|
"kickoff_prose": kickoff_prose.strip(),
|
||||||
}
|
}
|
||||||
append_event(conn, kind="bot_authored", payload=payload)
|
# Per-event apply (NOT project()) — see docs/audits/2026-04-27-project-callers.md.
|
||||||
project(conn)
|
# ``project()`` replays the full log, which trips raw-INSERT handlers like
|
||||||
|
# ``_apply_chat_created`` once a second bot's events are present.
|
||||||
|
append_and_apply(conn, kind="bot_authored", payload=payload)
|
||||||
return RedirectResponse(url=f"/bots/{payload['id']}/kickoff", status_code=303)
|
return RedirectResponse(url=f"/bots/{payload['id']}/kickoff", status_code=303)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/bots/{bot_id}")
async def bot_detail(bot_id: str, conn=Depends(get_conn)):
    """Click-through from the bots list. Routes to the bot's existing
    chat when there is one (the v1 model is one-chat-per-host-bot,
    keyed by ``chat_<bot_id>``), otherwise to the kickoff page so the
    user can author the chat's opening state. 404 if the bot itself
    doesn't exist.

    Defined AFTER the ``/bots/new`` and ``/bots/{bot_id}/kickoff``
    routes: FastAPI matches in declaration order, and a path
    parameter would otherwise swallow ``/bots/new``.
    """
    if get_bot(conn, bot_id) is None:
        raise HTTPException(status_code=404, detail="bot not found")
    chat_id = f"chat_{bot_id}"
    if get_chat(conn, chat_id) is not None:
        return RedirectResponse(url=f"/chats/{chat_id}", status_code=303)
    return RedirectResponse(url=f"/bots/{bot_id}/kickoff", status_code=303)
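The redirect decision in ``bot_detail`` can be checked without a web stack by pulling it into a pure function. This is only a sketch of the same three-way branch: ``bot_detail_target`` and its set-based arguments are hypothetical stand-ins for the ``get_bot`` / ``get_chat`` lookups.

```python
from typing import Optional


def bot_detail_target(
    bot_id: str, known_bots: set[str], known_chats: set[str]
) -> Optional[str]:
    """Return the redirect URL, or None where the route would raise 404."""
    if bot_id not in known_bots:
        return None  # unknown bot: the route answers 404
    chat_id = f"chat_{bot_id}"  # v1 keying: one chat per host bot
    if chat_id in known_chats:
        return f"/chats/{chat_id}"  # chat already exists
    return f"/bots/{bot_id}/kickoff"  # no chat yet: author the opening state
```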
|
||||||
|
|
||||||
|
|
||||||
@router.post("/bots/{bot_id}/reset")
|
@router.post("/bots/{bot_id}/reset")
|
||||||
async def reset_bot_route(
|
async def reset_bot_route(
|
||||||
bot_id: str,
|
bot_id: str,
|
||||||
|
|||||||
+81
-21
@@ -411,6 +411,64 @@ async def edit_memory_significance(
|
|||||||
return await drawer(chat_id, request, conn)
|
return await drawer(chat_id, request, conn)
|
||||||
|
|
||||||
|
|
||||||
|
@router.post(
    "/chats/{chat_id}/drawer/memory/significance/bulk",
    response_class=HTMLResponse,
)
async def bulk_re_rate_significance(
    chat_id: str,
    request: Request,
    level_from: int = Form(...),
    level_to: int = Form(...),
    conn=Depends(get_conn),
):
    """T110.4: bulk re-rate every memory in this chat at ``level_from``
    to ``level_to``.

    Fans out into one ``manual_edit`` event per matching memory rather
    than a single bulk event so the §6.4 audit trail stays per-row:
    each affected memory carries its own ``prior_value -> new_value``
    snapshot, so an inverse edit can restore an individual row without
    needing to inspect a bulk payload's member list. The drawer's
    significance-distribution panel surfaces the new buckets on the
    refreshed partial.

    Both levels are clamped to 0..3 (matching ``edit_memory_significance``)
    and a no-op (``level_from == level_to``) is rejected with 400 so a
    misclick can't pad the event log with empty edits.
    """
    chat = get_chat(conn, chat_id)
    if chat is None:
        raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")

    lf = max(0, min(3, int(level_from)))
    lt = max(0, min(3, int(level_to)))
    if lf == lt:
        raise HTTPException(
            status_code=400,
            detail=f"level_from and level_to must differ (both = {lf})",
        )

    rows = conn.execute(
        "SELECT id FROM memories WHERE chat_id = ? AND significance = ? "
        "ORDER BY id ASC",
        (chat_id, lf),
    ).fetchall()
    for row in rows:
        memory_id = int(row[0])
        append_and_apply(
            conn,
            kind="manual_edit",
            payload={
                "target_kind": "memory_significance",
                "target_id": memory_id,
                "prior_value": lf,
                "new_value": lt,
            },
        )
    return await drawer(chat_id, request, conn)
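The clamp-then-fan-out behaviour described in the docstring can be exercised in isolation. The sketch below assumes a plain ``{memory_id: significance}`` mapping in place of the ``memories`` table; ``plan_bulk_re_rate`` is a hypothetical name, and it returns the payloads instead of appending events.

```python
def clamp_level(value: int) -> int:
    """Clamp a significance level into the 0..3 range."""
    return max(0, min(3, int(value)))


def plan_bulk_re_rate(
    memory_levels: dict[int, int], level_from: int, level_to: int
) -> list[dict]:
    """Build one manual_edit payload per memory currently at level_from."""
    lf, lt = clamp_level(level_from), clamp_level(level_to)
    if lf == lt:
        # Mirrors the route's 400: a no-op must not pad the event log.
        raise ValueError(f"level_from and level_to must differ (both = {lf})")
    return [
        {
            "target_kind": "memory_significance",
            "target_id": memory_id,
            "prior_value": lf,
            "new_value": lt,
        }
        for memory_id in sorted(memory_levels)  # same order as ORDER BY id ASC
        if memory_levels[memory_id] == lf
    ]
```

Because each payload carries its own ``prior_value``, undoing a single row only needs that row's event, which is the per-row audit property the docstring argues for.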
|
||||||
|
|
||||||
|
|
||||||
@router.post(
|
@router.post(
|
||||||
"/chats/{chat_id}/drawer/memory/{memory_id}/pin",
|
"/chats/{chat_id}/drawer/memory/{memory_id}/pin",
|
||||||
response_class=HTMLResponse,
|
response_class=HTMLResponse,
|
||||||
@@ -1234,28 +1292,18 @@ async def delete_preview(
|
|||||||
|
|
||||||
report = compute_delete_impact(conn, target_event_id=int(event_id))
|
report = compute_delete_impact(conn, target_event_id=int(event_id))
|
||||||
|
|
||||||
# Build the modal HTML directly — the impact report is small and
|
# T110.3: render via the ``_delete_impact_modal.html`` Jinja partial
|
||||||
# reusing the drawer template would require a fragment include just
|
# so HTML autoescape covers user-controllable fields (item.kind,
|
||||||
# for this surface. Mirrors the rewind-preview style in
|
# item.description, notes) automatically. The prior implementation
|
||||||
# :func:`chat.web.turns.rewind_preview`.
|
# built the modal HTML via raw f-string concatenation and required
|
||||||
items_html = "".join(
|
# explicit ``html.escape()`` calls (T110.2) on each interpolated
|
||||||
f"<li><strong>{item.kind}</strong>: {item.description}</li>"
|
# field; under autoescape those calls become redundant. Mirrors the
|
||||||
for item in report.cascading
|
# rewind-preview style in :func:`chat.web.turns.rewind_preview`.
|
||||||
|
return TEMPLATES.TemplateResponse(
|
||||||
|
request,
|
||||||
|
"_delete_impact_modal.html",
|
||||||
|
{"chat_id": chat_id, "impact": report},
|
||||||
)
|
)
|
||||||
notes_html = "".join(f"<li>{note}</li>" for note in report.notes)
|
|
||||||
body = (
|
|
||||||
"<div class='delete-impact-modal'>"
|
|
||||||
f"<h3>Delete event {report.target_event_id}?</h3>"
|
|
||||||
f"<p>This will discard {len(report.cascading)} events. Cascade:</p>"
|
|
||||||
f"<ul class='delete-impact-cascade'>{items_html or '<li>none</li>'}</ul>"
|
|
||||||
f"<ul class='delete-impact-notes'>{notes_html}</ul>"
|
|
||||||
f"<form hx-post='/chats/{chat_id}/drawer/turn/delete/{report.target_event_id}' "
|
|
||||||
"hx-target='#drawer' hx-swap='innerHTML'>"
|
|
||||||
"<button type='submit'>Confirm delete</button>"
|
|
||||||
"</form>"
|
|
||||||
"</div>"
|
|
||||||
)
|
|
||||||
return HTMLResponse(body)
|
|
||||||
|
|
||||||
|
|
||||||
@router.post(
|
@router.post(
|
||||||
@@ -1278,7 +1326,19 @@ async def delete_turn(
|
|||||||
|
|
||||||
A snapshot is taken before truncation (inside ``execute_rewind``)
|
A snapshot is taken before truncation (inside ``execute_rewind``)
|
||||||
so the user can recover via the snapshot index.
|
so the user can recover via the snapshot index.
|
||||||
|
|
||||||
|
T110.1 guards ``event_id <= 0``: a stale tab or hand-crafted request
|
||||||
|
posting ``event_id=0`` would otherwise compute ``after_event_id=-1``
|
||||||
|
and silently truncate the entire log. ``id`` is auto-assigned by
|
||||||
|
SQLite starting at 1 so any caller's "real" id is always >= 1; a
|
||||||
|
zero or negative value can only mean a client bug, surfaced as 400.
|
||||||
"""
|
"""
|
||||||
|
if int(event_id) <= 0:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail=f"event_id must be a positive integer, got {event_id}",
|
||||||
|
)
|
||||||
|
|
||||||
chat = get_chat(conn, chat_id)
|
chat = get_chat(conn, chat_id)
|
||||||
if chat is None:
|
if chat is None:
|
||||||
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
|
raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")
|
||||||
|
|||||||
+103
-18
@@ -17,8 +17,7 @@ from fastapi import APIRouter, Depends, Form, HTTPException, Request
|
|||||||
from fastapi.responses import HTMLResponse, RedirectResponse
|
from fastapi.responses import HTMLResponse, RedirectResponse
|
||||||
from fastapi.templating import Jinja2Templates
|
from fastapi.templating import Jinja2Templates
|
||||||
|
|
||||||
from chat.eventlog.log import append_event
|
from chat.eventlog.log import append_and_apply
|
||||||
from chat.eventlog.projector import project
|
|
||||||
from chat.llm.client import LLMClient
|
from chat.llm.client import LLMClient
|
||||||
from chat.services.kickoff import parse_kickoff
|
from chat.services.kickoff import parse_kickoff
|
||||||
from chat.state.entities import get_bot, get_you
|
from chat.state.entities import get_bot, get_you
|
||||||
@@ -32,14 +31,97 @@ router = APIRouter()
|
|||||||
|
|
||||||
|
|
||||||
def get_llm_client(request: Request) -> LLMClient:
|
def get_llm_client(request: Request) -> LLMClient:
|
||||||
"""Production LLM client. Tests override this via ``app.dependency_overrides``."""
|
"""Production LLM client. Tests override this via ``app.dependency_overrides``.
|
||||||
|
|
||||||
|
Returns a :class:`chat.llm.router.RoutedLLMClient` that splits
|
||||||
|
traffic: the narrative model goes to Featherless, the classifier
|
||||||
|
+ embeddings go to the local MLX server (``mlx-omni-server``).
|
||||||
|
Both backends share the OpenAI-compatible surface, so the routing
|
||||||
|
is invisible to call sites — they just pass ``model=...`` and the
|
||||||
|
router picks the backend.
|
||||||
|
"""
|
||||||
settings = request.app.state.settings
|
settings = request.app.state.settings
|
||||||
from chat.llm.featherless import FeatherlessClient
|
from chat.llm.featherless import FeatherlessClient
|
||||||
|
from chat.llm.local_mlx import LocalMLXClient
|
||||||
|
from chat.llm.router import RoutedLLMClient
|
||||||
|
|
||||||
return FeatherlessClient(
|
narrative = FeatherlessClient(
|
||||||
api_key=settings.featherless_api_key,
|
api_key=settings.featherless_api_key,
|
||||||
base_url=settings.featherless_base_url,
|
base_url=settings.featherless_base_url,
|
||||||
)
|
)
|
||||||
|
# Dedicated classifier client when a provider pin is configured —
|
||||||
|
# routes Llama-3.1-8B (or whatever ``classifier_model`` is) onto a
|
||||||
|
# specific upstream like Cerebras for ~10x throughput. When the
|
||||||
|
# pin is empty, ``classifier`` is None and the router falls back
|
||||||
|
# to the narrative client for classifier traffic.
|
||||||
|
classifier = None
|
||||||
|
if settings.classifier_provider_order:
|
||||||
|
classifier = FeatherlessClient(
|
||||||
|
api_key=settings.featherless_api_key,
|
||||||
|
base_url=settings.featherless_base_url,
|
||||||
|
default_extra_body={
|
||||||
|
"provider": {"order": list(settings.classifier_provider_order)}
|
||||||
|
},
|
||||||
|
)
|
||||||
|
local = LocalMLXClient(base_url=settings.local_mlx_base_url)
|
||||||
|
return RoutedLLMClient(
|
||||||
|
narrative=narrative,
|
||||||
|
classifier=classifier,
|
||||||
|
local=local,
|
||||||
|
narrative_model=settings.narrative_model,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _coerce_iso_time(value: str) -> str:
|
||||||
|
"""Permissive parser that returns a canonical ISO 8601 datetime.
|
||||||
|
|
||||||
|
The kickoff classifier (chat/services/kickoff.py) returns
|
||||||
|
``initial_time_iso`` as a free-form string; in practice it emits
|
||||||
|
things like ``"Sun 2024-05-12 07:00:00"``,
|
||||||
|
``"Tuesday, May 14, 2024 7:00 AM"``, or proper ISO. The strict
|
||||||
|
``datetime.fromisoformat`` would 400 on those, so this helper
|
||||||
|
tries a sequence of common classifier-emitted formats and
|
||||||
|
returns a canonical ``YYYY-MM-DDTHH:MM:SS+00:00`` form.
|
||||||
|
|
||||||
|
Raises ``ValueError`` when nothing parses, so the caller can 400
|
||||||
|
cleanly.
|
||||||
|
"""
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
|
||||||
|
s = (value or "").strip()
|
||||||
|
if not s:
|
||||||
|
return s
|
||||||
|
# Strict ISO first (covers "2026-04-26T20:00:00+00:00" and friends).
|
||||||
|
try:
|
||||||
|
dt = datetime.fromisoformat(s)
|
||||||
|
except ValueError:
|
||||||
|
dt = None
|
||||||
|
if dt is None:
|
||||||
|
# Common classifier-emitted formats, in rough frequency order.
|
||||||
|
formats = [
|
||||||
|
"%a %Y-%m-%d %H:%M:%S", # Sun 2024-05-12 07:00:00
|
||||||
|
"%A %Y-%m-%d %H:%M:%S", # Sunday 2024-05-12 07:00:00
|
||||||
|
"%Y-%m-%d %H:%M:%S", # 2024-05-12 07:00:00
|
||||||
|
"%Y-%m-%d %H:%M", # 2024-05-12 07:00
|
||||||
|
"%Y-%m-%d", # 2024-05-12 (date only)
|
||||||
|
"%a %b %d %Y %H:%M:%S", # Sun May 12 2024 07:00:00
|
||||||
|
"%A, %B %d, %Y %I:%M %p", # Tuesday, May 14, 2024 7:00 AM
|
||||||
|
"%B %d, %Y %I:%M %p", # May 14, 2024 7:00 AM
|
||||||
|
"%a %b %d %H:%M:%S %Y", # Sun May 12 07:00:00 2024 (asctime-ish)
|
||||||
|
]
|
||||||
|
for fmt in formats:
|
||||||
|
try:
|
||||||
|
dt = datetime.strptime(s, fmt)
|
||||||
|
break
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
if dt is None:
|
||||||
|
raise ValueError(f"could not parse {value!r} as a datetime")
|
||||||
|
# Naive datetimes assumed UTC (the v1 model is single-user, single
|
||||||
|
# timezone — keeping it consistent with chat_state.time defaults).
|
||||||
|
if dt.tzinfo is None:
|
||||||
|
dt = dt.replace(tzinfo=timezone.utc)
|
||||||
|
return dt.isoformat(timespec="seconds")
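A condensed, self-contained sketch of the fallback parsing above, mainly to show what the coercion returns for the inputs named in the docstring. `coerce_iso_time` and `FALLBACK_FORMATS` are hypothetical standalone names, only three of the formats are included, and the `%a`/`%A`/`%B` directives assume an English (C) locale.

```python
from datetime import datetime, timezone

# Hypothetical subset of the classifier-emitted formats listed above.
FALLBACK_FORMATS = [
    "%a %Y-%m-%d %H:%M:%S",    # Sun 2024-05-12 07:00:00
    "%Y-%m-%d %H:%M",          # 2024-05-12 07:00
    "%A, %B %d, %Y %I:%M %p",  # Tuesday, May 14, 2024 7:00 AM
]


def coerce_iso_time(value: str) -> str:
    """Return an ISO 8601 string, or raise ValueError if nothing parses."""
    s = (value or "").strip()
    if not s:
        return s  # empty input passes through unchanged
    try:
        dt = datetime.fromisoformat(s)  # strict ISO first
    except ValueError:
        dt = None
    if dt is None:
        for fmt in FALLBACK_FORMATS:
            try:
                dt = datetime.strptime(s, fmt)
                break
            except ValueError:
                continue
    if dt is None:
        raise ValueError(f"could not parse {value!r} as a datetime")
    if dt.tzinfo is None:
        # Naive values are assumed to be UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat(timespec="seconds")


print(coerce_iso_time("Sun 2024-05-12 07:00:00"))
print(coerce_iso_time("Tuesday, May 14, 2024 7:00 AM"))
```

Trying strict ISO first keeps well-formed input on the fast path; the format list only runs for the human-readable variants.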
|
||||||
|
|
||||||
|
|
||||||
def _parse_holding(text: str) -> list[str]:
|
def _parse_holding(text: str) -> list[str]:
|
||||||
@@ -157,11 +239,13 @@ async def kickoff_post(
|
|||||||
if bot is None:
|
if bot is None:
|
||||||
raise HTTPException(status_code=404, detail=f"bot not found: {bot_id}")
|
raise HTTPException(status_code=404, detail=f"bot not found: {bot_id}")
|
||||||
|
|
||||||
# Loose ISO 8601 validation. ``datetime.fromisoformat`` accepts the offset
|
# Permissive datetime parsing — the classifier emits a variety of
|
||||||
# form ``2026-04-26T20:00:00+00:00`` we use; reject anything it can't parse.
|
# human-readable formats ("Sun 2024-05-12 07:00:00",
|
||||||
|
# "Tuesday, May 14, 2024 7:00 AM", proper ISO, etc.). We coerce
|
||||||
|
# to canonical ISO and only 400 if NOTHING parses.
|
||||||
if initial_time_iso.strip():
|
if initial_time_iso.strip():
|
||||||
try:
|
try:
|
||||||
datetime.fromisoformat(initial_time_iso.strip())
|
initial_time_iso = _coerce_iso_time(initial_time_iso)
|
||||||
except ValueError:
|
except ValueError:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=400,
|
status_code=400,
|
||||||
@@ -178,8 +262,14 @@ async def kickoff_post(
|
|||||||
).fetchone()
|
).fetchone()
|
||||||
container_id = next_container_row[0]
|
container_id = next_container_row[0]
|
||||||
|
|
||||||
|
# Use ``append_and_apply`` per event (live-path pattern) rather than
|
||||||
|
# appending all-then-project. ``project()`` replays the *entire*
|
||||||
|
# event log; non-idempotent handlers like ``_apply_chat_created``
|
||||||
|
# (raw INSERT into chats) then 500 with UNIQUE constraint failures
|
||||||
|
# for any chats that already exist from prior kickoffs.
|
||||||
|
|
||||||
# 1. chat_created
|
# 1. chat_created
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="chat_created",
|
kind="chat_created",
|
||||||
payload={
|
payload={
|
||||||
@@ -192,7 +282,7 @@ async def kickoff_post(
|
|||||||
)
|
)
|
||||||
|
|
||||||
# 2. container_created
|
# 2. container_created
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="container_created",
|
kind="container_created",
|
||||||
payload={
|
payload={
|
||||||
@@ -208,7 +298,7 @@ async def kickoff_post(
|
|||||||
bot_interruptible = bool(bot_activity_action_interruptible)
|
bot_interruptible = bool(bot_activity_action_interruptible)
|
||||||
|
|
||||||
# 3. activity_change for "you"
|
# 3. activity_change for "you"
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="activity_change",
|
kind="activity_change",
|
||||||
payload={
|
payload={
|
||||||
@@ -229,7 +319,7 @@ async def kickoff_post(
|
|||||||
)
|
)
|
||||||
|
|
||||||
# 4. activity_change for bot
|
# 4. activity_change for bot
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="activity_change",
|
kind="activity_change",
|
||||||
payload={
|
payload={
|
||||||
@@ -250,7 +340,7 @@ async def kickoff_post(
|
|||||||
)
|
)
|
||||||
|
|
||||||
# 5. scene_opened
|
# 5. scene_opened
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="scene_opened",
|
kind="scene_opened",
|
||||||
payload={
|
payload={
|
||||||
@@ -267,7 +357,7 @@ async def kickoff_post(
|
|||||||
facts = _parse_facts(edge_seed_knowledge_facts)
|
facts = _parse_facts(edge_seed_knowledge_facts)
|
||||||
if edge_seed_summary.strip():
|
if edge_seed_summary.strip():
|
||||||
facts.insert(0, f"[summary] {edge_seed_summary.strip()}")
|
facts.insert(0, f"[summary] {edge_seed_summary.strip()}")
|
||||||
append_event(
|
append_and_apply(
|
||||||
conn,
|
conn,
|
||||||
kind="edge_update",
|
kind="edge_update",
|
||||||
payload={
|
payload={
|
||||||
@@ -278,9 +368,4 @@ async def kickoff_post(
|
|||||||
},
|
},
|
||||||
)
|
)
|
||||||
|
|
||||||
# Project all events at once. ``bot_authored`` (already in log from prior
|
|
||||||
# POST) is idempotent (INSERT OR REPLACE); the new events project cleanly
|
|
||||||
# because they're being applied for the first time.
|
|
||||||
project(conn)
|
|
||||||
|
|
||||||
return RedirectResponse(url=f"/chats/{chat_id}", status_code=303)
|
return RedirectResponse(url=f"/chats/{chat_id}", status_code=303)
|
||||||
|
|||||||
+10
-1
@@ -71,18 +71,27 @@ def _read_recent_meanwhile_dialogue(
|
|||||||
that already match — avoids an unbounded scan as ``event_log``
|
that already match — avoids an unbounded scan as ``event_log``
|
||||||
grows. The user-side rows match on chat_id only since they aren't
|
grows. The user-side rows match on chat_id only since they aren't
|
||||||
tagged with a scene id (they ride the chat-wide log).
|
tagged with a scene id (they ride the chat-wide log).
|
||||||
|
|
||||||
|
T113: clamp by the active branch's ``[origin, head]`` event-id range
|
||||||
|
so meanwhile prompt context respects the user's current branch.
|
||||||
|
Bootstrap-main and "no active branch" both fall through to ``(0,
|
||||||
|
BIG_INT)`` — no functional change for the metadata-only Phase 4 era.
|
||||||
"""
|
"""
|
||||||
|
from chat.state.branches import active_branch_event_ids
|
||||||
|
|
||||||
|
origin, head = active_branch_event_ids(conn)
|
||||||
cur = conn.execute(
|
cur = conn.execute(
|
||||||
"SELECT id, kind, payload_json FROM event_log "
|
"SELECT id, kind, payload_json FROM event_log "
|
||||||
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
"WHERE kind IN ('user_turn', 'user_turn_edit', 'assistant_turn') "
|
||||||
" AND superseded_by IS NULL AND hidden = 0 "
|
" AND superseded_by IS NULL AND hidden = 0 "
|
||||||
|
" AND id BETWEEN ? AND ? "
|
||||||
" AND json_extract(payload_json, '$.chat_id') = ? "
|
" AND json_extract(payload_json, '$.chat_id') = ? "
|
||||||
" AND ("
|
" AND ("
|
||||||
" kind IN ('user_turn', 'user_turn_edit') "
|
" kind IN ('user_turn', 'user_turn_edit') "
|
||||||
" OR json_extract(payload_json, '$.meanwhile_scene_id') = ?"
|
" OR json_extract(payload_json, '$.meanwhile_scene_id') = ?"
|
||||||
" ) "
|
" ) "
|
||||||
"ORDER BY id DESC LIMIT ?",
|
"ORDER BY id DESC LIMIT ?",
|
||||||
(chat_id, scene_id, limit),
|
(origin, head, chat_id, scene_id, limit),
|
||||||
)
|
)
|
||||||
rows = cur.fetchall()
|
rows = cur.fetchall()
|
||||||
rows.reverse()
|
rows.reverse()
|
||||||
|
|||||||
+148
-9
@@ -14,6 +14,12 @@ For each match we hydrate just enough metadata to render a row:
|
|||||||
* the originating scene title when one exists,
|
* the originating scene title when one exists,
|
||||||
* and the ``pov_summary`` itself.
|
* and the ``pov_summary`` itself.
|
||||||
|
|
||||||
|
T106 (Phase 4.5): hydration is batched. Pre-T106 the route called
|
||||||
|
``get_bot``/``get_chat``/``get_scene`` once per result row — N+1 with
|
||||||
|
``DEFAULT_SEARCH_K=50`` meaning up to 150 individual SELECTs per page
|
||||||
|
load. We now collect distinct ids first and fan-in via three
|
||||||
|
``WHERE id IN (...)`` queries, then map back per row.
|
||||||
|
|
||||||
We deliberately keep this module synchronous and template-only — no
|
We deliberately keep this module synchronous and template-only — no
|
||||||
HTMX swaps, no JSON API — because the search box is a "leave the
|
HTMX swaps, no JSON API — because the search box is a "leave the
|
||||||
current chat to look something up" surface, not an inline drawer.
|
current chat to look something up" surface, not an inline drawer.
|
||||||
@@ -21,7 +27,9 @@ current chat to look something up" surface, not an inline drawer.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
from sqlite3 import Connection
|
||||||
|
|
||||||
from fastapi import APIRouter, Depends, Request
|
from fastapi import APIRouter, Depends, Request
|
||||||
from fastapi.responses import HTMLResponse
|
from fastapi.responses import HTMLResponse
|
||||||
@@ -36,29 +44,145 @@ TEMPLATES = Jinja2Templates(
|
|||||||
directory=str(Path(__file__).resolve().parent.parent / "templates")
|
directory=str(Path(__file__).resolve().parent.parent / "templates")
|
||||||
)
|
)
|
||||||
|
|
||||||
|
#: Maximum cross-chat FTS matches surfaced per ``/search`` page load.
|
||||||
|
#: Extracted as a module-level constant (T106) so the cap is tunable
|
||||||
|
#: without touching the route body. ``search_all_memories`` itself
|
||||||
|
#: defaults to a smaller ``k=20``; we override here because the
|
||||||
|
#: top-bar search is a "scan everything I've seen" surface, not an
|
||||||
|
#: inline drawer.
|
||||||
|
DEFAULT_SEARCH_K = 50
|
||||||
|
|
||||||
router = APIRouter()
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
def _fetch_bots_by_ids(conn: Connection, ids: set[str]) -> dict[str, dict]:
|
||||||
|
"""Batched sibling of :func:`chat.state.entities.get_bot`.
|
||||||
|
|
||||||
|
Inlined here (not exported from ``state.entities``) to keep T106's
|
||||||
|
scope confined to ``search.py`` per the Phase 4.5 plan. Returns
|
||||||
|
``{bot_id: bot_dict}`` for every id present in ``ids``; ids with
|
||||||
|
no matching row are simply absent from the map (the caller falls
|
||||||
|
back to the raw id string the same way it did pre-T106).
|
||||||
|
|
||||||
|
Empty ``ids`` short-circuits to ``{}`` so no query is issued
(SQLite tolerates ``WHERE id IN ()``, but standard SQL does not).
|
||||||
|
"""
|
||||||
|
if not ids:
|
||||||
|
return {}
|
||||||
|
placeholders = ",".join("?" * len(ids))
|
||||||
|
cols = [c[1] for c in conn.execute("PRAGMA table_info(bots)").fetchall()]
|
||||||
|
rows = conn.execute(
|
||||||
|
f"SELECT * FROM bots WHERE id IN ({placeholders})",
|
||||||
|
tuple(ids),
|
||||||
|
).fetchall()
|
||||||
|
out: dict[str, dict] = {}
|
||||||
|
for row in rows:
|
||||||
|
d = dict(zip(cols, row))
|
||||||
|
d["voice_samples"] = json.loads(d.pop("voice_samples_json"))
|
||||||
|
d["traits"] = json.loads(d.pop("traits_json"))
|
||||||
|
out[d["id"]] = d
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def _fetch_chats_by_ids(conn: Connection, ids: set[str]) -> dict[str, dict]:
|
||||||
|
"""Batched sibling of :func:`chat.state.world.get_chat`.
|
||||||
|
|
||||||
|
Mirrors that helper's ``chats``/``chat_state`` JOIN so the returned
|
||||||
|
dicts have the same shape (``narrative_anchor``, ``time``,
|
||||||
|
``weather``, ``active_scene_id``, etc.). Empty ``ids`` returns
``{}`` without querying (``IN ()`` is valid in SQLite but not in
standard SQL).
|
||||||
|
"""
|
||||||
|
if not ids:
|
||||||
|
return {}
|
||||||
|
placeholders = ",".join("?" * len(ids))
|
||||||
|
rows = conn.execute(
|
||||||
|
"SELECT c.id, c.host_bot_id, c.guest_bot_id, c.created_at, "
|
||||||
|
" s.time, s.weather, s.active_scene_id, s.narrative_anchor "
|
||||||
|
"FROM chats c JOIN chat_state s ON s.chat_id = c.id "
|
||||||
|
f"WHERE c.id IN ({placeholders})",
|
||||||
|
tuple(ids),
|
||||||
|
).fetchall()
|
||||||
|
return {
|
||||||
|
row[0]: {
|
||||||
|
"id": row[0],
|
||||||
|
"host_bot_id": row[1],
|
||||||
|
"guest_bot_id": row[2],
|
||||||
|
"created_at": row[3],
|
||||||
|
"time": row[4],
|
||||||
|
"weather": row[5],
|
||||||
|
"active_scene_id": row[6],
|
||||||
|
"narrative_anchor": row[7],
|
||||||
|
}
|
||||||
|
for row in rows
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _fetch_scenes_by_ids(conn: Connection, ids: set[int]) -> dict[int, dict]:
|
||||||
|
"""Batched sibling of :func:`chat.state.world.get_scene`.
|
||||||
|
|
||||||
|
Returns ``{scene_id: scene_dict}`` with ``participants`` already
|
||||||
|
JSON-decoded so callers see the same shape as the per-row helper.
|
||||||
|
Empty ``ids`` returns ``{}``.
|
||||||
|
"""
|
||||||
|
if not ids:
|
||||||
|
return {}
|
||||||
|
placeholders = ",".join("?" * len(ids))
|
||||||
|
cols = [c[1] for c in conn.execute("PRAGMA table_info(scenes)").fetchall()]
|
||||||
|
rows = conn.execute(
|
||||||
|
f"SELECT * FROM scenes WHERE id IN ({placeholders})",
|
||||||
|
tuple(ids),
|
||||||
|
).fetchall()
|
||||||
|
out: dict[int, dict] = {}
|
||||||
|
for row in rows:
|
||||||
|
d = dict(zip(cols, row))
|
||||||
|
d["participants"] = json.loads(d.pop("participants_json"))
|
||||||
|
out[d["id"]] = d
|
||||||
|
return out
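The batching pattern used by the three helpers above can be sketched in isolation as follows. This is a minimal illustration against an in-memory database: the two-column `bots` table, the `fetch_names_by_ids` helper, and the shape of `raw_results` are assumptions made for the example, not the project's real schema or API.

```python
import sqlite3


def fetch_names_by_ids(conn: sqlite3.Connection, ids: set[str]) -> dict[str, str]:
    """One ``IN (...)`` query for all ids instead of one SELECT per id."""
    if not ids:
        # Nothing to look up, so skip the query entirely.
        return {}
    ordered = sorted(ids)
    placeholders = ",".join("?" * len(ordered))
    rows = conn.execute(
        f"SELECT id, name FROM bots WHERE id IN ({placeholders})",
        ordered,
    ).fetchall()
    return {row[0]: row[1] for row in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bots (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO bots VALUES (?, ?)", [("b1", "Ada"), ("b2", "Bo")])

# Hypothetical search rows: "b1" repeats and "gone" has no matching bot.
raw_results = [
    {"owner_id": "b1"},
    {"owner_id": "b2"},
    {"owner_id": "b1"},
    {"owner_id": "gone"},
]

# Collect distinct ids first, fetch once, then map back per row.
names = fetch_names_by_ids(conn, {r["owner_id"] for r in raw_results})
labels = [names.get(r["owner_id"], r["owner_id"]) for r in raw_results]
print(labels)
```

Ids without a matching row are simply absent from the map, so the per-row lookup falls back to the raw id string, mirroring the behaviour the docstrings describe.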
|
||||||
|
|
||||||
|
|
||||||
@router.get("/search", response_class=HTMLResponse)
|
@router.get("/search", response_class=HTMLResponse)
|
||||||
async def search(request: Request, q: str = "", conn=Depends(get_conn)):
|
async def search(request: Request, q: str = "", conn=Depends(get_conn)):
|
||||||
"""Render ``search.html`` with up to 50 cross-chat FTS matches.
|
"""Render ``search.html`` with up to :data:`DEFAULT_SEARCH_K` matches.
|
||||||
|
|
||||||
``q`` is intentionally allowed to be empty — that path renders the
|
``q`` is intentionally allowed to be empty — that path renders the
|
||||||
page's "enter a query" placeholder rather than a 400, because the
|
page's "enter a query" placeholder rather than a 400, because the
|
||||||
top-bar form submits to this URL even with an empty input. T93's
|
top-bar form submits to this URL even with an empty input. T93's
|
||||||
service short-circuits whitespace-only queries to ``[]`` so there
|
service short-circuits whitespace-only queries to ``[]`` so there
|
||||||
is no FTS5 ``MATCH ''`` syntax error to guard against here.
|
is no FTS5 ``MATCH ''`` syntax error to guard against here.
|
||||||
"""
|
|
||||||
raw_results = search_all_memories(conn, query=q, k=50) if q else []
|
|
||||||
|
|
||||||
# Hydrate display fields per row. We do this in the route (not the
|
Hydration (T106) is batched: rather than calling ``get_bot`` /
|
||||||
# service) so the service stays a pure FTS shim that other UIs
|
``get_chat`` / ``get_scene`` per row (worst case 3 * k individual
|
||||||
# can reuse.
|
SELECTs), we collect distinct ids and issue one ``IN (...)`` query
|
||||||
|
per entity kind, then map back during the row build. ``get_bot``
|
||||||
|
et al. remain imported for test-time monkeypatching but are no
|
||||||
|
longer invoked on the hot path.
|
||||||
|
"""
|
||||||
|
raw_results = (
|
||||||
|
search_all_memories(conn, query=q, k=DEFAULT_SEARCH_K) if q else []
|
||||||
|
)
|
||||||
|
|
||||||
|
# Collect distinct ids up front so the IN-list queries dedupe (a
|
||||||
|
# popular bot or scene shows up many times across the result set).
|
||||||
|
bot_ids: set[str] = {r["owner_id"] for r in raw_results if r["owner_id"]}
|
||||||
|
chat_ids: set[str] = {r["chat_id"] for r in raw_results if r["chat_id"]}
|
||||||
|
scene_ids: set[int] = {
|
||||||
|
r["scene_id"] for r in raw_results if r["scene_id"]
|
||||||
|
}
|
||||||
|
|
||||||
|
bots_by_id = _fetch_bots_by_ids(conn, bot_ids)
|
||||||
|
chats_by_id = _fetch_chats_by_ids(conn, chat_ids)
|
||||||
|
scenes_by_id = _fetch_scenes_by_ids(conn, scene_ids)
|
||||||
|
|
||||||
|
# Hydrate display fields per row from the batched maps. We do this
|
||||||
|
# in the route (not the service) so the service stays a pure FTS
|
||||||
|
# shim that other UIs can reuse.
|
||||||
results = []
|
results = []
|
||||||
for row in raw_results:
|
for row in raw_results:
|
||||||
bot = get_bot(conn, row["owner_id"])
|
bot = bots_by_id.get(row["owner_id"])
|
||||||
chat = get_chat(conn, row["chat_id"])
|
chat = chats_by_id.get(row["chat_id"])
|
||||||
scene = get_scene(conn, row["scene_id"]) if row["scene_id"] else None
|
scene = (
|
||||||
|
scenes_by_id.get(row["scene_id"]) if row["scene_id"] else None
|
||||||
|
)
|
||||||
results.append(
|
results.append(
|
||||||
{
|
{
|
||||||
"memory_id": row["memory_id"],
|
"memory_id": row["memory_id"],
|
||||||
@@ -69,6 +193,13 @@ async def search(request: Request, q: str = "", conn=Depends(get_conn)):
|
|||||||
chat.get("narrative_anchor") if chat else None
|
chat.get("narrative_anchor") if chat else None
|
||||||
),
|
),
|
||||||
"scene_id": row["scene_id"],
|
"scene_id": row["scene_id"],
|
||||||
|
# T111.2: event_id deep-links to the originating turn
|
||||||
|
# via the ``id="turn-{event_id}"`` anchor that Phase 3.5
|
||||||
|
# T86 stamps on each turn DOM node. May be ``None`` for
|
||||||
|
# memory rows projected before the 0014 migration ran
|
||||||
|
# (T109 did not backfill historical rows); the template
|
||||||
|
# falls back to a chat-level link in that case.
|
||||||
|
"event_id": row["event_id"],
|
||||||
# Scenes have no ``title`` column today; surface the
|
# Scenes have no ``title`` column today; surface the
|
||||||
# ``started_at`` timestamp as a human-friendly label
|
# ``started_at`` timestamp as a human-friendly label
|
||||||
# when a scene is set, otherwise leave it blank.
|
# when a scene is set, otherwise leave it blank.
|
||||||
@@ -76,6 +207,14 @@ async def search(request: Request, q: str = "", conn=Depends(get_conn)):
|
|||||||
scene.get("started_at") if scene else None
|
scene.get("started_at") if scene else None
|
||||||
),
|
),
|
||||||
"pov_summary": row["pov_summary"],
|
"pov_summary": row["pov_summary"],
|
||||||
|
# T111.1: ``snippet`` is the FTS5 windowed excerpt with
|
||||||
|
# ``<mark>`` tags around each match. Falls back to the
|
||||||
|
# full ``pov_summary`` if the row lacks a snippet (which
|
||||||
|
# shouldn't happen on this code path because every
|
||||||
|
# ``raw_results`` row came from a MATCH query, but we
|
||||||
|
# guard defensively so the template never renders
|
||||||
|
# ``None``).
|
||||||
|
"snippet": row.get("snippet") or row["pov_summary"],
|
||||||
"significance": row["significance"],
|
"significance": row["significance"],
|
||||||
"ts": row["ts"],
|
"ts": row["ts"],
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -4,8 +4,7 @@ from fastapi import APIRouter, Depends, Form, HTTPException, Request
|
|||||||
from fastapi.responses import HTMLResponse
|
from fastapi.responses import HTMLResponse
|
||||||
from fastapi.templating import Jinja2Templates
|
from fastapi.templating import Jinja2Templates
|
||||||
|
|
||||||
from chat.eventlog.log import append_event
|
from chat.eventlog.log import append_and_apply
|
||||||
from chat.eventlog.projector import project
|
|
||||||
from chat.state.entities import get_you
|
from chat.state.entities import get_you
|
||||||
from chat.web.bots import get_conn
|
from chat.web.bots import get_conn
|
||||||
|
|
||||||
@@ -40,8 +39,10 @@ async def settings_post(
|
|||||||
"pronouns": pronouns.strip(),
|
"pronouns": pronouns.strip(),
|
||||||
"persona": persona.strip(),
|
"persona": persona.strip(),
|
||||||
}
|
}
|
||||||
append_event(conn, kind="you_authored", payload=payload)
|
# Per-event apply (NOT project()) — see docs/audits/2026-04-27-project-callers.md.
|
||||||
project(conn)
|
# ``project()`` replays the full log, which trips raw-INSERT handlers like
|
||||||
|
# ``_apply_chat_created`` once chat events are present.
|
||||||
|
append_and_apply(conn, kind="you_authored", payload=payload)
|
||||||
|
|
||||||
return TEMPLATES.TemplateResponse(
|
return TEMPLATES.TemplateResponse(
|
||||||
request,
|
request,
|
||||||
|
|||||||
+35
-9
@@ -8,20 +8,27 @@ Routes:
|
|||||||
|
|
||||||
* ``GET /snapshots`` list all snapshots (both kinds)
|
* ``GET /snapshots`` list all snapshots (both kinds)
|
||||||
* ``POST /snapshots/take`` take a periodic snapshot now
|
* ``POST /snapshots/take`` take a periodic snapshot now
|
||||||
* ``POST /snapshots/restore/{id}`` restore (requires matching ``confirm_id``)
|
* ``POST /snapshots/restore/{id}`` restore (requires matching ``confirm_id`` and ``kind``)
|
||||||
* ``GET /snapshots/{id}/preview`` show metadata + delta vs current
|
* ``GET /snapshots/{id}/preview`` show metadata + delta vs current
|
||||||
|
|
||||||
The ``snapshot_id`` is the filename stem (the UTC timestamp written by
|
The ``snapshot_id`` is the filename stem (the UTC timestamp written by
|
||||||
:func:`chat.services.snapshot.take_snapshot`) — there's no separate UUID,
|
:func:`chat.services.snapshot.take_snapshot`) — there's no separate UUID,
|
||||||
and the timestamp filename is already unique per snapshot kind. Both
|
and the timestamp filename is already unique per snapshot kind. Both
|
||||||
periodic and rewind snapshots share the same id space lookup-wise, so
|
periodic and rewind snapshots share the same id space lookup-wise, so
|
||||||
the restore + preview routes accept ``kind`` as a form/query param to
|
the restore + preview routes require ``kind`` as a form/query param to
|
||||||
disambiguate.
|
disambiguate (a missing/empty ``kind`` is a 400, not a silent default).
|
||||||
|
|
||||||
|
Note on ``created_at`` mtime drift: the listing's ``created_at`` comes
|
||||||
|
from the file's mtime, not the encoded filename timestamp. ``cp -p``
|
||||||
|
preserves mtime, but plain ``cp`` resets it to "now" — so a copied
|
||||||
|
snapshot can show a misleading ``created_at`` while its filename still
|
||||||
|
reflects the original UTC capture time.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import json
|
import json
|
||||||
|
from datetime import datetime, timezone
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from fastapi import APIRouter, Depends, Form, HTTPException, Request
|
from fastapi import APIRouter, Depends, Form, HTTPException, Request
|
||||||
@@ -52,8 +59,6 @@ def _list_all_snapshots(data_dir: Path) -> list[dict]:
|
|||||||
``last_event_id`` (parsed from the JSON body — small enough that
|
``last_event_id`` (parsed from the JSON body — small enough that
|
||||||
listing isn't a performance concern for the handful of files we keep).
|
listing isn't a performance concern for the handful of files we keep).
|
||||||
"""
|
"""
|
||||||
from datetime import datetime, timezone
|
|
||||||
|
|
||||||
rows: list[dict] = []
|
rows: list[dict] = []
|
||||||
for kind in SNAPSHOT_KINDS:
|
for kind in SNAPSHOT_KINDS:
|
||||||
snap_dir = data_dir / "snapshots" / kind
|
snap_dir = data_dir / "snapshots" / kind
|
||||||
@@ -85,12 +90,26 @@ def _list_all_snapshots(data_dir: Path) -> list[dict]:
|
|||||||
return rows
|
return rows
|
||||||
|
|
||||||
|
|
||||||
|
def _require_kind(kind: str) -> str:
|
||||||
|
"""Reject missing/empty/unknown ``kind`` with 400.
|
||||||
|
|
||||||
|
Defaulting silently to ``"periodic"`` made rewind-snapshot lookups
|
||||||
|
appear as 404s, which is confusing — make the client always state
|
||||||
|
the kind explicitly.
|
||||||
|
"""
|
||||||
|
if not kind or kind not in SNAPSHOT_KINDS:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail=f"kind must be one of {SNAPSHOT_KINDS}",
|
||||||
|
)
|
||||||
|
return kind
|
||||||
|
|
||||||
|
|
||||||
def _resolve_snapshot_path(
|
def _resolve_snapshot_path(
|
||||||
data_dir: Path, snapshot_id: str, kind: str
|
data_dir: Path, snapshot_id: str, kind: str
|
||||||
) -> Path:
|
) -> Path:
|
||||||
"""Map an ``(id, kind)`` pair to the on-disk file, or 404."""
|
"""Map an ``(id, kind)`` pair to the on-disk file, or 404."""
|
||||||
if kind not in SNAPSHOT_KINDS:
|
_require_kind(kind)
|
||||||
raise HTTPException(status_code=400, detail=f"unknown kind: {kind}")
|
|
||||||
path = data_dir / "snapshots" / kind / f"{snapshot_id}.json"
|
path = data_dir / "snapshots" / kind / f"{snapshot_id}.json"
|
||||||
if not path.exists():
|
if not path.exists():
|
||||||
raise HTTPException(status_code=404, detail="snapshot not found")
|
raise HTTPException(status_code=404, detail="snapshot not found")
|
||||||
@@ -127,7 +146,7 @@ async def snapshots_restore(
|
|||||||
snapshot_id: str,
|
snapshot_id: str,
|
||||||
request: Request,
|
request: Request,
|
||||||
confirm_id: str = Form(""),
|
confirm_id: str = Form(""),
|
||||||
kind: str = Form("periodic"),
|
kind: str = Form(""),
|
||||||
conn=Depends(get_conn),
|
conn=Depends(get_conn),
|
||||||
):
|
):
|
||||||
"""Hard-confirm restore: ``confirm_id`` must equal the path id.
|
"""Hard-confirm restore: ``confirm_id`` must equal the path id.
|
||||||
@@ -135,7 +154,11 @@ async def snapshots_restore(
|
|||||||
Mismatched confirm → 400 (without touching the DB). On match, the
|
Mismatched confirm → 400 (without touching the DB). On match, the
|
||||||
existing :func:`restore_from_snapshot` clears projected tables and
|
existing :func:`restore_from_snapshot` clears projected tables and
|
||||||
re-loads them from the dump.
|
re-loads them from the dump.
|
||||||
|
|
||||||
|
``kind`` is required (must be ``"periodic"`` or ``"rewind"``) — a
|
||||||
|
missing or empty value 400s rather than silently defaulting.
|
||||||
"""
|
"""
|
||||||
|
_require_kind(kind)
|
||||||
if confirm_id != snapshot_id:
|
if confirm_id != snapshot_id:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=400,
|
status_code=400,
|
||||||
@@ -151,7 +174,7 @@ async def snapshots_restore(
|
|||||||
async def snapshots_preview(
|
async def snapshots_preview(
|
||||||
snapshot_id: str,
|
snapshot_id: str,
|
||||||
request: Request,
|
request: Request,
|
||||||
kind: str = "periodic",
|
kind: str = "",
|
||||||
conn=Depends(get_conn),
|
conn=Depends(get_conn),
|
||||||
):
|
):
|
||||||
"""Show snapshot metadata + a basic delta against the current event log.
|
"""Show snapshot metadata + a basic delta against the current event log.
|
||||||
@@ -159,7 +182,10 @@ async def snapshots_preview(
|
|||||||
Phase 4 keeps this simple: the snapshot's ``last_event_id`` plus the
|
Phase 4 keeps this simple: the snapshot's ``last_event_id`` plus the
|
||||||
current ``MAX(event_log.id)`` is enough to tell the user how far the
|
current ``MAX(event_log.id)`` is enough to tell the user how far the
|
||||||
log has moved on. A richer per-table diff is a Phase 4.5+ concern.
|
log has moved on. A richer per-table diff is a Phase 4.5+ concern.
|
||||||
|
|
||||||
|
``kind`` is required — see :func:`snapshots_restore`.
|
||||||
"""
|
"""
|
||||||
|
_require_kind(kind)
|
||||||
settings = request.app.state.settings
|
settings = request.app.state.settings
|
||||||
path = _resolve_snapshot_path(settings.data_dir, snapshot_id, kind)
|
path = _resolve_snapshot_path(settings.data_dir, snapshot_id, kind)
|
||||||
dump = json.loads(path.read_text())
|
dump = json.loads(path.read_text())
|
||||||
|
|||||||
@@ -67,6 +67,7 @@ from chat.services.multi_state_update import compute_state_updates_for_present
|
|||||||
from chat.services.prompt import (
|
from chat.services.prompt import (
|
||||||
assemble_narrative_prompt,
|
assemble_narrative_prompt,
|
||||||
consume_pending_meanwhile_digests,
|
consume_pending_meanwhile_digests,
|
||||||
|
trim_to_max_beats,
|
||||||
)
|
)
|
||||||
from chat.services.rewind import compute_rewind_preview, execute_rewind
|
from chat.services.rewind import compute_rewind_preview, execute_rewind
|
||||||
from chat.services.scene_close import detect_scene_close
|
from chat.services.scene_close import detect_scene_close
|
||||||
@@ -482,6 +483,11 @@ async def post_turn(
|
|||||||
_in_flight_tasks.pop(chat_id, None)
|
_in_flight_tasks.pop(chat_id, None)
|
||||||
|
|
||||||
primary_text = "".join(primary_accumulated)
|
primary_text = "".join(primary_accumulated)
|
||||||
|
# Belt-and-suspenders: trim to 3 beats max even if the model
|
||||||
|
# ignored the "HARD CAP: 2-3 beats" prompt instruction. Roleplay-tuned
# narrators are reliably verbose, and a hard max_tokens cap truncates
# mid-word, whereas this trims at a beat boundary.
|
||||||
|
primary_text = trim_to_max_beats(primary_text, max_beats=3)
|
||||||
|
|
||||||
# 7. Append the assistant_turn with the final text. (See note above on
|
# 7. Append the assistant_turn with the final text. (See note above on
|
||||||
# why we skip ``project`` for these transcript-only event kinds.)
|
# why we skip ``project`` for these transcript-only event kinds.)
|
||||||
@@ -677,6 +683,10 @@ async def post_turn(
|
|||||||
_in_flight_tasks.pop(chat_id, None)
|
_in_flight_tasks.pop(chat_id, None)
|
||||||
|
|
||||||
interjection_text = "".join(interject_accumulated)
|
interjection_text = "".join(interject_accumulated)
|
||||||
|
# Same beat-cap as the primary turn — interjections are
|
||||||
|
# by definition short, but Cydonia-class narrators ignore
|
||||||
|
# that. 2 beats is plenty for a chime-in.
|
||||||
|
interjection_text = trim_to_max_beats(interjection_text, max_beats=2)
|
||||||
|
|
||||||
# Capture the event id (T86 follow-up) so the SSE fragment
|
# Capture the event id (T86 follow-up) so the SSE fragment
|
||||||
# below carries ``id="turn-<n>"`` for in-place swap.
|
# below carries ``id="turn-<n>"`` for in-place swap.
|
||||||
@@ -812,6 +822,14 @@ async def post_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"started_at": chat.get("time"),
|
"started_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference to the assistant_turn that
|
||||||
|
# triggered this transition. Regenerate uses this
|
||||||
|
# to roll back lifecycle transitions when the turn
|
||||||
|
# is superseded. Forward-only — older events
|
||||||
|
# without this field are skipped by rollback.
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
primary_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
elif transition.new_status == "completed":
|
elif transition.new_status == "completed":
|
||||||
@@ -821,6 +839,10 @@ async def post_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"completed_at": chat.get("time"),
|
"completed_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference (see above).
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
primary_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
# Run promotion inline so the artifact-emitting events
|
# Run promotion inline so the artifact-emitting events
|
||||||
@@ -842,6 +864,10 @@ async def post_turn(
|
|||||||
payload={
|
payload={
|
||||||
"event_id": transition.event_id,
|
"event_id": transition.event_id,
|
||||||
"completed_at": chat.get("time"),
|
"completed_at": chat.get("time"),
|
||||||
|
# T114.1: back-reference (see above).
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
primary_assistant_event_id
|
||||||
|
),
|
||||||
},
|
},
|
||||||
)
|
)
|
||||||
# Any other ``new_status`` value falls through silently —
|
# Any other ``new_status`` value falls through silently —
|
||||||
@@ -873,6 +899,20 @@ async def post_turn(
|
|||||||
# mid-stream still meant to close the scene — the cancelled bot
|
# mid-stream still meant to close the scene — the cancelled bot
|
||||||
# beat doesn't invalidate that intent. Pinned by
|
# beat doesn't invalidate that intent. Pinned by
|
||||||
# test_cancelled_turn_still_closes_scene_when_user_prose_signals_close.
|
# test_cancelled_turn_still_closes_scene_when_user_prose_signals_close.
|
||||||
|
#
|
||||||
|
# T108 NOTE — the in-memory append order is correct, but the cancel
|
||||||
|
# path re-raises ``CancelledError`` at the end of ``post_turn``
|
||||||
|
# (see step 11 below). The ``open_db`` dependency teardown skips
|
||||||
|
# ``conn.commit()`` when the consumer raises, which means in
|
||||||
|
# production a genuine cancel currently rolls back ALL post-cancel
|
||||||
|
# writes — including this scene_closed event, the truncated
|
||||||
|
# assistant_turn record, edge updates, and per-POV summaries. The
|
||||||
|
# T74.3 regression test passes only because of a missing
|
||||||
|
# ``import asyncio`` in the test module: the inline mock raises
|
||||||
|
# ``NameError`` instead of ``CancelledError``, which is caught by
|
||||||
|
# the ``except Exception:`` branch and leaves ``cancelled=False``,
|
||||||
|
# so the function returns 204 normally and the commit fires. This
|
||||||
|
# is a transactional bug deferred for triage (T108 report).
|
||||||
if scene is not None and prose.strip():
|
if scene is not None and prose.strip():
|
||||||
container = None
|
container = None
|
||||||
if scene.get("container_id") is not None:
|
if scene.get("container_id") is not None:
|
||||||
|
|||||||
@@ -0,0 +1,205 @@
|
|||||||
|
# Audit: `project()` callers and non-idempotent projector handlers
|
||||||
|
|
||||||
|
**Date:** 2026-04-27
|
||||||
|
**Triggering incident:** commit `0f8bf94` — `kickoff_post` 500'd with
|
||||||
|
`sqlite3.IntegrityError: UNIQUE constraint failed: chats.id` after a
|
||||||
|
second bot's kickoff. Root cause: the route appended events with
|
||||||
|
`append_event()` and then called `project(conn)`, which replays the
|
||||||
|
*entire* event log. The `chat_created` handler in `chat/state/world.py`
|
||||||
|
uses raw `INSERT INTO chats ...` (no `OR REPLACE`/`OR IGNORE`), so on a
|
||||||
|
DB that already had a first bot's chat row, the replay re-hit that row
|
||||||
|
and raised.
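
The failure mode can be reproduced in isolation. The sketch below is a minimal stand-in rather than the project's code: `apply_event`, `project_all`, and `append_and_apply_sketch` are hypothetical names, and the two-table schema is an assumption reduced to what the incident needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT, chat_id TEXT)")
conn.execute("CREATE TABLE chats (id TEXT PRIMARY KEY)")


def apply_event(conn, kind, chat_id):
    # Raw INSERT, mirroring the shape of the non-idempotent chat_created handler.
    if kind == "chat_created":
        conn.execute("INSERT INTO chats (id) VALUES (?)", (chat_id,))


def append_event(conn, kind, chat_id):
    conn.execute("INSERT INTO event_log (kind, chat_id) VALUES (?, ?)", (kind, chat_id))


def project_all(conn):
    # Hypothetical stand-in for project(): replays every logged event in order.
    rows = conn.execute("SELECT kind, chat_id FROM event_log ORDER BY id").fetchall()
    for kind, chat_id in rows:
        apply_event(conn, kind, chat_id)


def append_and_apply_sketch(conn, kind, chat_id):
    # Hypothetical stand-in for append_and_apply(): only the new event is applied.
    append_event(conn, kind, chat_id)
    apply_event(conn, kind, chat_id)


# First bot's kickoff: the chat row is projected once.
append_and_apply_sketch(conn, "chat_created", "chat-bot-1")

# Second bot's kickoff using the old append + full replay shape.
append_event(conn, "chat_created", "chat-bot-2")
try:
    project_all(conn)
    replay_error = None
except sqlite3.IntegrityError as exc:
    replay_error = str(exc)

print(replay_error)
```

The replay never reaches the new event: it fails on the first bot's already-projected row, which is why the new event's own handler being safe does not help.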
|
||||||
|
|
||||||
|
This audit walks the rest of the live request paths to make sure no
|
||||||
|
other route has the same shape, and inventories every projector handler
|
||||||
|
that uses raw `INSERT` so the trade-offs are documented for future
|
||||||
|
hardening passes.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 1 — `project()` callers
|
||||||
|
|
||||||
|
`grep -rn "project(" chat/ --include="*.py"` (excluding the definition
|
||||||
|
itself in `chat/eventlog/projector.py:17` and the local `project_id`
|
||||||
|
type variables that the regex doesn't actually catch):
|
||||||
|
|
||||||
|
| File:line | Caller | Classification |
|
||||||
|
|---|---|---|
|
||||||
|
| `chat/web/bots.py:113` | `bot_create` route — append `bot_authored` then `project(conn)` | **Unsafe (live path) — fixed** |
|
||||||
|
| `chat/web/settings.py:44` | `settings_post` route — append `you_authored` then `project(conn)` | **Unsafe (live path) — fixed** |
|
||||||
|
| `chat/services/rewind.py:110` | `execute_rewind` — clears every projected table then re-projects from the truncated log | **Safe (replay-only)** |
|
||||||
|
| `chat/eventlog/projector.py:17` | Definition site, not a call | n/a |
|
||||||
|
| `tests/test_*.py` (~50 tests) | Test setup pattern: append a sequence of synthetic events into a fresh DB, then `project(conn)` to materialise | **Safe (replay-only)** — projects against an empty/fresh DB; not a live request path |
|
||||||
|
|
||||||
|
### Safe (replay-only)
|
||||||
|
|
||||||
|
- **`chat/services/rewind.py:110`** — `execute_rewind` is the canonical
|
||||||
|
"rebuild the projection" entry point. Lines 95–104 explicitly
|
||||||
|
`DELETE FROM` every projected table (`memories`, `activity`, `scenes`,
|
||||||
|
`containers`, `chat_state`, `chats`, `edges`, `bots`, `you_entity`,
|
||||||
|
`classifier_failures`) before calling `project(conn)`. The handler
|
||||||
|
registry then walks the truncated log against empty tables, so even
|
||||||
|
the raw-INSERT handlers run safely on a clean slate. The module
|
||||||
|
docstring (lines 1–21) calls out exactly why a full replay (rather
|
||||||
|
than a "revert delta") is the right move here: the `edge_update`
|
||||||
|
handler is a delta accumulator with no clean inverse. **Do not
|
||||||
|
change.**
|
||||||
|
|
||||||
|
- **Test suite** — every `from chat.eventlog.projector import project`
|
||||||
|
in `tests/` is a setup helper. They open a fresh in-memory or
|
||||||
|
tmp-path DB, append a hand-crafted sequence of events, and call
|
||||||
|
`project(conn)` once. There is no second-replay risk because the DB
|
||||||
|
starts empty. These are not live paths.
|
||||||
|
|
||||||
|
### Unsafe (live-path) — fixed in this audit
|
||||||
|
|
||||||
|
Both fixes follow the pattern established by `0f8bf94`: drop the
|
||||||
|
`append_event` + `project` pair in favour of `append_and_apply` (defined
|
||||||
|
in `chat/eventlog/log.py:32`), which appends and runs *only the
|
||||||
|
brand-new event* through its registered handler.
|
||||||
|
|
||||||
|
- **`chat/web/bots.py:113` — `bot_create`**
|
||||||
|
Was: `append_event(conn, kind="bot_authored", ...); project(conn)`.
|
||||||
|
Now: `append_and_apply(conn, kind="bot_authored", ...)`.
|
||||||
|
In isolation, `_apply_bot_authored` is itself idempotent (`INSERT OR
|
||||||
|
REPLACE INTO bots`), so the *route* didn't fail today. The bug is
|
||||||
|
latent: as soon as any kickoff ran first (which produces
|
||||||
|
`chat_created` events), the next call to `bot_create` would replay
|
||||||
|
that prior `chat_created` and trip the same UNIQUE constraint. We
|
||||||
|
saw this happen in `0f8bf94` — fixing the symmetric route prevents
|
||||||
|
the next variant of the same incident.
|
||||||
|
Removed unused imports: `append_event`, `project`.
|
||||||
|
|
||||||
|
- **`chat/web/settings.py:44` — `settings_post`**
|
||||||
|
Was: `append_event(conn, kind="you_authored", ...); project(conn)`.
|
||||||
|
Now: `append_and_apply(conn, kind="you_authored", ...)`.
|
||||||
|
Same shape as `bot_create`. `_apply_you_authored` is idempotent on
|
||||||
|
its own (`INSERT OR REPLACE INTO you_entity`), but `project()` walks
|
||||||
|
the *whole* log, including any `chat_created` / `container_created`
|
||||||
|
/ `scene_opened` events that have accumulated. Editing the user's
|
||||||
|
own settings on a non-empty DB would 500 with the same UNIQUE
|
||||||
|
constraint error — not because the new event is unsafe, but because
|
||||||
|
the replay is. Fixed by per-event apply.
|
||||||
|
Removed unused imports: `append_event`, `project`.
|
||||||
|
|
||||||
|
### Unsafe — still to fix
|
||||||
|
|
||||||
|
None. The two unsafe live-path call sites identified above were both
|
||||||
|
fixed in this commit. Future hardening: a CI lint that flags
|
||||||
|
`project(` outside `chat/services/rewind.py` and `tests/` would catch a
|
||||||
|
regression, but that's out of scope here.
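
For whoever picks that lint up later, one possible shape is an AST scan rather than a grep, so that names like `project_id` are not flagged. This is only a sketch under assumptions: `find_project_calls` and `is_violation` are hypothetical helpers, and the allow-list contains just the two locations named above.

```python
import ast

# Paths where a full replay is legitimate, per the audit text.
ALLOWED_PREFIXES = ("chat/services/rewind.py", "tests/")


def find_project_calls(source: str) -> list[int]:
    """Return the line numbers of every call to a function named `project`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id
        else:
            # Covers `module.project(...)`; other callee shapes yield None.
            name = getattr(func, "attr", None)
        if name == "project":
            hits.append(node.lineno)
    return sorted(hits)


def is_violation(path: str, source: str) -> bool:
    """True when a file outside the allow-list calls project()."""
    if path.startswith(ALLOWED_PREFIXES):
        return False
    return bool(find_project_calls(source))
```

A CI job could walk `chat/` with `pathlib`, call `is_violation` on each file, and exit non-zero on the first hit.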
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 2 — non-idempotent projector handler inventory
|
||||||
|
|
||||||
|
Output of `grep -n "INSERT INTO\|INSERT OR REPLACE\|INSERT OR IGNORE"
|
||||||
|
chat/state/*.py`, classified.
|
||||||
|
|
||||||
|
### Replay-safe handlers
|
||||||
|
|
||||||
|
These either use `INSERT OR REPLACE` / `INSERT OR IGNORE` (so a second
|
||||||
|
apply is a no-op or an overwrite of identical data), or are pure
|
||||||
|
`UPDATE` against rows the prior event created.
|
||||||
|
|
||||||
|
| Handler | File | Statement | Why safe |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `_apply_bot_authored` | `chat/state/entities.py:12` | `INSERT OR REPLACE INTO bots` | `id` is the natural PK; replay overwrites with identical payload. |
|
||||||
|
| `_apply_you_authored` | `chat/state/entities.py:29` | `INSERT OR REPLACE INTO you_entity` | Singleton row keyed on `id=1`. |
|
||||||
|
| `_apply_activity_change` | `chat/state/world.py:98` | `INSERT OR REPLACE INTO activity` | Activity is keyed on `entity_id` — last write wins is exactly the intended semantics. |
|
||||||
|
| `_apply_thread_opened` | `chat/state/threads.py:12` | `INSERT OR IGNORE INTO threads` | `thread_id` is the natural PK. |
|
||||||
|
| `_apply_event_planned` | `chat/state/events.py:16` | `INSERT OR IGNORE INTO events` | `event_id` is the natural PK. |
|
||||||
|
| `_apply_branch_created` | `chat/state/branches.py:27` | `INSERT OR IGNORE INTO branches` | Branch `name` is unique. |
|
||||||
|
| `_apply_group_node_initialized` | `chat/state/group_node.py:12` | `INSERT OR REPLACE INTO group_node` | One row per `chat_id`. |
|
||||||
|
| `_apply_embedding_indexed` | `chat/state/embeddings.py:28` | `INSERT OR REPLACE INTO embeddings` | One vector per `memory_id`. |
|
||||||
|
| Pure-`UPDATE` handlers | various — `_apply_time_skip_*`, `_apply_guest_added`/`_removed`, `_apply_scene_closed`, `_apply_memory_significance_set`, `_apply_memory_pin_changed`, `_apply_meanwhile_scene_closed`, `_apply_meanwhile_digest_consumed`, `_apply_thread_updated`, `_apply_event_started`/`_completed`/`_cancelled` (etc.), `_apply_group_node_updated` | n/a | Idempotent: re-applying the same UPDATE produces the same row state. |
|
||||||
|
|
||||||
|
### Unsafe-on-replay handlers (raw `INSERT`)
|
||||||
|
|
||||||
|
| Handler | File | Statement | Failure mode on replay |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `_apply_chat_created` | `chat/state/world.py:14` | `INSERT INTO chats`, `INSERT INTO chat_state` | `chats.id` is PK — second insert raises `IntegrityError: UNIQUE constraint failed: chats.id`. **This is the `0f8bf94` bug.** `chat_state.chat_id` is also unique; would raise too. |
|
||||||
|
| `_apply_container_created` | `chat/state/world.py:78` | `INSERT INTO containers` | `containers.id` is `INTEGER PRIMARY KEY AUTOINCREMENT` — replay does NOT raise (a new id is assigned), but it silently creates a duplicate row, fragmenting downstream lookups by `(chat_id, name)`. **Silent corruption, not a crash.** |
|
||||||
|
| `_apply_scene_opened` | `chat/state/world.py:115` | `INSERT INTO scenes` | Same shape: autoincrement `id`. Replay creates a duplicate scene row and re-points `chat_state.active_scene_id` to the new copy. **Silent corruption.** |
|
||||||
|
| `_apply_memory_written` | `chat/state/memory.py:14` | `INSERT INTO memories` | Autoincrement `id`. Replay duplicates the memory; FTS5 trigger then double-indexes the same `pov_summary`. **Silent corruption + double-counting in retrieval.** |
|
||||||
|
| `_apply_meanwhile_scene_started` | `chat/state/meanwhile.py:29` | `INSERT INTO scenes` (with explicit `scene_id`) | Caller supplies `scene_id` (deterministic). Replay raises `IntegrityError: UNIQUE constraint failed: scenes.id`. **Hard crash, like `chat_created`.** |
|
||||||
|
| `_apply_meanwhile_digest_created` | `chat/state/meanwhile.py:67` | `INSERT INTO meanwhile_digest_pending` | Autoincrement `id`. Replay creates a duplicate pending digest, surfacing the same summary twice in the next you-scene's prompt. **Silent corruption.** |
|
||||||
|
| `_apply_edge_update` | `chat/state/edges.py:12` | `INSERT OR IGNORE INTO edges` followed by `UPDATE … SET affinity = ? + delta` | The `INSERT OR IGNORE` is fine, but the handler is *delta-shaped* — each replay re-adds `affinity_delta` and `trust_delta`, and re-extends `knowledge_json`. **Silent corruption: scores drift up; knowledge facts duplicate.** Already called out in `chat/eventlog/log.py:39-46` as the canonical reason `append_and_apply` exists. |
|
||||||
|
|
||||||
|
### Trade-offs — why we are NOT switching every handler to `INSERT OR REPLACE`
|
||||||
|
|
||||||
|
This is the part the audit is here to nail down before someone "fixes
|
||||||
|
it" with a one-line s/`INSERT INTO`/`INSERT OR REPLACE INTO`/.
|
||||||
|
|
||||||
|
1. **Autoincrement-id handlers (`containers`, `scenes`, `memories`,
|
||||||
|
`meanwhile_digest_pending`)** — `INSERT OR REPLACE` doesn't help.
|
||||||
|
Each event's payload doesn't carry the row's eventual id — the id
|
||||||
|
comes from `lastrowid` *at projection time*. There is no key for
|
||||||
|
`OR REPLACE` to match on. The fix here is either (a) make the event
|
||||||
|
carry a deterministic id derived from the event's own id (large
|
||||||
|
refactor — payload schemas, downstream FK lookups, FTS rowid
|
||||||
|
alignment), or (b) keep the handler raw-INSERT and ensure every
|
||||||
|
live path uses `append_and_apply` (the path we're on). We are on
|
||||||
|
path (b), and this audit makes it explicit.
|
||||||
|
|
||||||
|
2. **`chat_created`** — `chats.id` IS keyed on the natural PK, so
|
||||||
|
`INSERT OR REPLACE INTO chats ...` would technically work for the
|
||||||
|
chat row. *But* it would silently overwrite `chat_state` columns
|
||||||
|
that other events legitimately mutate later: `chat_state.time` is
|
||||||
|
bumped by `time_skip_elision`, `active_scene_id` is set/cleared by
|
||||||
|
`scene_opened`/`scene_closed`. On replay the
|
||||||
|
`chat_created` overwrite would clobber those subsequent updates,
|
||||||
|
then later events would re-set them — *if* the events themselves
|
||||||
|
appear in order (they do today). It would work in practice, but it
|
||||||
|
would erase the invariant that "each handler is responsible for one
|
||||||
|
table-shape change" and make the projector's correctness depend on
|
||||||
|
strict event-order replay through `chat_state`. Not worth the
|
||||||
|
subtle coupling; keep the raw INSERT and treat replay as an
|
||||||
|
explicit "wipe + replay" operation (the rewind path does exactly
|
||||||
|
that).
|
||||||
|
|
||||||
|
3. **`meanwhile_scene_started`** — could be made idempotent (the
|
||||||
|
payload supplies `scene_id`), but it shares the `scenes` table with
|
||||||
|
`_apply_scene_opened` (autoincrement) — making one half of the
|
||||||
|
table writers `OR REPLACE` and the other half raw-INSERT is asking
|
||||||
|
for a future bug. Keep both raw, lean on `append_and_apply`.
|
||||||
|
|
||||||
|
4. **`edge_update`** — fundamentally cannot be made idempotent under
|
||||||
|
replay without either changing the event schema (carry absolute
|
||||||
|
values, not deltas) or recording per-event-id "already applied"
|
||||||
|
flags. Either is a multi-week project. The current contract is
|
||||||
|
"edge_update is a delta event; never apply it twice"; the
|
||||||
|
`append_and_apply` rule enforces that contract from the call site.
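
The two silent-corruption shapes from points 1 and 4 can be seen in a few lines. The sketch below uses a simplified, assumed schema (`memories` with an autoincrement id, `edges` with a single `affinity` column); the handler names are hypothetical reductions of the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memories (id INTEGER PRIMARY KEY AUTOINCREMENT, pov_summary TEXT);
    CREATE TABLE edges (
        src TEXT, dst TEXT, affinity REAL NOT NULL DEFAULT 0,
        PRIMARY KEY (src, dst)
    );
    """
)


def apply_memory_written(conn, summary):
    # No key in the payload for OR REPLACE to match: the id is assigned here.
    conn.execute("INSERT INTO memories (pov_summary) VALUES (?)", (summary,))


def apply_edge_update(conn, src, dst, affinity_delta):
    # The INSERT OR IGNORE is harmless; the delta-shaped UPDATE is not.
    conn.execute("INSERT OR IGNORE INTO edges (src, dst) VALUES (?, ?)", (src, dst))
    conn.execute(
        "UPDATE edges SET affinity = affinity + ? WHERE src = ? AND dst = ?",
        (affinity_delta, src, dst),
    )


# Apply the same two events twice, as a replay without a prior wipe would.
for _ in range(2):
    apply_memory_written(conn, "met at the docks")
    apply_edge_update(conn, "bot", "you", 0.5)

memory_rows = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
affinity = conn.execute("SELECT affinity FROM edges").fetchone()[0]
print(memory_rows, affinity)
```

Neither statement raises. The memory is stored twice and the affinity is double-counted, which is the reason these cases are harder to notice than the `chat_created` crash.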
|
||||||
|
|
||||||
|
**Conclusion:** the handler layer is *correctly* non-idempotent for
|
||||||
|
event-sourcing semantics. The defect class lives in the *caller* layer
|
||||||
|
(routes that mistakenly call `project()` instead of `append_and_apply`).
|
||||||
|
This audit fixes the two known offenders and pins the contract with a
|
||||||
|
regression test (see Step 3).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 3 — regression test
|
||||||
|
|
||||||
|
Added `tests/test_chat_created_non_idempotent.py`. The test:
|
||||||
|
|
||||||
|
1. Opens a fresh DB and runs the migration chain.
|
||||||
|
2. Appends one `chat_created` event and projects — first projection
|
||||||
|
succeeds.
|
||||||
|
3. Appends a *second* `chat_created` for the same chat id and projects
|
||||||
|
again — asserts that the second projection raises
|
||||||
|
`sqlite3.IntegrityError`.
|
||||||
|
|
||||||
|
The point isn't that the test catches a future "make it idempotent"
|
||||||
|
change automatically; it's that any such change MUST update this test,
|
||||||
|
forcing a deliberate review of all the trade-offs documented above.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Files changed
|
||||||
|
|
||||||
|
- `chat/web/bots.py` — swap `append_event`+`project` → `append_and_apply`,
|
||||||
|
drop unused imports.
|
||||||
|
- `chat/web/settings.py` — same swap.
|
||||||
|
- `tests/test_chat_created_non_idempotent.py` — new regression test.
|
||||||
|
- `docs/audits/2026-04-27-project-callers.md` — this file.
|
||||||
@@ -522,6 +522,8 @@ Written per witness when a scene closes. Different details, different interpreta
|
|||||||
|
|
||||||
**Status: shipped 2026-04-27** (T88–T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
|
**Status: shipped 2026-04-27** (T88–T102, 15 tasks across 8 waves; +70 tests). See "Phase 4 status" in CLAUDE.md for the per-task breakdown. Vector retrieval shipped via pure-Python cosine over a JSON-blob embeddings table (sqlite-vec deferred — host Python lacks loadable extensions); branching is data-model + drawer UI; significance review, hide-from-view soft delete, surgical delete with cascade preview, snapshot UX, and cross-chat search all surface from the drawer or top-bar.
|
||||||
|
|
||||||
|
**Phase 4.5 cleanup: shipped 2026-04-27** (T103–T118, 13 of 14 planned tasks; T115 sqlite-vec swap deferred to Phase 5 due to host Python lacking `enable_load_extension`; +~44 tests; schema baseline now 14). See "Phase 4.5 status" in CLAUDE.md for the per-task breakdown — notable shipped: real embedding model swap path (`LLMClient.embed()` + `--re-embed-all`), branching read-side filter (`active_branch_event_ids`), regenerate lifecycle rollback (`event_status_reverted`), FTS snippet highlighting + deep-link to turn (`memories.event_id`), bulk significance re-rate.
|
||||||
|
|
||||||
- Vector retrieval (sqlite-vss or sqlite-vec).
|
- Vector retrieval (sqlite-vss or sqlite-vec).
|
||||||
- Branching UI.
|
- Branching UI.
|
||||||
- Drawer-edit on every field.
|
- Drawer-edit on every field.
|
||||||
|
|||||||
@@ -0,0 +1,724 @@
|
|||||||
|
# Roleplay Engine — Phase 4.5 Cleanup Plan
|
||||||
|
|
||||||
|
> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task. Use the parallel-dispatch pattern documented under "Parallel-Execution Strategy" for parallel waves.
|
||||||
|
|
||||||
|
**Goal:** Burn down all 24 items in `CLAUDE.md` §"Phase 4.5 / 5 backlog". Mix of small defensive cleanups (most), three big features (real embedding model swap, branching read-side filter, lifecycle rollback in regenerate), one environment-dependent feature (sqlite-vec swap), and the long-deferred carry-overs (scene-close-on-cancel revisit, structured test-fixture builder).
|
||||||
|
|
||||||
|
**Architecture:** No new architecture. Two new schema migrations (0014 schema polish, 0015 sqlite-vec virtual tables). One optional new external dependency (`apsw`, needed only if a Python rebuild isn't possible). All other changes are polish / refactor / observability.
|
||||||
|
|
||||||
|
**Tech Stack:**
|
||||||
|
|
||||||
|
- Existing — same as Phase 4.
|
||||||
|
- **OPTIONAL:** rebuild Python with `--enable-loadable-sqlite-extensions` OR install `apsw` to enable T115 sqlite-vec swap. T115 is the only task that requires this; the other 13 tasks land without it. If neither is available, T115 is deferred to Phase 5.
|
||||||
|
|
||||||
|
**Source-of-truth references:**
|
||||||
|
|
||||||
|
- Backlog: [`CLAUDE.md`](../../CLAUDE.md) §"Phase 4.5 / 5 backlog" (24 items grouped by review source + deferred).
|
||||||
|
- Phase 3.5 / Phase 2.5 cleanup plans (pattern reference): [2026-04-26-v3.5-phase3.5-cleanup.md](2026-04-26-v3.5-phase3.5-cleanup.md), [2026-04-26-v2.5-phase2.5-cleanup.md](2026-04-26-v2.5-phase2.5-cleanup.md).
|
||||||
|
- Conventions: [`CLAUDE.md`](../../CLAUDE.md) §"Behavioral defaults" + §"Phase 4 status".
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Pre-flight
|
||||||
|
|
||||||
|
**Branch:** create `phase-4.5` from the latest `main`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git checkout main && git pull && git checkout -b phase-4.5
|
||||||
|
```
|
||||||
|
|
||||||
|
**Schema baseline:** Phase 4 leaves the DB at version 13. Phase 4.5 adds two migrations: `0014_phase45_schema.sql` (T109) and `0015_vec0_virtual_tables.sql` (T115 — only lands if T115 ships). Final schema version: 14 or 15.
|
||||||
|
|
||||||
|
**Optional pre-flight for T115 (sqlite-vec swap):**
|
||||||
|
|
||||||
|
The host Python build needs `enable_load_extension`. Two options:
|
||||||
|
|
||||||
|
1. **Rebuild Python** via pyenv with `PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" pyenv install 3.12.0 --force` and recreate the venv.
|
||||||
|
2. **Add `apsw`** as a dependency and migrate `chat/db/connection.py` to use `apsw.Connection` (significant refactor — the entire codebase uses stdlib `sqlite3`).
|
||||||
|
|
||||||
|
If neither is acceptable, **defer T115** to Phase 5 and ship Phase 4.5 with 13 tasks instead of 14. The other tasks are unaffected.
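
A quick way to decide before starting Wave 8 is to probe the interpreter. The sketch below relies on the stdlib behaviour that `enable_load_extension` is absent from `sqlite3.Connection` when Python was built without loadable-extension support; `supports_loadable_extensions` is a hypothetical helper name.

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """Report whether this Python's sqlite3 module can load SQLite extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method only exists on builds configured with
        # --enable-loadable-sqlite-extensions.
        return hasattr(conn, "enable_load_extension")
    finally:
        conn.close()


if __name__ == "__main__":
    print("T115 feasible on this host:", supports_loadable_extensions())
```

A `False` result means option 1 or option 2 above is required, or T115 is deferred.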
|
||||||
|
|
||||||
|
**Pinned non-negotiables (carried forward):**
|
||||||
|
|
||||||
|
- State changes go through the event log. Use `append_and_apply` for the live path.
|
||||||
|
- Witness filter every memory read at SQL level.
|
||||||
|
- TDD: every task starts with a failing test (or a regression test pinning existing contract before refactor).
|
||||||
|
- One commit per task minimum. Bundled tasks split internally.
|
||||||
|
|
||||||
|
**Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion` — run the test command, paste actual output.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Backlog item → task mapping
|
||||||
|
|
||||||
|
24 items consolidated into 14 tasks by **file ownership**:
|
||||||
|
|
||||||
|
| # | Item | Source | Task |
|
||||||
|
|---|------|--------|------|
|
||||||
|
| 1 | `embeddings` FK lacks `ON DELETE CASCADE` | T88 | **T109** (schema migration) |
|
||||||
|
| 2 | `list_branches(chat_id=...)` global-branch leak — document | T89 | **T103** |
|
||||||
|
| 3 | Branch-switch silently leaves zero active — log warning | T89 | **T103** |
|
||||||
|
| 4 | Real embedding model swap | T91 / deferred | **T112** |
|
||||||
|
| 5 | `timeout_s` fallback-path logging | T91 | **T107** |
|
||||||
|
| 6 | Duplicate `MAX(id)` lookup in retrieval ranking | T96 | **T104** |
|
||||||
|
| 7 | `fts_rank=None` for vector-only rows — document | T96 | **T104** |
|
||||||
|
| 8 | `event_id <= 0` guard in `delete_turn` | T98 | **T110** |
|
||||||
|
| 9 | `html.escape()` on delete-impact modal output | T98 | **T110** |
|
||||||
|
| 10 | Extract delete-impact modal to Jinja partial | T98 | **T110** |
|
||||||
|
| 11 | Hoist `datetime`/`timezone` imports in `snapshots.py` | T99 | **T105** |
|
||||||
|
| 12 | Strict `kind` validation in snapshot routes | T99 | **T105** |
|
||||||
|
| 13 | `created_at` from file mtime — document drift risk | T99 | **T105** |
|
||||||
|
| 14 | Hardcoded `k=50` → module constant | T100 | **T106** |
|
||||||
|
| 15 | N+1 lookups in search results | T100 | **T106** |
|
||||||
|
| 16 | FTS highlighting via `snippet()` | T100 | **T111** |
|
||||||
|
| 17 | Result links chat-level only — add deep-link via memories.event_id | T100 | **T109** + **T111** |
|
||||||
|
| 18 | sqlite-vec swap when host Python supports loadable extensions | deferred | **T115** |
|
||||||
|
| 19 | Branching read-side filter (consult `is_active`) | deferred | **T113** |
|
||||||
|
| 20 | Bulk significance re-rate in drawer | deferred | **T110** |
|
||||||
|
| 21 | Vector index optimization (HNSW) | deferred | **T115** (post-ship note) |
|
||||||
|
| 22 | Scene-close-on-cancel UX revisit | Phase 2.5 carry-over | **T108** |
|
||||||
|
| 23 | Cross-feature canned-queue brittleness fixture builder | Phase 3 carry-over | **T116** |
|
||||||
|
| 24 | Full lifecycle-rollback in regenerate | Phase 3.5 carry-over | **T114** |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Parallel-Execution Strategy
|
||||||
|
|
||||||
|
Same pattern as Phase 3.5 / Phase 2.5 / Phase 4. Nine waves: parallel within each wave (file-disjoint), serial across waves.
|
||||||
|
|
||||||
|
### How to dispatch a wave in parallel
|
||||||
|
|
||||||
|
Use the **Agent tool with `isolation: "worktree"`**. (If the controlling session's working directory is **not** the chat repo, create worktrees manually with `git worktree add .worktrees/<wave>-<task> -b <wave>/<task> phase-4.5`.)
|
||||||
|
|
||||||
|
### After a wave completes
|
||||||
|
|
||||||
|
1. Each subagent returns its worktree path and commit SHA(s).
|
||||||
|
2. **Run a spec + code-quality reviewer subagent on each completed task.** Combined review acceptable for trivial tasks (T103–T108); separate spec + quality reviewers for big tasks (T112, T113, T114, T115).
|
||||||
|
3. **Merge the wave into `phase-4.5`** in any order (file-disjointness guarantees no conflict). Use `--no-ff`.
|
||||||
|
4. **Run the full test suite** on the merged `phase-4.5`.
|
||||||
|
5. **Push `phase-4.5`** to gitea.
|
||||||
|
6. Optionally clean up worktrees.
|
||||||
|
|
||||||
|
### Conflict prevention checklist
|
||||||
|
|
||||||
|
For each parallel wave, verify the **Files** sections of all tasks have **no overlapping paths**. Hot files in this plan (each owned by exactly one task): `chat/state/memory.py`, `chat/web/drawer.py`, `chat/web/search.py`, `chat/services/regenerate.py`, `chat/services/turn_common.py`, `chat/services/embeddings.py`, `chat/db/migrations/`.
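
The check can be automated. The following is a minimal sketch, assuming the **Files** sections have been transcribed by hand into a dict; `overlapping_paths` is a hypothetical helper, and `WAVE_1` lists the paths given in the Wave 1 task specs.

```python
from itertools import combinations

# Files touched by each Wave 1 task, copied from the task specs in this plan.
WAVE_1 = {
    "T103": ["chat/state/branches.py", "tests/test_branches_state.py"],
    "T104": ["chat/state/memory.py", "tests/test_memory_search.py"],
    "T105": ["chat/web/snapshots.py", "tests/test_snapshot_ux.py"],
    "T106": ["chat/web/search.py", "tests/test_search_ux.py"],
    "T107": ["chat/services/embeddings.py", "tests/test_embeddings.py"],
    "T108": ["tests/test_turn_flow.py"],
}


def overlapping_paths(wave):
    """Map each pair of tasks to the set of paths they both touch."""
    conflicts = {}
    for task_a, task_b in combinations(sorted(wave), 2):
        shared = set(wave[task_a]) & set(wave[task_b])
        if shared:
            conflicts[(task_a, task_b)] = shared
    return conflicts


print(overlapping_paths(WAVE_1))
```

An empty result means the wave is safe to dispatch in parallel; any entry names the two tasks that must be serialized or re-scoped.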
|
||||||
|
|
||||||
|
### Why each wave is parallel-safe
|
||||||
|
|
||||||
|
| Wave | Tasks | Hot files | Disjoint? |
|
||||||
|
|------|-------|-----------|-----------|
|
||||||
|
| 1 | T103, T104, T105, T106, T107, T108 | 6 different files; no overlap | ✅ |
|
||||||
|
| 2 | T109 | new migration + minor projector update | (single task) |
|
||||||
|
| 3 | T110 | `chat/web/drawer.py` (bundle) | (single task) |
|
||||||
|
| 4 | T111 | `chat/services/cross_chat_search.py` + `chat/web/search.py` + template | (single task; depends on T109) |
|
||||||
|
| 5 | T112 | `chat/services/embeddings.py` + `chat/llm/*.py` (Protocol + Featherless + Mock) | (single task) |
|
||||||
|
| 6 | T113 | `chat/services/turn_common.py` + multiple readers (cross-cutting) | (single task) |
|
||||||
|
| 7 | T114 | `chat/services/regenerate.py` + projector handler | (single task) |
|
||||||
|
| 8 | T115 | new migration + `chat/services/vector_search.py` + `chat/db/connection.py` | (single task; environmental) |
|
||||||
|
| 9 | T116, T117, T118 | new test fixture file (T116); new test file (T117); CLAUDE.md (T118) | ✅ |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Task overview
|
||||||
|
|
||||||
|
```
|
||||||
|
Wave 1 ─┬─ T103: branches polish (global-branch doc + branch-switch warning)
|
||||||
|
├─ T104: state/memory.py polish (DRY MAX(id) + fts_rank doc)
|
||||||
|
├─ T105: snapshots.py polish (datetime hoist + kind validation + mtime doc)
|
||||||
|
├─ T106: search.py polish (k constant + N+1 batched lookups)
|
||||||
|
├─ T107: embeddings.py timeout_s fallback-path logging
|
||||||
|
└─ T108: scene-close-on-cancel UX revisit (pin behavior with regression test)
|
||||||
|
|
||||||
|
Wave 2 ─── T109: 0014 schema migration (FK CASCADE + memories.event_id column)
|
||||||
|
|
||||||
|
Wave 3 ─── T110: drawer Phase 4.5 bundle (event_id guard + html.escape + modal partial + bulk sig re-rate)
|
||||||
|
|
||||||
|
Wave 4 ─── T111: search UX enhancements (FTS snippet() highlighting + deep-link via memories.event_id)
|
||||||
|
|
||||||
|
Wave 5 ─── T112: real embedding model swap (LLMClient.embed protocol + Featherless impl + generate_embedding routing + backfill)
|
||||||
|
|
||||||
|
Wave 6 ─── T113: branching read-side filter (event readers consult is_active branch range)
|
||||||
|
|
||||||
|
Wave 7 ─── T114: regenerate lifecycle rollback (back-reference field + compensating events on supersede)
|
||||||
|
|
||||||
|
Wave 8 ─── T115: sqlite-vec swap (vec0 virtual tables + MATCH-based vector_search) [ENVIRONMENTAL — see pre-flight]
|
||||||
|
|
||||||
|
Wave 9 ─┬─ T116: structured test-fixture builder (canned-queue brittleness)
|
||||||
|
├─ T117: Phase 4.5 cross-feature integration tests
|
||||||
|
└─ T118: docs sweep — Phase 4.5 status, prune backlog, capture Phase 5 residuals
|
||||||
|
```
|
||||||
|
|
||||||
|
Critical path: 9 sequential merge points. Total tasks: 14 (or 13 if T115 deferred). Parallelism: Waves 1 (6-way) and 9 (3-way) each dispatch their own tasks in parallel; the waves themselves still run in sequence. Waves 2–8 are single-task by hot-file constraint.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 1 — Independent small fixes (parallel, 6 tasks)
|
||||||
|
|
||||||
|
All small and file-disjoint. Each is roughly a one-line change plus one test.
|
||||||
|
|
||||||
|
### Task 103: branches polish
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/state/branches.py`
|
||||||
|
- Modify: `tests/test_branches_state.py`
|
||||||
|
|
||||||
|
**Spec (2 sub-fixes, single commit):**
|
||||||
|
|
||||||
|
1. **Document global-branch leak**: the `list_branches(chat_id=...)` filter (`chat_id = ? OR chat_id IS NULL`) returns global (null-chat) branches such as "main" in every chat scope. Add a docstring note explaining that this is intentional ("main" is global by design; per-chat branches are scoped).
|
||||||
|
|
||||||
|
2. **Warn on branch-switch to nonexistent name**: in `_apply_branch_switched`, before the SQL UPDATE, check if a branch with the given name exists. If not, emit `logging.getLogger(__name__).warning(...)` rather than silently leaving zero active branches.
|
||||||
|
|
||||||
|
**Test:** `test_branch_switched_unknown_name_warns` — capture log via `caplog`, append `branch_switched` for nonexistent name, assert warning message + no active branch (existing behavior preserved, just observable).
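
The second sub-fix might look roughly like this. It is a sketch under assumptions: the `branches` schema, the UPDATE statement, and the message text are guesses, `apply_branch_switched` stands in for `_apply_branch_switched`, and a small list handler replaces pytest's `caplog` so the example runs on its own.

```python
import logging
import sqlite3

_log = logging.getLogger("branches_sketch")


class _ListHandler(logging.Handler):
    """Collects formatted messages; stands in for pytest's caplog."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def apply_branch_switched(conn, name):
    exists = conn.execute(
        "SELECT 1 FROM branches WHERE name = ?", (name,)
    ).fetchone()
    if exists is None:
        _log.warning(
            "branch_switched: no branch named %r; zero branches will be active",
            name,
        )
    # Behaviour is unchanged: an unknown name still deactivates every branch.
    conn.execute("UPDATE branches SET is_active = (name = ?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branches (name TEXT PRIMARY KEY, is_active INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO branches (name, is_active) VALUES ('main', 1)")

handler = _ListHandler()
_log.addHandler(handler)
apply_branch_switched(conn, "ghost")

active_count = conn.execute(
    "SELECT COUNT(*) FROM branches WHERE is_active = 1"
).fetchone()[0]
print(active_count, handler.messages)
```

The warning makes the zero-active state observable without changing what the projector writes, which keeps replay results identical to today's.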
|
||||||
|
|
||||||
|
**Commit:** `chore: branches polish — global-leak docs + unknown-name warning (T103)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 104: state/memory.py polish
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/state/memory.py`
|
||||||
|
- Modify: `tests/test_memory_search.py` (no new tests; just add docstring assertions if needed)
|
||||||
|
|
||||||
|
**Spec (2 sub-fixes):**
|
||||||
|
|
||||||
|
1. **DRY `MAX(id)` lookup**: `_composite_rerank` (Phase 3.5 T57) and `_rrf_fuse_and_rerank` (Phase 4 T96) both query `SELECT MAX(id) FROM event_log` for the recency boost. Extract a `_max_event_id(conn)` helper.
|
||||||
|
|
||||||
|
2. **`fts_rank=None` documentation**: the `search_memories` docstring should note that vector-only rows have `fts_rank=None`. Downstream consumers must accept `None` (they currently do, but the contract is implicit).
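
For sub-fix 1, the extracted helper could be as small as the sketch below. Returning `0` for an empty log is an assumption about the desired contract, not something the plan specifies, and the `event_log` schema here is reduced to the one column the query needs.

```python
import sqlite3


def _max_event_id(conn: sqlite3.Connection) -> int:
    """Highest event id in the log, or 0 when the log is empty (assumed contract)."""
    row = conn.execute("SELECT MAX(id) FROM event_log").fetchone()
    # MAX() over zero rows yields NULL, which sqlite3 surfaces as None.
    return row[0] if row[0] is not None else 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
empty_max = _max_event_id(conn)

conn.executemany("INSERT INTO event_log (kind) VALUES (?)", [("a",), ("b",), ("c",)])
filled_max = _max_event_id(conn)
print(empty_max, filled_max)
```

Both rerank functions would then call the helper once instead of embedding the query inline.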
|
||||||
|
|
||||||
|
**Test:** existing tests cover both via the public API; no new test needed unless docstring assertion is desired.
|
||||||
|
|
||||||
|
**Commit:** `chore: memory.py DRY MAX(id) helper + document fts_rank=None contract (T104)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 105: snapshots.py polish
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/web/snapshots.py`
|
||||||
|
- Modify: `tests/test_snapshot_ux.py` (1 new test)
|
||||||
|
|
||||||
|
**Spec (3 sub-fixes):**
|
||||||
|
|
||||||
|
1. **Hoist `datetime`/`timezone` imports** to module level (currently inside `_list_all_snapshots`).
|
||||||
|
|
||||||
|
2. **Strict `kind` validation in restore/preview routes**: currently `kind` defaults to `"periodic"`. If a rewind snapshot is requested without explicit `kind`, the lookup silently 404s. Reject missing `kind` with a 400 instead of silently defaulting.
|
||||||
|
|
||||||
|
3. **Document `created_at` mtime drift risk** in module docstring: snapshot timestamps come from file mtime, not the encoded filename timestamp. Files copied via `cp -p` preserve mtime; `cp` without `-p` resets it. Add a one-line note.
|
||||||
|
|
||||||
|
**Test:** `test_restore_without_kind_returns_400` — POST `/snapshots/restore/<id>` without `kind`; assert 400.
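
Sub-fix 2 reduces to a small framework-independent check. In this sketch, `validate_snapshot_kind` is a hypothetical helper, and the set of valid kinds is an assumption based on the two kinds this plan mentions.

```python
# Assumed set of kinds: the plan only mentions "periodic" and rewind snapshots.
VALID_SNAPSHOT_KINDS = frozenset({"periodic", "rewind"})


def validate_snapshot_kind(kind):
    """Return (status_code, detail); the route would turn a 400 into an error response."""
    if kind is None or kind == "":
        # No silent default: the caller must say which kind it means.
        return 400, "missing required parameter: kind"
    if kind not in VALID_SNAPSHOT_KINDS:
        return 400, f"unknown snapshot kind: {kind!r}"
    return 200, kind


print(validate_snapshot_kind(None))
print(validate_snapshot_kind("rewind"))
```

The restore and preview routes would call this before the lookup, so a missing `kind` fails loudly instead of falling through to a misleading 404.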
|
||||||
|
|
||||||
|
**Commit:** `chore: snapshots.py polish — hoisted imports + strict kind + mtime doc (T105)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 106: search.py polish
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/web/search.py`
|
||||||
|
- Modify: `tests/test_search_ux.py` (1 new test)
|
||||||
|
|
||||||
|
**Spec (2 sub-fixes):**
|
||||||
|
|
||||||
|
1. **Hardcoded `k=50` → module constant**: extract `DEFAULT_SEARCH_K = 50` at module level. Tunable without code change at the call site.
|
||||||
|
|
||||||
|
2. **N+1 lookup batching**: GET `/search?q=...` currently calls `get_bot(conn, owner_id)`, `get_chat(conn, chat_id)`, `get_scene(conn, scene_id)` per result row (worst case 50×3 = 150 individual queries). Batch via `WHERE id IN (...)` queries: collect distinct ids first, fetch in 3 batched queries, then map back per row.
|
||||||
|
|
||||||
|
**Test:** `test_search_results_use_batched_lookups` — mock `get_bot`/`get_chat`/`get_scene` and assert each is called once (not per row). OR easier: time the search with 50 results and assert it doesn't degrade linearly with `k`.
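
The batching in sub-fix 2 can be sketched as one helper called three times. Everything here is illustrative: `fetch_names_by_id` is a hypothetical helper, the `id`/`name` columns are an assumed schema, and the table allow-list exists because the table name is interpolated into the SQL.

```python
import sqlite3

_LOOKUP_TABLES = frozenset({"bots", "chats", "scenes"})


def fetch_names_by_id(conn, table, ids):
    """Fetch id -> name for all ids in a single WHERE id IN (...) query."""
    if table not in _LOOKUP_TABLES:
        raise ValueError(f"unexpected table: {table!r}")
    unique_ids = sorted(set(ids))
    if not unique_ids:
        return {}
    placeholders = ", ".join("?" for _ in unique_ids)
    cursor = conn.execute(
        f"SELECT id, name FROM {table} WHERE id IN ({placeholders})", unique_ids
    )
    return dict(cursor.fetchall())


conn = sqlite3.connect(":memory:")
for table in sorted(_LOOKUP_TABLES):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO bots VALUES (?, ?)", [(1, "Ada"), (2, "Bram")])
conn.execute("INSERT INTO chats VALUES (10, 'Harbor')")
conn.executemany("INSERT INTO scenes VALUES (?, ?)", [(100, "Docks"), (101, "Market")])

results = [
    {"owner_id": 1, "chat_id": 10, "scene_id": 100},
    {"owner_id": 2, "chat_id": 10, "scene_id": 101},
    {"owner_id": 1, "chat_id": 10, "scene_id": 100},
]

# Three queries in total, regardless of how many result rows there are.
bots = fetch_names_by_id(conn, "bots", (r["owner_id"] for r in results))
chats = fetch_names_by_id(conn, "chats", (r["chat_id"] for r in results))
scenes = fetch_names_by_id(conn, "scenes", (r["scene_id"] for r in results))

enriched = [
    (bots[r["owner_id"]], chats[r["chat_id"]], scenes[r["scene_id"]])
    for r in results
]
print(enriched)
```

With `k=50` SQLite's bound-parameter limit is not a concern; a much larger `k` would need the id list chunked.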
|
||||||
|
|
||||||
|
**Commit:** `perf: search.py N+1 batching + k constant extraction (T106)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 107: embeddings.py timeout_s fallback-path logging
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/services/embeddings.py`
|
||||||
|
- Modify: `tests/test_embeddings.py` (1 new test)
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
When `model != DEFAULT_EMBEDDING_MODEL` and the call falls through to the fallback (zero vector with `model="fallback"`), log a `warning` so misconfigured callers (e.g., a Phase 4.5+ caller pointing at a real model that doesn't exist) don't silently degrade.
|
||||||
|
|
||||||
|
```python
if model != DEFAULT_EMBEDDING_MODEL:
    _log.warning(
        "generate_embedding: non-default model %r returned fallback "
        "(model client.embed() not yet implemented in Phase 4.5+); "
        "downstream search will degrade silently. Configure a supported model.",
        model,
    )
return EmbeddingResult(...)  # fallback
```
|
||||||
|
|
||||||
|
The Phase 4 default path (`model == DEFAULT_EMBEDDING_MODEL` → pseudo-embedding) is silent; only non-default models trigger the warning.
|
||||||
|
|
||||||
|
**Test:** `test_generate_embedding_non_default_model_logs_warning` — call with `model="real-model"`; capture log via `caplog`; assert the warning message appears.
|
||||||
|
|
||||||
|
**Commit:** `chore: embeddings.py warns on fallback for non-default models (T107)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 108: scene-close-on-cancel UX revisit
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `tests/test_turn_flow.py` (extend the existing pin test added in Phase 2.5 T74.3 OR add a new one)
|
||||||
|
- Optionally modify: `chat/web/turns.py` if a real bug surfaces during investigation
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
This carry-over has been pending since Phase 2.5 T74.3. The pinned behavior: scene close fires even when the primary turn is cancelled mid-stream, because `detect_scene_close` consults user prose (fully present at cancel time), not bot output.
|
||||||
|
|
||||||
|
**Action:**
|
||||||
|
|
||||||
|
1. **Re-investigate** by reading the post_turn cancellation path. Confirm the rationale still holds (it should — nothing about the close-detection logic changed in Phase 3 or 4).
|
||||||
|
2. **Strengthen the regression test** in `tests/test_turn_flow.py` (the existing `test_cancelled_turn_still_closes_scene_when_user_prose_signals_close`). Add an assertion that the user prose IS present at the moment scene_close_decision fires (even though the bot output isn't).
|
||||||
|
3. If investigation surfaces an actual UX issue (e.g., the close fires too eagerly on prose like "fade out... actually wait"), this becomes a real fix — but default action is documentation-only.
|
||||||
|
|
||||||
|
**Default outcome:** add a docstring comment to the post_turn close-detection branch explaining the rationale. No behavioral change.
|
||||||
|
|
||||||
|
**Test (extend existing):** assert ordering — `scene_closed` event lands AFTER the user_turn event but BEFORE any potential assistant_turn (which is cancelled). Pin the contract.
|
||||||
|
|
||||||
|
**Commit:** `chore: scene-close-on-cancel — strengthen regression test + document rationale (T108)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 2 — Schema migration (single)
|
||||||
|
|
||||||
|
### Task 109: 0014 schema migration
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Create: `chat/db/migrations/0014_phase45_schema.sql`
|
||||||
|
- Modify: `chat/state/memory.py` or `chat/services/memory_write.py` (populate the new `event_id` column on memory_written)
|
||||||
|
- Modify: `tests/test_world.py` (bump schema_version assertion to 14)
|
||||||
|
- Modify: `tests/test_memory_write.py` (assert event_id populated)
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Two schema changes bundled into a single migration:
|
||||||
|
|
||||||
|
1. **`embeddings.memory_id` FK gets `ON DELETE CASCADE`** (T88 review nit). SQLite's `ALTER TABLE` cannot modify an existing column's constraints, so the change needs a table rebuild: create the new table, copy the data, drop the old table, rename the new table into place, then recreate indices. Alternatively, since this is a new-ish table (Phase 4 added it) and the change is purely defensive, document as "WONTFIX in 4.5; deindex events remain the only deletion path; ON DELETE CASCADE remains a Phase 5 candidate when we do a broader migration cleanup". Choose pragmatically.
|
||||||
|
|
||||||
|
2. **Add `memories.event_id INTEGER` column** (NULL allowed for backward compat) referencing `event_log.id`. This is the foundation for T111's deep-linking from cross-chat search results to specific turns. Migration adds the column; the projector for `memory_written` populates it from the event id when projecting.
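If the CASCADE change in item 1 is applied rather than deferred, the table rebuild could be sketched as follows. The column list is an assumption (only `memory_id` and `vector_json` are taken from this plan), and `rebuild_embeddings_with_cascade` is a hypothetical helper; the real migration would live in the `.sql` file and carry over every column and index.

```python
import sqlite3


def rebuild_embeddings_with_cascade(conn: sqlite3.Connection) -> None:
    """Rebuild `embeddings` so its FK to memories has ON DELETE CASCADE."""
    # PRAGMA foreign_keys is ignored inside a transaction, so close any open one.
    conn.commit()
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executescript(
        """
        BEGIN;
        CREATE TABLE embeddings_new (
            memory_id INTEGER PRIMARY KEY
                REFERENCES memories(id) ON DELETE CASCADE,
            vector_json TEXT NOT NULL
        );
        INSERT INTO embeddings_new (memory_id, vector_json)
            SELECT memory_id, vector_json FROM embeddings;
        DROP TABLE embeddings;
        ALTER TABLE embeddings_new RENAME TO embeddings;
        COMMIT;
        """
    )
    # Cascades only fire when foreign-key enforcement is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
```

Note that the cascade only takes effect on connections that enable `PRAGMA foreign_keys`, which SQLite leaves off by default.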
|
||||||
|
|
||||||
|
**Production code change:** in the `memory_written` projector handler (in `chat/state/memory.py` or wherever it lives), populate the new `event_id` column with the projecting event's `id`. The `Event` object has `id` available in the projector context.
|
||||||
|
|
||||||
|
**Tests:**
|
||||||
|
|
||||||
|
1. `test_schema_version_after_migration_is_14` (rename + bump from 13).
|
||||||
|
2. `test_memory_written_populates_event_id` — append memory_written; project; query memories table; assert `event_id` is the projecting event's id.
|
||||||
|
3. (Backward compat) older memories from existing seed data have NULL `event_id` — the column is nullable.
|
||||||
|
|
||||||
|
**Commit:** `feat: 0014 schema — embeddings FK CASCADE (deferred or applied) + memories.event_id column (T109)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 3 — Drawer Phase 4.5 bundle (single)
|
||||||
|
|
||||||
|
### Task 110: drawer polish + bulk significance re-rate
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/web/drawer.py`
|
||||||
|
- Modify: `chat/templates/_drawer.html`
|
||||||
|
- Create: `chat/templates/_delete_impact_modal.html` (extracted partial)
|
||||||
|
- Modify: `chat/state/manual_edit.py` (potentially — if bulk re-rate emits a new manual_edit kind)
|
||||||
|
- Modify: `tests/test_drawer_phase4.py` (extend with 4-5 new tests)
|
||||||
|
|
||||||
|
**Spec (4 sub-fixes, 4 commits):**
|
||||||
|
|
||||||
|
1. **`event_id <= 0` guard in `delete_turn`** (T98 nit): currently silently rewinds everything if `event_id` is 0. Add `if event_id <= 0: raise HTTPException(400, "...")`.
|
||||||
|
|
||||||
|
2. **`html.escape()` on delete-impact modal** (T98 nit): the rendered HTML in `compute_delete_impact` output is built via raw f-strings from model-controlled strings. Wrap user-controllable fields with `html.escape()`. This is defense-in-depth: the output is currently safe, but if event payload fields ever appear in descriptions, escaping them prevents XSS.
|
||||||
|
|
||||||
|
3. **Extract delete-impact modal HTML to a Jinja partial**: create `chat/templates/_delete_impact_modal.html`; render via `templates.TemplateResponse(...)` instead of f-string concatenation. Inherits Jinja2 autoescape automatically. Tests use the existing TestClient pattern.
|
||||||
|
|
||||||
|
4. **Bulk significance re-rate** (T98.2 deferral): drawer panel showing memory significance distribution per chat. New POST route `/chats/{chat_id}/drawer/memory/significance/bulk` accepting `{level_from, level_to}` form fields. Updates ALL memories in the chat at `level_from` to `level_to` via a sequence of `manual_edit` events (one per memory — preserves the audit trail).
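For sub-fix 2, a minimal sketch of escaping a user-controllable field before it is interpolated into an f-string. The function name and the markup are hypothetical; only `html.escape()` itself is the standard-library call the spec names.

```python
import html


def render_impact_row(description: str, count: int) -> str:
    """Render one delete-impact line with the description HTML-escaped."""
    # html.escape() converts &, <, > and (by default) quotes to entities.
    safe_description = html.escape(description)
    return f'<li class="impact-row">{safe_description}: {count}</li>'
```

Once sub-fix 3 moves the markup into a Jinja partial with autoescape enabled, the manual `html.escape()` call would no longer be needed for fields rendered through the template.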
|
||||||
|
|
||||||
|
**Tests:**
|
||||||
|
|
||||||
|
1. `test_delete_turn_with_event_id_zero_returns_400`.
|
||||||
|
2. `test_delete_impact_modal_uses_jinja_partial` (assert response renders the partial template; verify with `assert b"<div class=\"delete-impact-modal\">" in response.content` or similar).
|
||||||
|
3. `test_delete_impact_modal_escapes_user_controllable_strings` — seed an event with a payload containing `<script>` in a description-bound field; render preview; assert it appears HTML-escaped.
|
||||||
|
4. `test_bulk_significance_re_rate_emits_manual_edit_per_memory` — seed 5 memories at significance 0; bulk re-rate to 2; assert 5 `manual_edit` events landed.
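The one-event-per-memory behavior pinned by test 4 could be planned with a pure function like the sketch below. The payload keys (`target`, `memory_id`, `from`, `to`) and the name `plan_bulk_rerate` are assumptions; the real `manual_edit` payload shape is defined in `chat/state/manual_edit.py`.

```python
from typing import Iterable


def plan_bulk_rerate(memories: Iterable[dict], level_from: int, level_to: int) -> list:
    """Build one manual_edit event per memory currently at level_from."""
    if level_from == level_to:
        return []  # nothing to change, so emit no audit noise
    return [
        {
            "kind": "manual_edit",
            "payload": {
                "target": "memory_significance",  # hypothetical discriminator
                "memory_id": memory["id"],
                "from": level_from,
                "to": level_to,
            },
        }
        for memory in memories
        if memory["significance"] == level_from
    ]
```

Keeping the planning step separate from the event append makes the audit-trail contract easy to assert without a database.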
|
||||||
|
|
||||||
|
**Commits (4):**
|
||||||
|
- `fix: drawer delete_turn guards event_id <= 0 (T110.1)`
|
||||||
|
- `fix: drawer delete-impact modal HTML escapes user-controllable fields (T110.2)`
|
||||||
|
- `refactor: drawer delete-impact modal extracted to Jinja partial (T110.3)`
|
||||||
|
- `feat: drawer bulk significance re-rate per chat (T110.4)`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 4 — Search UX enhancements (single)
|
||||||
|
|
||||||
|
### Task 111: FTS highlighting + deep-link to turn
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/services/cross_chat_search.py`
|
||||||
|
- Modify: `chat/web/search.py`
|
||||||
|
- Modify: `chat/templates/search.html`
|
||||||
|
- Modify: `tests/test_search_ux.py`
|
||||||
|
|
||||||
|
**Spec (2 sub-fixes, 2 commits):**
|
||||||
|
|
||||||
|
1. **FTS highlighting via `snippet()`** (T100 nit): replace the `pov_summary` column in `search_all_memories`'s SELECT with `snippet(memories_fts, 0, '<mark>', '</mark>', '…', 32)` to return a highlighted snippet around the match. Note that SQLite's `snippet()` copies the indexed text verbatim and does not escape HTML special characters, so rendering its output raw via `|safe` would let stored markup such as `<script>` through. Pass non-HTML sentinel markers to `snippet()` instead, HTML-escape the result, and only then swap the sentinels for `<mark>`/`</mark>` before marking the string safe for the template.
|
||||||
|
|
||||||
|
2. **Deep-link to turn via memories.event_id** (T100 nit + T109 dependency): now that `memories.event_id` exists (from T109), each search result row knows the originating event id. The chat page uses turn-id stamping (Phase 3.5 T86 added `id="turn-{event_id}"`). Build result links as `/chats/{chat_id}#turn-{event_id}`. The chat page DOM scrolls to the anchor on load (browser default).
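For sub-fix 1, one way to keep the highlighted snippet safe to render is to have `snippet()` emit control-character sentinels and convert them to `<mark>` tags only after escaping. This is a sketch under assumptions: `SNIPPET_SQL`, the sentinel choice, and `render_snippet` are illustrative, not the project's existing code.

```python
import html

# Sentinels that should not occur in ordinary prose; SQLite's char(2)/char(3)
# produce the same code points on the SQL side.
MARK_OPEN = "\x02"
MARK_CLOSE = "\x03"

# Assumed SELECT expression; column index 0 and 32 tokens follow the spec.
SNIPPET_SQL = "snippet(memories_fts, 0, char(2), char(3), '…', 32)"


def render_snippet(raw: str) -> str:
    """Escape the snippet text, then turn sentinels into <mark> tags."""
    escaped = html.escape(raw)  # leaves the control-character sentinels intact
    return escaped.replace(MARK_OPEN, "<mark>").replace(MARK_CLOSE, "</mark>")
```

The returned string contains only `<mark>` tags as live HTML, so it is the value that can be marked safe in the template.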
|
||||||
|
|
||||||
|
**Tests:**
|
||||||
|
|
||||||
|
1. `test_search_results_include_fts_snippet_with_highlight` — seed memory with text containing "rabbit"; search for "rabbit"; assert response body contains `<mark>rabbit</mark>` (or whatever marker the snippet uses).
|
||||||
|
2. `test_search_result_link_includes_turn_anchor` — seed memory with known event_id; search; assert link href contains `#turn-{event_id}`.
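The link format in sub-fix 2 can be sketched as a tiny helper. `result_href` is a hypothetical name; the fallback branch covers older memories whose `event_id` is NULL (see T109), which this sketch assumes should link to the chat without an anchor.

```python
from typing import Optional
from urllib.parse import quote


def result_href(chat_id: str, event_id: Optional[int]) -> str:
    """Build a search-result link, anchoring to the turn when it is known."""
    base = f"/chats/{quote(str(chat_id), safe='')}"
    if event_id is None:
        # Pre-T109 memories have no originating event recorded.
        return base
    return f"{base}#turn-{event_id}"
```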
|
||||||
|
|
||||||
|
**Commits (2):**
|
||||||
|
- `feat: cross-chat search FTS snippet highlighting (T111.1)`
|
||||||
|
- `feat: cross-chat search deep-links to turn via memories.event_id (T111.2)`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 5 — Real embedding model (single)
|
||||||
|
|
||||||
|
### Task 112: Real embedding model swap
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/llm/client.py` (Protocol — add `embed(text, model) -> list[float]` method)
|
||||||
|
- Modify: `chat/llm/featherless.py` (FeatherlessClient — implement `embed` against Featherless `/v1/embeddings` endpoint OR equivalent)
|
||||||
|
- Modify: `chat/llm/mock.py` (MockLLMClient — accept canned embedding vectors)
|
||||||
|
- Modify: `chat/services/embeddings.py` (route non-default model through `client.embed()`)
|
||||||
|
- Modify: `chat/config.py` (add `embedding_model: str` setting; default to current pseudo)
|
||||||
|
- Modify: `scripts/backfill_embeddings.py` (re-embed-all option for model swaps)
|
||||||
|
- Modify: `tests/test_embeddings.py` + `tests/test_llm_mock.py` + `tests/test_featherless.py` (if exists)
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Phase 4 ships a deterministic SHA-256 pseudo-embedding (deterministic but semantically meaningless). T112 wires the path for a real embedding model.
|
||||||
|
|
||||||
|
**Steps:**
|
||||||
|
|
||||||
|
1. **Extend `LLMClient` Protocol** with `async def embed(self, text: str, *, model: str) -> list[float]`.
|
||||||
|
|
||||||
|
2. **Implement on FeatherlessClient**: call the Featherless OpenAI-compatible `/v1/embeddings` endpoint:
|
||||||
|
```python
response = await self._http.post(
    "/v1/embeddings",
    json={"model": model, "input": text},
    headers={"Authorization": f"Bearer {self._api_key}"},
)
# Assumes an httpx-style response: raise on 4xx/5xx so the caller's
# fallback path sees the failure instead of a KeyError below.
response.raise_for_status()
data = response.json()
return data["data"][0]["embedding"]
```
|
||||||
|
Handle rate limits (existing 2-conn semaphore covers this).
|
||||||
|
|
||||||
|
3. **Implement on MockLLMClient**: `embed` pops a canned vector from a new `canned_embeddings` queue. Tests configure this queue.
|
||||||
|
|
||||||
|
4. **Update `generate_embedding`**: when `model != DEFAULT_EMBEDDING_MODEL`, call `client.embed(text, model=model)` instead of falling through to fallback. Wrap in try/except — failures fall back to zero vector (existing fallback path).
|
||||||
|
|
||||||
|
5. **Settings**: add `embedding_model: str = "pseudo-sha256-384"` to `Settings`. App reads this at startup; the embedding worker (`chat/services/embedding_worker.py`) passes it through.
|
||||||
|
|
||||||
|
6. **Backfill script**: add `--re-embed-all` flag that walks ALL memories (regardless of existing `embeddings_meta` rows) and re-embeds with the configured model. Useful for swapping models.
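Steps 3 and 4 above can be sketched together: a mock client that pops canned vectors, and a routing helper that falls back to a zero vector on failure. Everything here is illustrative: `FakeEmbedClient`, `embed_non_default`, the `EmbeddingResult` fields, and the 384-dimension constant are assumptions, not the project's actual definitions.

```python
import asyncio
import logging
from collections import deque
from dataclasses import dataclass

_log = logging.getLogger(__name__)

EMBEDDING_DIM = 384  # assumption, matching the "pseudo-sha256-384" naming


@dataclass
class EmbeddingResult:
    vector: list
    model: str


class FakeEmbedClient:
    """Stand-in for MockLLMClient: embed() pops from a canned queue."""

    def __init__(self, canned_embeddings):
        self.canned_embeddings = deque(canned_embeddings)

    async def embed(self, text: str, *, model: str) -> list:
        if not self.canned_embeddings:
            raise RuntimeError("no canned embedding left")
        return self.canned_embeddings.popleft()


async def embed_non_default(client, text: str, *, model: str) -> EmbeddingResult:
    """Route a non-default model through client.embed(), with fallback."""
    try:
        vector = await client.embed(text, model=model)
    except Exception:
        # Any client failure degrades to the existing zero-vector fallback.
        _log.warning("embed failed for model %r; using fallback", model, exc_info=True)
        return EmbeddingResult(vector=[0.0] * EMBEDDING_DIM, model="fallback")
    return EmbeddingResult(vector=vector, model=model)
```

Returning `model="fallback"` on failure lets a later backfill find and re-embed the affected rows.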
|
||||||
|
|
||||||
|
**Tests:**
|
||||||
|
|
||||||
|
1. `test_embed_routes_to_client_when_non_default_model` — mock client with canned vector; call `generate_embedding(model="bge-small-en-v1.5")`; assert vector matches the canned response.
|
||||||
|
2. `test_embed_falls_back_on_client_failure` — mock client to raise; assert returns zero vector with model="fallback".
|
||||||
|
3. `test_mock_llm_client_embed_pops_canned`.
|
||||||
|
4. `test_featherless_embed_calls_correct_endpoint` (if there's an existing featherless test pattern; otherwise mock the HTTP layer).
|
||||||
|
|
||||||
|
**Commits:**
|
||||||
|
- `feat: LLMClient Protocol gains embed() method (T112.1)`
|
||||||
|
- `feat: FeatherlessClient.embed() against /v1/embeddings (T112.2)`
|
||||||
|
- `feat: generate_embedding routes non-default models through client.embed (T112.3)`
|
||||||
|
- `feat: backfill_embeddings --re-embed-all flag for model swaps (T112.4)`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 6 — Branching read-side filter (single, BIG)
|
||||||
|
|
||||||
|
### Task 113: Branching read-side filter
|
||||||
|
|
||||||
|
**Files (cross-cutting):**
|
||||||
|
- Modify: `chat/services/turn_common.py::read_recent_dialogue` — filter events to active branch's range
|
||||||
|
- Modify: `chat/services/scene_summarize.py::_read_recent_dialogue` (similar)
|
||||||
|
- Modify: `chat/state/memory.py::search_memories` — memories should be filtered to active branch (memories.event_id from T109 enables this)
|
||||||
|
- Modify: `chat/state/branches.py` — add helper `active_branch_event_ids(conn) -> tuple[int, int]` returning (origin, head)
|
||||||
|
- Add tests across multiple files
|
||||||
|
- Modify: `tests/test_branching.py` — add cross-feature tests
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Phase 4 T89 + T94 shipped branching as metadata-only (the table tracks branches; the drawer UI can switch). But event readers DON'T consult `is_active` — they read the entire event_log. So switching branches has no functional effect.
|
||||||
|
|
||||||
|
T113 wires the filter:
|
||||||
|
|
||||||
|
1. **Helper** `active_branch_event_ids(conn) -> tuple[int, int]`: returns `(origin_event_id, head_event_id)` for the currently active branch. For "main" with origin=0 + head=N, returns `(0, N)` meaning "all events visible".
|
||||||
|
|
||||||
|
2. **Apply filter** in every event reader that returns historical state:
|
||||||
|
- `read_recent_dialogue`: WHERE clause adds `id BETWEEN ? AND ?` (the active branch's range).
|
||||||
|
- `search_memories`: WHERE clause adds `m.event_id BETWEEN ? AND ?` (uses T109's column).
|
||||||
|
- `scene_summarize._read_recent_dialogue`: same as turn_common.
|
||||||
|
- Other readers TBD — grep for `event_log` SELECT patterns and audit each one.
|
||||||
|
|
||||||
|
3. **Branches that diverge**: when branch B is created from event 10 and then accumulates events 11-15 (which only exist on B's timeline), but main also accumulates 11-12, the events overlap by id range. This is OK because event reads filter by `id <= active_branch.head_event_id`. The simpler model: branches share event_log ids globally, but each branch's "head" defines which ids are visible.
|
||||||
|
|
||||||
|
4. **Events written under branch B** carry an implicit branch tag — but the event_log table has no `branch_id` column today. T113 punts on cross-branch event writes (they all land in the global log) and relies on the `head_event_id` filter to scope reads. This is a Phase 4.5+ first cut; full branch-isolated event_log is Phase 5+.
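The helper and the `read_recent_dialogue` filter described above might look like this sketch. The `branches` and `event_log` table layouts are assumptions reduced to the columns the spec mentions, and returning `None` for "no active branch" is one possible way to express the defensive fall-through.

```python
import sqlite3
from typing import Optional


def active_branch_event_ids(conn: sqlite3.Connection) -> Optional[tuple]:
    """Return (origin_event_id, head_event_id) of the active branch, if any."""
    row = conn.execute(
        "SELECT origin_event_id, head_event_id FROM branches "
        "WHERE is_active = 1 LIMIT 1"
    ).fetchone()
    return (row[0], row[1]) if row else None


def read_recent_dialogue(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Read the newest events visible on the active branch, oldest first."""
    bounds = active_branch_event_ids(conn)  # re-queried on every call, no cache
    sql = "SELECT id, kind FROM event_log"
    params = []
    if bounds is not None:
        sql += " WHERE id BETWEEN ? AND ?"
        params.extend(bounds)
    # With no active branch the WHERE clause is omitted: all events visible.
    sql += " ORDER BY id DESC LIMIT ?"
    params.append(limit)
    rows = conn.execute(sql, params).fetchall()
    return rows[::-1]
```

The sketch follows the `BETWEEN origin AND head` form from step 2; whether a forked branch should also see shared history below its origin is a decision the implementer still has to make.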
|
||||||
|
|
||||||
|
**Edge cases:**
|
||||||
|
|
||||||
|
- Active branch has `head_event_id = 0` (just created): readers return empty.
|
||||||
|
- No active branch: readers fall through to "all events visible" (defensive).
|
||||||
|
- Switching branches mid-flight: each `read_recent_dialogue` call re-queries `active_branch`, so it's always current. No caching.
|
||||||
|
|
||||||
|
**Tests:** 5+ minimum.
|
||||||
|
|
||||||
|
1. `test_read_recent_dialogue_respects_active_branch_head` — seed 10 events; active branch head = 5; assert only first 5 returned.
|
||||||
|
2. `test_search_memories_respects_active_branch_head` — same.
|
||||||
|
3. `test_branch_switch_changes_visible_events` — switch branches; immediately read; assert different result sets.
|
||||||
|
4. `test_main_branch_with_head_zero_returns_empty` — defensive.
|
||||||
|
5. `test_no_active_branch_falls_through_to_all_events` — defensive.
|
||||||
|
|
||||||
|
**Commit:** `feat: branching read-side filter — event readers consult active branch range (T113)`.
|
||||||
|
|
||||||
|
**This is the largest task in Phase 4.5.** Estimate 200-400 lines across multiple files. Implementer should split commits if it helps clarity (one per affected reader).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 7 — Lifecycle rollback in regenerate (single)
|
||||||
|
|
||||||
|
### Task 114: Lifecycle rollback
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `chat/services/regenerate.py`
|
||||||
|
- Modify: `chat/db/migrations/0014_phase45_schema.sql` (T109's migration) — add column? OR
|
||||||
|
- Add new migration — see decision below
|
||||||
|
- Modify: tests in `tests/test_regenerate.py`
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Phase 3.5 T83.4 shipped a warning log when regenerate detects un-rolled-back lifecycle transitions. T114 implements actual rollback.
|
||||||
|
|
||||||
|
**Schema decision:**
|
||||||
|
|
||||||
|
Option A: extend lifecycle event payloads with `triggered_by_assistant_turn_id` (no schema change needed, just a payload convention). Production code (T61 turn flow) populates it when emitting `event_started`/`event_completed`/`event_cancelled`. Existing rows predate the convention and lack the field, so rollback skips them with a debug log.
|
||||||
|
|
||||||
|
Option B: add a column to `event_log` for stronger invariants. Significant migration cost.
|
||||||
|
|
||||||
|
**Recommended:** Option A. Safer, no migration, backward compatible (older events skip rollback). Document in commit body.
|
||||||
|
|
||||||
|
**Rollback semantics:**
|
||||||
|
|
||||||
|
When regenerate detects lifecycle events triggered by the superseded turn:
|
||||||
|
- `event_started` → emit `event_cancelled` (or a NEW `event_started_undone` event kind that reverts status to "planned") with the same event_id.
|
||||||
|
- `event_completed` → emit `event_uncompleted` (NEW event kind that reverts status from "completed" to "active").
|
||||||
|
- `event_cancelled` → emit `event_uncancelled` (reverts to prior status — which we'd need to track; or simpler: emit `event_started` again to restore "active").
|
||||||
|
|
||||||
|
**Simpler approach (recommended):** add ONE new event kind `event_status_reverted` with payload `{event_id, prior_status}`. The projector sets `events.status = prior_status` for the event_id. Rollback emits this event for each affected lifecycle transition, looking up the prior status from the row's history (via event_log scan) or accepting it as a payload field.
|
||||||
|
|
||||||
|
**Production code change:** in `chat/web/turns.py::post_turn` (and `chat/services/regenerate.py`), when emitting `event_started`/`event_completed`/`event_cancelled`, populate `triggered_by_assistant_turn_id: <id>` in the payload. Forward-only — older code doesn't need updating.
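The recommended `event_status_reverted` approach could be planned from an event_log scan as in the sketch below. The in-memory log shape, the `"cancelled"` status name, and `plan_lifecycle_rollback` are assumptions; the other status names and payload fields come from the spec above.

```python
# Status each lifecycle kind moves an event into ("cancelled" is assumed).
NEW_STATUS = {
    "event_started": "active",
    "event_completed": "completed",
    "event_cancelled": "cancelled",
}


def plan_lifecycle_rollback(event_log: list, superseded_turn_id: int) -> list:
    """Return event_status_reverted events undoing one turn's transitions."""
    status = {}   # event_id -> current status while replaying the log
    reverts = []
    for entry in event_log:
        new_status = NEW_STATUS.get(entry["kind"])
        if new_status is None:
            continue
        payload = entry["payload"]
        event_id = payload["event_id"]
        prior = status.get(event_id, "planned")  # events start as "planned"
        status[event_id] = new_status
        back_ref = payload.get("triggered_by_assistant_turn_id")
        if back_ref is None:
            continue  # older event without back-reference: skipped
        if back_ref == superseded_turn_id:
            reverts.append(
                {
                    "kind": "event_status_reverted",
                    "payload": {"event_id": event_id, "prior_status": prior},
                }
            )
    # Undo the most recent transition first.
    reverts.reverse()
    return reverts
```

Replaying the log to find `prior_status` avoids hard-coding a per-kind inverse, which matters for `event_cancelled`, where the prior status can be either "planned" or "active".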
|
||||||
|
|
||||||
|
**Tests:** 3 minimum.
|
||||||
|
|
||||||
|
1. `test_regenerate_rolls_back_event_started_from_superseded_turn` — seed an event; play a turn that starts it; regenerate; assert `event_status_reverted` event landed with `prior_status="planned"` and the events row is back to "planned".
|
||||||
|
2. `test_regenerate_rolls_back_event_completed_to_active` — same but completed → active rollback.
|
||||||
|
3. `test_regenerate_skips_events_without_back_reference` — older events without `triggered_by_assistant_turn_id` are not rolled back (debug log). Pin the backward-compat behavior.
|
||||||
|
|
||||||
|
**Commits:**
|
||||||
|
- `feat: lifecycle events carry triggered_by_assistant_turn_id back-reference (T114.1)`
|
||||||
|
- `feat: event_status_reverted event kind + projector handler (T114.2)`
|
||||||
|
- `feat: regenerate rolls back lifecycle transitions on supersede (T114.3)`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 8 — sqlite-vec swap (single, ENVIRONMENTAL)
|
||||||
|
|
||||||
|
### Task 115: sqlite-vec swap (optional)
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Create: `chat/db/migrations/0015_vec0_virtual_tables.sql`
|
||||||
|
- Modify: `chat/db/connection.py` (load extension on every connection)
|
||||||
|
- Modify: `chat/services/vector_search.py` (rewrite to use vec0 MATCH instead of pure-Python cosine)
|
||||||
|
- Modify: `chat/state/embeddings.py` (writer needs to populate vec0 table)
|
||||||
|
- Modify: `pyproject.toml` (add `sqlite-vec` dependency)
|
||||||
|
|
||||||
|
**Pre-flight:**
|
||||||
|
|
||||||
|
This task REQUIRES one of:
|
||||||
|
- Python rebuilt with `--enable-loadable-sqlite-extensions` (pyenv reinstall).
|
||||||
|
- `apsw` migration of `chat/db/connection.py`.
|
||||||
|
|
||||||
|
If neither is feasible at the time of execution: SKIP THIS TASK and document the deferral in T118 docs sweep. The other 13 Phase 4.5 tasks ship without it.
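The pre-flight check for the first option can be automated with a small probe. This is a sketch, with `can_load_sqlite_extensions` as a hypothetical helper name: when CPython is built without loadable-extension support, `sqlite3.Connection` has no `enable_load_extension` method at all.

```python
import sqlite3


def can_load_sqlite_extensions() -> bool:
    """Report whether this Python build allows loading SQLite extensions."""
    if not hasattr(sqlite3.Connection, "enable_load_extension"):
        return False  # built without --enable-loadable-sqlite-extensions
    conn = sqlite3.connect(":memory:")
    try:
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (sqlite3.OperationalError, sqlite3.NotSupportedError):
        return False
    finally:
        conn.close()
```

Running this once before starting T115 gives a clear go/skip signal without touching the application database.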
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
1. **Migration** `0015_vec0_virtual_tables.sql`:
|
||||||
|
```sql
CREATE VIRTUAL TABLE embeddings_vec USING vec0(
    memory_id INTEGER PRIMARY KEY,
    embedding FLOAT[384]
);
-- Backfill from existing JSON embeddings table.
INSERT INTO embeddings_vec (memory_id, embedding)
SELECT memory_id, vec_f32(vector_json) FROM embeddings;
```
|
||||||
|
|
||||||
|
2. **`chat/db/connection.py`** loads `sqlite_vec` extension on every connection:
|
||||||
|
```python
import sqlite3

import sqlite_vec

def open_db(...):
    conn = sqlite3.connect(...)
    # Enable extension loading only for the duration of the load call.
    conn.enable_load_extension(True)
    sqlite_vec.load(conn)
    conn.enable_load_extension(False)
    ...
```
|
||||||
|
|
||||||
|
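3. **Rewrite `vector_search.py`** to use `embeddings_vec MATCH ?` syntax with `k=?` clause. As far as we can tell, vec0 selects the `k` nearest rows before the joined `memories` filters apply, so pass a `k` larger than the final `LIMIT` when the owner or witness filters are selective: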
|
||||||
|
```sql
SELECT m.id, m.pov_summary, m.significance, e.distance
FROM embeddings_vec e
JOIN memories m ON m.id = e.memory_id
WHERE e.embedding MATCH ? AND k = ?
  AND m.owner_id = ?
  AND m.witness_<role> = 1
ORDER BY e.distance ASC
LIMIT ?
```
|
||||||
|
|
||||||
|
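4. **HNSW note**: as far as we know, vec0 currently performs brute-force (flat) KNN search and does not provide an HNSW index. T115 ships the flat scan (sufficient for < few thousand memories). Document the ANN upgrade path (HNSW, or whatever index sqlite-vec offers by then) in CLAUDE.md if memory counts ever grow past flat-index feasibility.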
|
||||||
|
|
||||||
|
5. **Old `embeddings` JSON table**: keep alongside `embeddings_vec` (data redundancy is fine; the JSON table is the source of truth and `embeddings_vec` is the index). Backfill on migration. Keep the `embedding_indexed` projector populating both.
|
||||||
|
|
||||||
|
**Tests:** update `tests/test_vector_search.py` only where it depends on implementation details. The observable contract is unchanged, so all 5 existing tests should pass post-swap.
|
||||||
|
|
||||||
|
**Commit:** `feat: sqlite-vec swap (vec0 virtual tables + MATCH-based search) (T115)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wave 9 — Polish (parallel, 3 tasks)
|
||||||
|
|
||||||
|
### Task 116: Structured test-fixture builder
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Create: `tests/fixtures.py` (or extend `tests/conftest.py`)
|
||||||
|
- Modify: existing test files that use brittle canned-queue arrays (selectively)
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Phase 3 carry-over. Tests across `test_turn_flow.py`, `test_meanwhile_turn_flow.py`, `test_phase3_integration.py`, `test_phase4_integration.py` use positional canned-response arrays for `MockLLMClient`. Adding a new classifier call to a code path requires updating canned arrays in many tests.
|
||||||
|
|
||||||
|
**Solution:** structured fixture builder that lets tests declare their classifier expectations by name, not position:
|
||||||
|
|
||||||
|
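```python
# tests/fixtures.py
import json


class CannedQueue:
    def __init__(self):
        self._queue = []

    def _push(self, payload):
        # Queue one canned response and return self so calls can be chained.
        self._queue.append(payload)
        return self

    # Payload keys below are placeholders; match each classifier's real schema.
    def parse_turn(self, **fields): return self._push(fields)
    def narrative(self, text: str): return self._push({"text": text})
    def state_update(self, **fields): return self._push(fields)
    def detect_scene_close(self, should_close: bool): return self._push({"should_close": should_close})
    def detect_event_transitions(self, transitions: list[dict]): return self._push({"transitions": transitions})
    def summarize_scene(self, summary: str, **fields): return self._push({"summary": summary, **fields})
    def detect_threads(self, candidates: list[dict]): return self._push({"candidates": candidates})
    # ... one method per classifier service

    def build(self) -> list[str]:
        return [json.dumps(item) for item in self._queue]
```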
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
|
||||||
|
```python
def test_post_turn_with_event_transition(...):
    canned = (
        CannedQueue()
        .parse_turn(intent="narrative")
        .narrative("BotA speaks.")  # narrative is a stream, but for simplicity treat it like a canned response
        .state_update(affinity_delta=0, trust_delta=0)
        .state_update(affinity_delta=0, trust_delta=0)
        .detect_event_transitions([{"event_id": "evt_1", "new_status": "completed"}])
        .detect_scene_close(should_close=False)
        .build()
    )
    mock = MockLLMClient(canned=canned)
    # ...
```
|
||||||
|
|
||||||
|
**Migration scope:** don't migrate ALL existing tests at once — that's a separate massive refactor. Instead, ship the fixture builder + migrate 2-3 representative tests as proof of concept. Document the migration path in the fixture's docstring.
|
||||||
|
|
||||||
|
**Tests:** the fixture builder itself doesn't need extensive testing — it's just a builder. Add 1-2 sanity tests that the JSON output matches expected shapes.
|
||||||
|
|
||||||
|
**Commit:** `test: structured CannedQueue fixture builder for classifier mocks (T116)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 117: Phase 4.5 cross-feature integration tests
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Create: `tests/test_phase45_integration.py`
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
End-to-end multi-feature flows specific to Phase 4.5 changes. 5 tests minimum.
|
||||||
|
|
||||||
|
1. **Real embedding swap + retrieval** — configure `embedding_model="bge-small-en-v1.5"` (mocked); write a memory; backfill or wait for worker; assert vector search returns the memory via `client.embed`-derived vector (not pseudo).
|
||||||
|
|
||||||
|
2. **Branching read-side filter end-to-end** — create a branch from turn 5; switch; play 3 turns on the branch; switch back to main; assert main's recent dialogue is missing the branch turns (read filter respects active branch's head).
|
||||||
|
|
||||||
|
3. **Lifecycle rollback** — start an event via a turn; regenerate that turn; assert lifecycle reverted (event back to "planned").
|
||||||
|
|
||||||
|
4. **Search deep-link** — write memories; search; click a result; verify the chat page renders with the right turn anchored (assert via TestClient response — either the browser anchor OR a server-side scroll-to-anchor mechanism).
|
||||||
|
|
||||||
|
5. **Bulk significance re-rate end-to-end** — seed 5 memories at significance 0; bulk re-rate via drawer; verify significance histogram updates.
|
||||||
|
|
||||||
|
**Commit:** `test: phase 4.5 cross-feature integration coverage (T117)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 118: Phase 4.5 documentation update
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `CLAUDE.md`
|
||||||
|
- Modify: `docs/plans/2026-04-26-v1-requirements-design.md` (annotate §13 Phase 4 entries — though they're already shipped per Phase 4 T102)
|
||||||
|
|
||||||
|
**Spec:**
|
||||||
|
|
||||||
|
Mirror the Phase 3.5 / 2.5 status sections. Document:
|
||||||
|
|
||||||
|
- All shipped items per task (T103–T117).
|
||||||
|
- Empty out the Phase 4.5 / 5 backlog (replace with single "All items shipped" line).
|
||||||
|
- Add new "Phase 5 backlog" section if any Phase 4.5 reviews surfaced new follow-ups.
|
||||||
|
|
||||||
|
**Phase 5 backlog candidates** (default, if no new follow-ups discovered):
|
||||||
|
|
||||||
|
- Vector index optimization (HNSW) when memory counts grow past flat-index feasibility.
|
||||||
|
- Branch-isolated event_log (each branch has its own physical event_log range vs the current shared id space + head filter).
|
||||||
|
- Embedding model swap migration tooling — when changing models, need to re-embed everything; T112 added `--re-embed-all` but a more orchestrated swap (drain old worker, re-seed all memories, swap config) is Phase 5+.
|
||||||
|
- Real-time collaborative branching (multi-user) — out of scope for v1.
|
||||||
|
- Avatars / portraits (multimodality) — deferred indefinitely per design §14.
|
||||||
|
|
||||||
|
**Commit:** `docs: phase 4.5 status, prune backlog, capture phase 5 candidates (T118)`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wrap-up
|
||||||
|
|
||||||
|
After Wave 9 lands:
|
||||||
|
|
||||||
|
1. **Run full suite** on `phase-4.5`: should be ~430+ tests passing (413 from Phase 4 + ~20 new across Phase 4.5).
|
||||||
|
2. **Manual smoke** (recommended before opening the PR):
|
||||||
|
- Configure `embedding_model="bge-small-en-v1.5"` (or whatever real model is chosen); restart server; play a turn; verify `embedding_indexed` events use the real model and search returns semantically-relevant memories.
|
||||||
|
- Create a branch, switch, play turns, switch back — verify main's history is unaffected.
|
||||||
|
- Plan an event, complete it via a turn, regenerate that turn — verify event reverts to "planned".
|
||||||
|
- Use cross-chat search; click a result; verify it lands on the right turn in the chat page.
|
||||||
|
- Bulk re-rate a chat's significance distribution.
|
||||||
|
3. **Push `phase-4.5`** to gitea.
|
||||||
|
4. **Open PR** `phase-4.5 → main`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Notes for the controller running this plan
|
||||||
|
|
||||||
|
- **T115 (sqlite-vec swap)** is environmental. If pre-flight fails (no rebuilt Python, no apsw), defer to Phase 5 and ship Phase 4.5 with 13 tasks. T118 docs sweep should note the deferral.
|
||||||
|
- **T112 (real embedding swap)** assumes Featherless or similar exposes an `/v1/embeddings` endpoint. If not available, document the gap and ship the Protocol + Mock impl only (Featherless impl deferred). The pseudo path remains the default in that case — same as Phase 4.
|
||||||
|
- **T113 (branching read-side filter)** is the riskiest task. Cross-cutting. Land it on a quiet branch, test thoroughly. If integration tests break in unexpected ways, bisect the affected reader and add coverage.
|
||||||
|
- **After each parallel wave**, run a code-review subagent. A combined spec+quality review is acceptable for trivial tasks (T103–T108); use separate spec and quality reviewers for big tasks (T112, T113, T114, T115).
|
||||||
|
- **Token-spend rough estimate**: Phase 4.5 should be ~50% the size of Phase 4 (similar number of tasks, mostly smaller). Big tasks (T112, T113, T114) bring the per-task spend up but parallelism in Wave 1 + Wave 9 brings the wall-clock down.
|
||||||
|
- **DO NOT break existing v1/v2/v3/v3.5/v4 surface contracts.** Every test file that was green at the start of Phase 4.5 must stay green at the end. The cross-feature integration tests (`tests/test_phase4_integration.py`, `tests/test_phase3_integration.py`) are particularly load-bearing.
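
The T115 pre-flight mentioned in the first note could be scripted roughly as follows. This is a minimal sketch, not the project's actual check: it assumes "rebuilt Python" means a CPython build whose stdlib `sqlite3` supports loadable extensions, and the function name `sqlite_vec_preflight` is hypothetical.

```python
import importlib.util
import sqlite3


def sqlite_vec_preflight() -> dict[str, bool]:
    """Report which of the two T115 prerequisites this interpreter meets."""
    # Assumption: "rebuilt Python" refers to loadable-extension support.
    # The stdlib sqlite3 module only exposes enable_load_extension when
    # CPython was compiled with that support.
    stdlib_ok = hasattr(sqlite3.Connection, "enable_load_extension")
    # apsw is the alternative driver named in the plan; only check that
    # it is importable, without importing it.
    apsw_ok = importlib.util.find_spec("apsw") is not None
    return {
        "stdlib_extensions": stdlib_ok,
        "apsw": apsw_ok,
        "can_proceed": stdlib_ok or apsw_ok,
    }


if __name__ == "__main__":
    for name, ok in sqlite_vec_preflight().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If `can_proceed` is false, the note above applies: defer T115 to Phase 5 and record the deferral in T118.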
|
||||||
@@ -0,0 +1,23 @@
|
|||||||
|
{
|
||||||
|
"planPath": "docs/plans/2026-04-27-v4.5-phase4.5-cleanup.md",
|
||||||
|
"tasks": [
|
||||||
|
{"id": 103, "subject": "T103: branches polish (global-leak doc + branch-switch warning)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 104, "subject": "T104: state/memory.py polish (DRY MAX(id) + fts_rank doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 105, "subject": "T105: snapshots.py polish (datetime hoist + kind validation + mtime doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 106, "subject": "T106: search.py polish (k constant + N+1 batched lookups)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 107, "subject": "T107: embeddings.py timeout_s fallback-path logging", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 108, "subject": "T108: scene-close-on-cancel UX revisit (regression test pin + rationale doc)", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||||
|
{"id": 109, "subject": "T109: 0014 schema migration (FK CASCADE + memories.event_id column)", "status": "pending", "wave": 2, "parallelGroup": null},
|
||||||
|
{"id": 110, "subject": "T110: drawer Phase 4.5 bundle (event_id guard + html.escape + modal partial + bulk sig re-rate)", "status": "pending", "wave": 3, "parallelGroup": null, "blockedBy": [109]},
|
||||||
|
{"id": 111, "subject": "T111: search UX (FTS snippet highlighting + deep-link via memories.event_id)", "status": "pending", "wave": 4, "parallelGroup": null, "blockedBy": [109]},
|
||||||
|
{"id": 112, "subject": "T112: real embedding model swap (LLMClient.embed protocol + Featherless impl + routing)", "status": "pending", "wave": 5, "parallelGroup": null},
|
||||||
|
{"id": 113, "subject": "T113: branching read-side filter (event readers consult is_active branch range)", "status": "pending", "wave": 6, "parallelGroup": null, "blockedBy": [109]},
|
||||||
|
{"id": 114, "subject": "T114: regenerate lifecycle rollback (back-reference + event_status_reverted)", "status": "pending", "wave": 7, "parallelGroup": null},
|
||||||
|
{"id": 115, "subject": "T115: sqlite-vec swap (vec0 virtual tables + MATCH search) [ENVIRONMENTAL — may defer]", "status": "pending", "wave": 8, "parallelGroup": null},
|
||||||
|
{"id": 116, "subject": "T116: structured CannedQueue test fixture builder", "status": "pending", "wave": 9, "parallelGroup": "wave-9"},
|
||||||
|
{"id": 117, "subject": "T117: phase 4.5 cross-feature integration tests", "status": "pending", "wave": 9, "parallelGroup": "wave-9", "blockedBy": [110, 111, 112, 113, 114]},
|
||||||
|
{"id": 118, "subject": "T118: phase 4.5 docs sweep — prune backlog, capture phase 5 candidates", "status": "pending", "wave": 9, "parallelGroup": "wave-9", "blockedBy": [110, 111, 112, 113, 114]}
|
||||||
|
],
|
||||||
|
"lastUpdated": "2026-04-27T00:00:00Z",
|
||||||
|
"notes": "16 tasks across 9 waves consolidating all 24 items in CLAUDE.md Phase 4.5/5 backlog. Wave 1 (6-way parallel) and Wave 9 (3-way parallel) maximize parallelism. Waves 2-8 are single-task by hot-file constraint. T115 (sqlite-vec swap) requires Python rebuild OR apsw migration — environmental; may defer to Phase 5. Schema baseline 13 -> 14 (T109's 0014) -> optionally 15 (T115's 0015). Big tasks: T112 (real embedding swap), T113 (branching read-side filter — riskiest), T114 (lifecycle rollback). Uses task ids T103-T118."
|
||||||
|
}
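
The `wave` and `blockedBy` fields above imply an ordering rule: a task should land in a later wave than everything that blocks it. A small sketch of how that could be checked before running the plan is shown here; `check_waves` is a hypothetical helper, and `SAMPLE` is a trimmed copy of three entries rather than the full file.

```python
import json


def check_waves(plan: dict) -> list[str]:
    """Return human-readable problems; an empty list means the plan is consistent."""
    wave_of = {task["id"]: task["wave"] for task in plan["tasks"]}
    problems: list[str] = []
    for task in plan["tasks"]:
        for blocker in task.get("blockedBy", []):
            if blocker not in wave_of:
                problems.append(f"T{task['id']} is blocked by unknown T{blocker}")
            elif wave_of[blocker] >= task["wave"]:
                problems.append(
                    f"T{task['id']} (wave {task['wave']}) does not come after "
                    f"blocker T{blocker} (wave {wave_of[blocker]})"
                )
    return problems


# Trimmed sample: only the fields the check reads, with T117's blocker
# list shortened to a single entry.
SAMPLE = json.loads("""
{"tasks": [
  {"id": 109, "wave": 2},
  {"id": 110, "wave": 3, "blockedBy": [109]},
  {"id": 117, "wave": 9, "blockedBy": [110]}
]}
""")

if __name__ == "__main__":
    print(check_waves(SAMPLE))  # prints []
```

Loading the real file with `json.load` and passing the result to `check_waves` would apply the same rule to all 16 tasks.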
|
||||||
@@ -8,8 +8,21 @@ Phase 4 ships the deterministic local pseudo-embedding so this script
|
|||||||
runs synchronously without a network round-trip — the LLMClient argument
|
runs synchronously without a network round-trip — the LLMClient argument
|
||||||
is not needed on the pseudo path. Phase 4.5+ will need a real client.
|
is not needed on the pseudo path. Phase 4.5+ will need a real client.
|
||||||
|
|
||||||
|
T112 (Phase 4.5) adds two flags:
|
||||||
|
|
||||||
|
* ``--re-embed-all`` walks **every** memory regardless of whether it
|
||||||
|
already has an ``embeddings`` row. Useful when swapping embedding
|
||||||
|
models — the projector is INSERT OR REPLACE, so re-emitting an event
|
||||||
|
for an existing memory replaces the prior vector. Without this flag,
|
||||||
|
the script keeps the Phase 4 behavior of only filling in gaps.
|
||||||
|
* ``--model M`` overrides ``Settings.embedding_model`` for this run.
|
||||||
|
Defaults to the configured model (which itself defaults to
|
||||||
|
``"pseudo-sha256-384"``).
|
||||||
|
|
||||||
Run from the repo root:
|
Run from the repo root:
|
||||||
.venv/bin/python scripts/backfill_embeddings.py [--limit N] [--dry-run]
|
.venv/bin/python scripts/backfill_embeddings.py [--limit N] [--dry-run]
|
||||||
|
.venv/bin/python scripts/backfill_embeddings.py --re-embed-all
|
||||||
|
.venv/bin/python scripts/backfill_embeddings.py --re-embed-all --model bge-small-en-v1.5
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
@@ -17,11 +30,12 @@ from __future__ import annotations
|
|||||||
import argparse
|
import argparse
|
||||||
import asyncio
|
import asyncio
|
||||||
|
|
||||||
from chat.config import load_settings
|
from chat.config import Settings, load_settings
|
||||||
from chat.db.connection import open_db
|
from chat.db.connection import open_db
|
||||||
from chat.db.migrate import apply_migrations
|
from chat.db.migrate import apply_migrations
|
||||||
from chat.eventlog.log import append_and_apply
|
from chat.eventlog.log import append_and_apply
|
||||||
from chat.services.embeddings import (
|
from chat.services.embeddings import (
|
||||||
|
DEFAULT_EMBEDDING_MODEL,
|
||||||
FALLBACK_EMBEDDING_MODEL,
|
FALLBACK_EMBEDDING_MODEL,
|
||||||
generate_embedding,
|
generate_embedding,
|
||||||
)
|
)
|
||||||
@@ -34,6 +48,24 @@ import chat.state.memory # noqa: F401
|
|||||||
import chat.state.world # noqa: F401
|
import chat.state.world # noqa: F401
|
||||||
|
|
||||||
|
|
||||||
|
def _build_client(settings: Settings):
|
||||||
|
"""Construct an LLMClient for the backfill run.
|
||||||
|
|
||||||
|
Default-model runs (the pseudo path) don't need a client, so we
|
||||||
|
return ``None`` and ``generate_embedding`` skips the call. Non-default
|
||||||
|
models route through the real client; injectable via monkeypatch in
|
||||||
|
tests.
|
||||||
|
"""
|
||||||
|
if settings.embedding_model == DEFAULT_EMBEDDING_MODEL:
|
||||||
|
return None
|
||||||
|
from chat.llm.featherless import FeatherlessClient
|
||||||
|
|
||||||
|
return FeatherlessClient(
|
||||||
|
api_key=settings.featherless_api_key,
|
||||||
|
base_url=settings.featherless_base_url,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
async def main() -> None:
|
async def main() -> None:
|
||||||
parser = argparse.ArgumentParser(description=__doc__)
|
parser = argparse.ArgumentParser(description=__doc__)
|
||||||
parser.add_argument(
|
parser.add_argument(
|
||||||
@@ -47,23 +79,51 @@ async def main() -> None:
|
|||||||
action="store_true",
|
action="store_true",
|
||||||
help="Print the count of memories needing embeddings, then exit.",
|
help="Print the count of memories needing embeddings, then exit.",
|
||||||
)
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--re-embed-all",
|
||||||
|
action="store_true",
|
||||||
|
help=(
|
||||||
|
"Walk every memory (not just those without an embeddings row) "
|
||||||
|
"and re-emit embedding_indexed events. Use this when swapping "
|
||||||
|
"embedding models so the existing rows get replaced."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--model",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help=(
|
||||||
|
"Embedding model identifier. Overrides Settings.embedding_model "
|
||||||
|
"for this run; default uses the configured model."
|
||||||
|
),
|
||||||
|
)
|
||||||
args = parser.parse_args()
|
args = parser.parse_args()
|
||||||
|
|
||||||
settings = load_settings()
|
settings = load_settings()
|
||||||
settings.db_path.parent.mkdir(parents=True, exist_ok=True)
|
settings.db_path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
apply_migrations(settings.db_path)
|
apply_migrations(settings.db_path)
|
||||||
|
|
||||||
|
model = args.model or settings.embedding_model
|
||||||
|
# Override the settings instance so ``_build_client`` sees the
|
||||||
|
# effective model when deciding whether to construct a real client.
|
||||||
|
settings = settings.model_copy(update={"embedding_model": model})
|
||||||
|
client = _build_client(settings)
|
||||||
|
|
||||||
with open_db(settings.db_path) as conn:
|
with open_db(settings.db_path) as conn:
|
||||||
sql = (
|
if args.re_embed_all:
|
||||||
"SELECT m.id, m.pov_summary FROM memories m "
|
sql = "SELECT m.id, m.pov_summary FROM memories m ORDER BY m.id"
|
||||||
"LEFT JOIN embeddings e ON e.memory_id = m.id "
|
else:
|
||||||
"WHERE e.memory_id IS NULL "
|
sql = (
|
||||||
"ORDER BY m.id"
|
"SELECT m.id, m.pov_summary FROM memories m "
|
||||||
)
|
"LEFT JOIN embeddings e ON e.memory_id = m.id "
|
||||||
|
"WHERE e.memory_id IS NULL "
|
||||||
|
"ORDER BY m.id"
|
||||||
|
)
|
||||||
if args.limit is not None:
|
if args.limit is not None:
|
||||||
sql += f" LIMIT {int(args.limit)}"
|
sql += f" LIMIT {int(args.limit)}"
|
||||||
rows = conn.execute(sql).fetchall()
|
rows = conn.execute(sql).fetchall()
|
||||||
print(f"Found {len(rows)} memories needing embeddings.")
|
mode = "re-embedding" if args.re_embed_all else "needing embeddings"
|
||||||
|
print(f"Found {len(rows)} memories {mode} (model={model}).")
|
||||||
if args.dry_run:
|
if args.dry_run:
|
||||||
return
|
return
|
||||||
|
|
||||||
@@ -71,11 +131,12 @@ async def main() -> None:
|
|||||||
skipped = 0
|
skipped = 0
|
||||||
for memory_id, text in rows:
|
for memory_id, text in rows:
|
||||||
result = await generate_embedding(
|
result = await generate_embedding(
|
||||||
client=None, # pseudo path: no client needed
|
client=client,
|
||||||
text=text or "",
|
text=text or "",
|
||||||
|
model=model,
|
||||||
)
|
)
|
||||||
if result.model == FALLBACK_EMBEDDING_MODEL:
|
if result.model == FALLBACK_EMBEDDING_MODEL:
|
||||||
print(f" Skipping memory_id={memory_id} (empty text)")
|
print(f" Skipping memory_id={memory_id} (empty text or fallback)")
|
||||||
skipped += 1
|
skipped += 1
|
||||||
continue
|
continue
|
||||||
append_and_apply(
|
append_and_apply(
|
||||||
|
|||||||
@@ -0,0 +1,38 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# Start the local mlx-omni-server that serves the classifier + embedding
|
||||||
|
# models. The chat app's RoutedLLMClient routes everything except the
|
||||||
|
# narrative model to this server; with no MLX server running, classifier
|
||||||
|
# calls fail and embeddings degrade to the zero-vector fallback.
|
||||||
|
#
|
||||||
|
# Run in the foreground:
|
||||||
|
# ./scripts/start_mlx_server.sh
|
||||||
|
# Run as a background daemon (logs to data/mlx-server.log):
|
||||||
|
# ./scripts/start_mlx_server.sh --daemon
|
||||||
|
#
|
||||||
|
# Models are pulled from Hugging Face on first request; expect a delay
|
||||||
|
# the first time you exercise the classifier or embedding path.
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
|
||||||
|
VENV="${REPO_ROOT}/.mlx-venv"
|
||||||
|
LOG="${REPO_ROOT}/data/mlx-server.log"
|
||||||
|
PORT="${MLX_PORT:-10240}"
|
||||||
|
HOST="${MLX_HOST:-127.0.0.1}"
|
||||||
|
|
||||||
|
if [ ! -x "${VENV}/bin/mlx-omni-server" ]; then
|
||||||
|
echo "error: mlx-omni-server not installed in ${VENV}" >&2
|
||||||
|
echo "create the venv with:" >&2
|
||||||
|
echo " python3.12 -m venv ${VENV} && ${VENV}/bin/pip install mlx-omni-server" >&2
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ "${1:-}" = "--daemon" ]; then
|
||||||
|
mkdir -p "$(dirname "${LOG}")"
|
||||||
|
nohup "${VENV}/bin/mlx-omni-server" --host "${HOST}" --port "${PORT}" \
|
||||||
|
>>"${LOG}" 2>&1 &
|
||||||
|
echo "mlx-omni-server started in background (pid $!)"
|
||||||
|
echo "logs: ${LOG}"
|
||||||
|
else
|
||||||
|
exec "${VENV}/bin/mlx-omni-server" --host "${HOST}" --port "${PORT}"
|
||||||
|
fi
|
||||||
@@ -0,0 +1,383 @@
|
|||||||
|
"""Structured test-fixture builder for ``MockLLMClient`` canned queues.
|
||||||
|
|
||||||
|
Phase 4.5 (T116) carry-over from Phase 3. The turn-flow tests in
|
||||||
|
``test_turn_flow.py``, ``test_meanwhile_turn_flow.py``,
|
||||||
|
``test_phase3_integration.py``, and ``test_phase4_integration.py`` used
|
||||||
|
to construct ``MockLLMClient`` canned-response queues as raw positional
|
||||||
|
lists of pre-encoded JSON strings. That worked, but every time a new
|
||||||
|
classifier call landed in a code path the tests had to be patched in
|
||||||
|
many places at the right index — easy to mis-position, hard to read.
|
||||||
|
|
||||||
|
This module ships :class:`CannedQueue`, a fluent builder that lets a
|
||||||
|
test declare its classifier expectations by **name** and **order** of
|
||||||
|
call, not by index into a brittle list. Each method appends one item
|
||||||
|
to the queue and returns ``self`` for chaining; ``build()`` JSON-encodes
|
||||||
|
the items and produces the flat ``list[str]`` that
|
||||||
|
``MockLLMClient(canned=...)`` expects.
|
||||||
|
|
||||||
|
Usage
|
||||||
|
-----
|
||||||
|
|
||||||
|
>>> from tests.fixtures import CannedQueue
|
||||||
|
>>> from chat.llm.mock import MockLLMClient
|
||||||
|
>>> canned = (
|
||||||
|
... CannedQueue()
|
||||||
|
... .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
|
||||||
|
... .narrative("Hi there.")
|
||||||
|
... .state_update()
|
||||||
|
... .state_update()
|
||||||
|
... .build()
|
||||||
|
... )
|
||||||
|
>>> mock = MockLLMClient(canned=canned)
|
||||||
|
|
||||||
|
Each method maps to a single classifier (or stream) call that the turn
|
||||||
|
flow makes, in the order the production code makes them. Picking the
|
||||||
|
right method for the slot you need keeps the test readable and lets the
|
||||||
|
builder pin sensible defaults for the fields tests don't care about.
|
||||||
|
|
||||||
|
Migration template
|
||||||
|
------------------
|
||||||
|
|
||||||
|
To migrate a positional canned-array test:
|
||||||
|
|
||||||
|
1. Identify each slot in the existing array and what classifier it
|
||||||
|
feeds. Comments above the array often spell this out — start there.
|
||||||
|
2. Replace each slot with the matching :class:`CannedQueue` method:
|
||||||
|
|
||||||
|
- ``json.dumps({"segments": [...]})`` → ``.parse_turn(segments=...)``
|
||||||
|
- bare narrative string → ``.narrative("...")``
|
||||||
|
- zero-state JSON → ``.state_update()`` (defaults are zeros)
|
||||||
|
- ``json.dumps({"addressee_id": ...})`` → ``.detect_addressee(...)``
|
||||||
|
- ``json.dumps({"should_interject": ...})`` → ``.detect_interjection(...)``
|
||||||
|
- ``json.dumps({"should_close": ...})`` → ``.detect_scene_close(...)``
|
||||||
|
- ``json.dumps({"transitions": [...]})`` → ``.detect_event_transitions(...)``
|
||||||
|
- per-POV summary JSON → ``.summarize_scene_pov(summary=...)``
|
||||||
|
3. End with ``.build()`` and pass that to
|
||||||
|
``MockLLMClient(canned=...)``. The mock's contract is unchanged.
|
||||||
|
|
||||||
|
Notes on streams
|
||||||
|
----------------
|
||||||
|
|
||||||
|
``MockLLMClient.stream`` and ``MockLLMClient.generate`` share one queue
|
||||||
|
— each pop is one entry, regardless of whether the production code
|
||||||
|
streams the response or generates it whole. The narrative service
|
||||||
|
streams; classifier services generate. The builder treats both the same:
|
||||||
|
``narrative()`` appends a raw string, the classifier methods append
|
||||||
|
JSON-encoded dicts. Both end up in the same flat ``list[str]`` that the
|
||||||
|
mock pops from in order.
|
||||||
|
|
||||||
|
The remaining tests in the suite (about 30 across the four files
|
||||||
|
mentioned above) still use positional arrays — Phase 5 work to migrate
|
||||||
|
the rest. New tests should prefer this builder.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
|
||||||
|
class CannedQueue:
|
||||||
|
"""Fluent builder for ``MockLLMClient`` canned-response queues.
|
||||||
|
|
||||||
|
Each method appends one item to an internal queue and returns
|
||||||
|
``self`` for chaining. ``build()`` returns the flat ``list[str]``
|
||||||
|
suitable for ``MockLLMClient(canned=...)``.
|
||||||
|
|
||||||
|
The queue holds either ``dict`` (JSON-encoded at ``build()`` time)
|
||||||
|
or ``str`` (passed through verbatim — used for narrative streams).
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self) -> None:
|
||||||
|
self._queue: list[Any] = []
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Narrative stream — bare string, no JSON wrapping.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def narrative(self, text: str) -> "CannedQueue":
|
||||||
|
"""Append one streaming narrative response.
|
||||||
|
|
||||||
|
``MockLLMClient.stream`` pops the next entry from the same queue
|
||||||
|
as ``generate`` — a bare string is what the streaming bot beat
|
||||||
|
consumes. Use one ``narrative()`` per assistant beat (primary,
|
||||||
|
and optionally an interjection / second beat).
|
||||||
|
"""
|
||||||
|
self._queue.append(text)
|
||||||
|
return self
|
||||||
|
|
||||||
|
def raw(self, value: str) -> "CannedQueue":
|
||||||
|
"""Append a raw string (escape hatch for non-classifier calls).
|
||||||
|
|
||||||
|
Most tests should reach for the named helpers — this is here
|
||||||
|
for one-offs the builder doesn't model yet.
|
||||||
|
"""
|
||||||
|
self._queue.append(value)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Turn parser — splits user prose into segments.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def parse_turn(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
segments: list[dict] | None = None,
|
||||||
|
intent: str = "narrative",
|
||||||
|
landing_state_hint: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``parse_turn`` classifier response.
|
||||||
|
|
||||||
|
``intent`` defaults to ``"narrative"``; pass ``"skip_elision"``
|
||||||
|
or ``"skip_jump"`` to exercise the natural-language skip paths.
|
||||||
|
``landing_state_hint`` carries the residual descriptor for
|
||||||
|
elision skips and is otherwise ignored.
|
||||||
|
"""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"segments": segments if segments is not None else [],
|
||||||
|
"intent": intent,
|
||||||
|
"landing_state_hint": landing_state_hint,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Multi-entity addressee classifier (T74.1).
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def detect_addressee(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
addressee_id: str,
|
||||||
|
confidence: str = "medium",
|
||||||
|
reason: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``detect_addressee`` classifier response."""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"addressee_id": addressee_id,
|
||||||
|
"confidence": confidence,
|
||||||
|
"reason": reason,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# State-update — one per directed edge per turn.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def state_update(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
affinity_delta: int = 0,
|
||||||
|
trust_delta: int = 0,
|
||||||
|
knowledge_facts: list | None = None,
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``apply_state_update`` classifier response.
|
||||||
|
|
||||||
|
Defaults to a benign zero-delta payload — tests that don't care
|
||||||
|
about state mutations can call this without arguments. One call
|
||||||
|
is required per directed edge that fires after the assistant
|
||||||
|
beat (e.g. single-bot non-guest turn = 2 calls; multi-bot guest
|
||||||
|
turn = 6 calls).
|
||||||
|
"""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"affinity_delta": affinity_delta,
|
||||||
|
"trust_delta": trust_delta,
|
||||||
|
"knowledge_facts": (
|
||||||
|
knowledge_facts if knowledge_facts is not None else []
|
||||||
|
),
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
def zero_state(self) -> "CannedQueue":
|
||||||
|
"""Alias for ``state_update()`` with all defaults — matches the
|
||||||
|
``_zero_state()`` helper in existing tests.
|
||||||
|
"""
|
||||||
|
return self.state_update()
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Interjection (T74.2) — silent witness chimes in.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def detect_interjection(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
should_interject: bool,
|
||||||
|
reason: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``detect_interjection`` classifier response."""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"should_interject": should_interject,
|
||||||
|
"reason": reason,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
def detect_interjection_targeted(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
targeted: bool,
|
||||||
|
target_id: str | None = None,
|
||||||
|
reason: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one targeted-interjection classifier response."""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"targeted": targeted,
|
||||||
|
"target_id": target_id,
|
||||||
|
"reason": reason,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Scene-close detector (T26).
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def detect_scene_close(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
should_close: bool,
|
||||||
|
reason: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``detect_scene_close`` classifier response."""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"should_close": should_close,
|
||||||
|
"reason": reason,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Event lifecycle (T52, T61) — per-turn transitions.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def detect_event_transitions(
|
||||||
|
self,
|
||||||
|
transitions: list[dict] | None = None,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``detect_event_transitions`` classifier response.
|
||||||
|
|
||||||
|
``transitions`` is a list of ``{"event_id": ..., "new_status":
|
||||||
|
"active"|"completed"|"cancelled", "reason": ...}`` dicts. Pass
|
||||||
|
an empty list or ``None`` (or omit the argument) to model a call
|
||||||
|
that ran but produced no transitions; all three enqueue the same
|
||||||
|
``{"transitions": []}`` payload.
|
||||||
|
|
||||||
|
Note: when no events are seeded, ``detect_event_transitions``
|
||||||
|
short-circuits without an LLM call — in that case do NOT append
|
||||||
|
this slot.
|
||||||
|
"""
|
||||||
|
payload = {"transitions": transitions if transitions is not None else []}
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Per-POV scene summary (used after scene close).
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def summarize_scene_pov(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
summary: str,
|
||||||
|
knowledge_facts: list | None = None,
|
||||||
|
relationship_summary: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one per-POV scene-summary response.
|
||||||
|
|
||||||
|
Used by ``apply_scene_close_summary`` — one call per witness
|
||||||
|
once a scene closes.
|
||||||
|
"""
|
||||||
|
payload: dict[str, Any] = {
|
||||||
|
"summary": summary,
|
||||||
|
"knowledge_facts": (
|
||||||
|
knowledge_facts if knowledge_facts is not None else []
|
||||||
|
),
|
||||||
|
"relationship_summary": relationship_summary,
|
||||||
|
}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Thread detection (Phase 3 §3.3).
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def detect_threads(
|
||||||
|
self,
|
||||||
|
candidates: list[dict] | None = None,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one ``detect_threads`` classifier response.
|
||||||
|
|
||||||
|
``candidates`` is a list of ``{"action": "open"|"update",
|
||||||
|
"title": ..., "summary": ..., "existing_thread_id": ...}`` dicts.
|
||||||
|
"""
|
||||||
|
payload = {"candidates": candidates if candidates is not None else []}
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Meanwhile digest — narrative summary of what happened off-screen.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def meanwhile_digest(self, summary: str) -> "CannedQueue":
|
||||||
|
"""Append one meanwhile-digest narrative response.
|
||||||
|
|
||||||
|
The digest service streams the digest as plain text (not JSON)
|
||||||
|
so this is a thin wrapper over ``narrative``/``raw`` for
|
||||||
|
readability at the call site.
|
||||||
|
"""
|
||||||
|
self._queue.append(summary)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Significance scorer (background worker; rarely hit in unit tests
|
||||||
|
# but available for completeness).
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def score_significance(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
score: float = 0.0,
|
||||||
|
reason: str = "",
|
||||||
|
**rest: Any,
|
||||||
|
) -> "CannedQueue":
|
||||||
|
"""Append one significance-scoring classifier response."""
|
||||||
|
payload: dict[str, Any] = {"score": score, "reason": reason}
|
||||||
|
payload.update(rest)
|
||||||
|
self._queue.append(payload)
|
||||||
|
return self
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Build / introspection.
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def build(self) -> list[str]:
|
||||||
|
"""Return the flat ``list[str]`` queue for ``MockLLMClient``.
|
||||||
|
|
||||||
|
Dict items are JSON-encoded; string items are passed through
|
||||||
|
verbatim (so streaming responses retain their raw form).
|
||||||
|
"""
|
||||||
|
out: list[str] = []
|
||||||
|
for item in self._queue:
|
||||||
|
if isinstance(item, str):
|
||||||
|
out.append(item)
|
||||||
|
else:
|
||||||
|
out.append(json.dumps(item))
|
||||||
|
return out
|
||||||
|
|
||||||
|
def __len__(self) -> int:
|
||||||
|
return len(self._queue)
|
||||||
@@ -0,0 +1,231 @@
|
|||||||
|
"""Tests for the backfill_embeddings script (T112, Phase 4.5).
|
||||||
|
|
||||||
|
Phase 4 shipped a backfill that walked memories *without* an embedding
|
||||||
|
row and produced a vector for each (deterministic pseudo path). T112
|
||||||
|
adds a ``--re-embed-all`` flag that walks **every** memory regardless
|
||||||
|
of whether it already has an embeddings row, so operators can swap
|
||||||
|
embedding models and have the existing rows replaced (the
|
||||||
|
``embedding_indexed`` projector is INSERT OR REPLACE).
|
||||||
|
|
||||||
|
These tests exercise the script's ``main()`` directly via asyncio —
|
||||||
|
shell-out via subprocess would also work but importing keeps the
|
||||||
|
fixture surface small and the failure mode clearer.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from unittest.mock import patch
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from chat.db.connection import open_db
|
||||||
|
from chat.db.migrate import apply_migrations
|
||||||
|
from chat.eventlog.log import append_and_apply, append_event
|
||||||
|
from chat.eventlog.projector import project
|
||||||
|
from chat.services.embeddings import DEFAULT_EMBEDDING_MODEL
|
||||||
|
|
||||||
|
# Trigger handler registration for projection.
|
||||||
|
import chat.state.embeddings # noqa: F401
|
||||||
|
import chat.state.entities # noqa: F401
|
||||||
|
import chat.state.memory # noqa: F401
|
||||||
|
import chat.state.world # noqa: F401
|
||||||
|
|
||||||
|
import scripts.backfill_embeddings as backfill
|
||||||
|
|
||||||
|
|
||||||
|
+def _seed(db_path: Path, count: int) -> list[int]:
+    """Seed ``count`` memory rows for ``bot_a``; return their ids."""
+    with open_db(db_path) as conn:
+        append_event(
+            conn,
+            kind="bot_authored",
+            payload={
+                "id": "bot_a",
+                "name": "BotA",
+                "persona": "...",
+                "voice_samples": [],
+                "traits": [],
+                "backstory": "",
+                "initial_relationship_to_you": "",
+                "kickoff_prose": "",
+            },
+        )
+        append_event(
+            conn,
+            kind="chat_created",
+            payload={
+                "id": "chat_bot_a",
+                "host_bot_id": "bot_a",
+                "initial_time": "2026-04-26T20:00:00+00:00",
+                "narrative_anchor": "Day 1",
+                "weather": "",
+            },
+        )
+        for i in range(count):
+            append_event(
+                conn,
+                kind="memory_written",
+                payload={
+                    "owner_id": "bot_a",
+                    "chat_id": "chat_bot_a",
+                    "pov_summary": f"memory text {i}",
+                    "witness_you": 1,
+                    "witness_host": 1,
+                    "witness_guest": 0,
+                    "source": "direct",
+                    "reliability": 1.0,
+                    "significance": 1,
+                    "pinned": 0,
+                    "auto_pinned": 0,
+                },
+            )
+        project(conn)
+        return [
+            r[0]
+            for r in conn.execute(
+                "SELECT id FROM memories WHERE owner_id = 'bot_a' ORDER BY id"
+            ).fetchall()
+        ]
+
+
+def _seed_embedding(db_path: Path, memory_id: int, model: str = "stale-model") -> None:
+    """Insert a stale ``embedding_indexed`` event so the row already
+    exists in ``embeddings`` (and the default backfill would skip it)."""
+    with open_db(db_path) as conn:
+        append_and_apply(
+            conn,
+            kind="embedding_indexed",
+            payload={
+                "memory_id": memory_id,
+                "model": model,
+                "dim": 3,
+                "vector": [0.0, 0.0, 0.0],
+            },
+        )
+
+
+@pytest.mark.asyncio
+async def test_re_embed_all_walks_every_memory(tmp_path, monkeypatch, capsys):
+    """``--re-embed-all`` re-embeds memories that already have rows in
+    ``embeddings`` (default mode skips them). After the run, every
+    memory should have an updated embedding tagged with the configured
+    model (the projector replaces stale rows in place)."""
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    memory_ids = _seed(db, count=3)
+    # Pre-seed stale embeddings on two of the three memories so the
+    # default path would skip them and only ``--re-embed-all`` covers
+    # everything.
+    _seed_embedding(db, memory_ids[0])
+    _seed_embedding(db, memory_ids[1])
+
+    cfg = tmp_path / "config.toml"
+    cfg.write_text(
+        f'featherless_api_key = "x"\n'
+        f'db_path = "{db}"\n'
+        f'data_dir = "{tmp_path}"\n'
+    )
+    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
+    monkeypatch.setenv("CHAT_DB_PATH", str(db))
+
+    with patch("sys.argv", ["backfill_embeddings.py", "--re-embed-all"]):
+        await backfill.main()
+
+    # All three memories now have a fresh embedding tagged with the
+    # default pseudo model (replacing the stale rows).
+    with open_db(db) as conn:
+        rows = conn.execute(
+            "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
+        ).fetchall()
+        assert len(rows) == 3
+        for mid, model in rows:
+            assert mid in memory_ids
+            assert model == DEFAULT_EMBEDDING_MODEL
+
+
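The tests in this file drive an async `main()` by patching `sys.argv` rather than shelling out. A minimal, self-contained sketch of that pattern follows. The `main` here is a hypothetical stand-in, not the real `scripts.backfill_embeddings.main`, and the flag defaults are assumptions taken from the values the tests mention.

```python
import argparse
import asyncio
from unittest.mock import patch


async def main() -> dict:
    """Hypothetical stand-in for an async CLI entry point."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--re-embed-all", action="store_true")
    # Assumed default; the tests above only tell us the default model tag.
    parser.add_argument("--model", default="pseudo-sha256-384")
    # parse_args() with no arguments reads sys.argv[1:] at call time,
    # so patching sys.argv is enough to steer it.
    args = parser.parse_args()
    return {"re_embed_all": args.re_embed_all, "model": args.model}


def run_with_argv(argv: list[str]) -> dict:
    """Run main() with sys.argv temporarily replaced."""
    with patch("sys.argv", argv):
        return asyncio.run(main())


default_run = run_with_argv(["backfill_embeddings.py"])
full_run = run_with_argv(
    ["backfill_embeddings.py", "--re-embed-all", "--model", "bge-small-en-v1.5"]
)
```

Importing the script and awaiting `main()` keeps failures inside the test process, which is the trade-off the module docstring describes.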
+@pytest.mark.asyncio
+async def test_default_backfill_only_walks_missing(tmp_path, monkeypatch):
+    """Without ``--re-embed-all``, the script keeps the Phase 4
+    behavior — memories with an existing embedding row are left
+    alone (their stale-model tag survives)."""
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    memory_ids = _seed(db, count=2)
+    _seed_embedding(db, memory_ids[0], model="stale-model")
+    # memory_ids[1] has no embedding yet.
+
+    cfg = tmp_path / "config.toml"
+    cfg.write_text(
+        f'featherless_api_key = "x"\n'
+        f'db_path = "{db}"\n'
+        f'data_dir = "{tmp_path}"\n'
+    )
+    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
+    monkeypatch.setenv("CHAT_DB_PATH", str(db))
+
+    with patch("sys.argv", ["backfill_embeddings.py"]):
+        await backfill.main()
+
+    with open_db(db) as conn:
+        rows = dict(
+            conn.execute(
+                "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
+            ).fetchall()
+        )
+        # Stale row preserved; only the missing one was filled.
+        assert rows[memory_ids[0]] == "stale-model"
+        assert rows[memory_ids[1]] == DEFAULT_EMBEDDING_MODEL
+
+
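The two backfill modes tested above differ only in which memories get selected. The sketch below shows one way that selection could look, using an in-memory SQLite database. The two-column schema and the helper names `memories_to_embed` and `index_embedding` are illustrative assumptions, not the project's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT);
    CREATE TABLE embeddings (memory_id INTEGER PRIMARY KEY, model TEXT);
    INSERT INTO memories (id, pov_summary) VALUES (1, 'a'), (2, 'b');
    INSERT INTO embeddings (memory_id, model) VALUES (1, 'stale-model');
    """
)


def memories_to_embed(conn: sqlite3.Connection, re_embed_all: bool) -> list[int]:
    """Default mode: only memories with no embedding row. Re-embed mode: all."""
    if re_embed_all:
        sql = "SELECT id FROM memories ORDER BY id"
    else:
        sql = (
            "SELECT m.id FROM memories m "
            "LEFT JOIN embeddings e ON e.memory_id = m.id "
            "WHERE e.memory_id IS NULL ORDER BY m.id"
        )
    return [row[0] for row in conn.execute(sql)]


def index_embedding(conn: sqlite3.Connection, memory_id: int, model: str) -> None:
    """INSERT OR REPLACE swaps a stale row for the new one in place."""
    conn.execute(
        "INSERT OR REPLACE INTO embeddings (memory_id, model) VALUES (?, ?)",
        (memory_id, model),
    )


missing_only = memories_to_embed(conn, re_embed_all=False)
everything = memories_to_embed(conn, re_embed_all=True)
for memory_id in everything:
    index_embedding(conn, memory_id, "pseudo-sha256-384")
models = dict(conn.execute("SELECT memory_id, model FROM embeddings"))
```

Because the write is a replace keyed on `memory_id`, re-running the full pass never duplicates rows; it only updates the model tag.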
+@pytest.mark.asyncio
+async def test_re_embed_all_respects_model_arg(tmp_path, monkeypatch):
+    """The ``--model`` flag overrides ``Settings.embedding_model``.
+    With a non-default model and a client that returns canned vectors,
+    every memory is re-embedded with the supplied model tag."""
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    memory_ids = _seed(db, count=2)
+    _seed_embedding(db, memory_ids[0])
+
+    cfg = tmp_path / "config.toml"
+    cfg.write_text(
+        f'featherless_api_key = "x"\n'
+        f'db_path = "{db}"\n'
+        f'data_dir = "{tmp_path}"\n'
+    )
+    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
+    monkeypatch.setenv("CHAT_DB_PATH", str(db))
+
+    # Patch the client factory the script uses to produce a Mock with
+    # canned embeddings — one per memory.
+    from chat.llm.mock import MockLLMClient
+
+    canned_vec = [0.1] * 384
+
+    def _factory(_settings):
+        return MockLLMClient(
+            canned=[],
+            canned_embeddings=[list(canned_vec) for _ in memory_ids],
+        )
+
+    monkeypatch.setattr(backfill, "_build_client", _factory)
+
+    with patch(
+        "sys.argv",
+        [
+            "backfill_embeddings.py",
+            "--re-embed-all",
+            "--model",
+            "bge-small-en-v1.5",
+        ],
+    ):
+        await backfill.main()
+
+    with open_db(db) as conn:
+        rows = conn.execute(
+            "SELECT memory_id, model FROM embeddings ORDER BY memory_id"
+        ).fetchall()
+        assert len(rows) == 2
+        for _, model in rows:
+            assert model == "bge-small-en-v1.5"
@@ -1,11 +1,19 @@
 from __future__ import annotations
 
+import logging
+
 from chat.db.connection import open_db
 from chat.db.migrate import apply_migrations
 from chat.eventlog.log import append_event
 from chat.eventlog.projector import project
 import chat.state.branches  # registers handlers
-from chat.state.branches import active_branch, get_branch, list_branches
+from chat.state.branches import (
+    _NO_HEAD_CLAMP,
+    active_branch,
+    active_branch_event_ids,
+    get_branch,
+    list_branches,
+)
 
 
 def test_main_branch_bootstrapped_by_migration(tmp_path):
@@ -139,3 +147,116 @@ def test_list_branches_returns_all(tmp_path):
|
|||||||
names = [b["name"] for b in list_branches(conn)]
|
names = [b["name"] for b in list_branches(conn)]
|
||||||
assert "main" in names
|
assert "main" in names
|
||||||
assert "experiment" in names
|
assert "experiment" in names
|
+
+
+def test_branch_switched_unknown_name_warns(tmp_path, caplog):
+    """Switching to a nonexistent branch logs a warning and leaves no branch active.
+
+    The previous behavior silently cleared is_active flags and applied no UPDATE
+    when the named branch did not exist. T103 makes that condition observable
+    by emitting a warning while preserving the existing (zero-active) outcome.
+    """
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        with caplog.at_level(logging.WARNING, logger="chat.state.branches"):
+            append_event(
+                conn,
+                kind="branch_switched",
+                payload={"name": "does_not_exist"},
+            )
+            project(conn)
+
+        # A warning was emitted naming the missing branch.
+        warnings = [
+            r for r in caplog.records
+            if r.levelno == logging.WARNING and r.name == "chat.state.branches"
+        ]
+        assert warnings, "expected a warning for unknown branch name"
+        assert any("does_not_exist" in r.getMessage() for r in warnings)
+
+        # Existing behavior preserved: no branch is active after the switch.
+        assert active_branch(conn) is None
+
+        # The unknown name was not inserted as a side effect.
+        assert get_branch(conn, "does_not_exist") is None
+
+
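The behavior pinned by the unknown-branch test (clear every active flag, warn, activate nothing) can be illustrated without a database. This is a rough sketch only: the dict-based branch store, the `switch_branch` helper, and the logger name are all hypothetical, and the real handler in `chat.state.branches` operates on SQL rows.

```python
import logging

logger = logging.getLogger("sketch.branches")  # hypothetical logger name


def switch_branch(branches: dict[str, bool], name: str) -> None:
    """Clear all active flags, then activate ``name`` if it exists."""
    for key in branches:
        branches[key] = False
    if name not in branches:
        # Make the previously silent condition observable.
        logger.warning("branch_switched: unknown branch %r; no branch is active", name)
        return
    branches[name] = True


class ListHandler(logging.Handler):
    """Collects records so they can be inspected, much like pytest's caplog."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.WARNING)

branches = {"main": True}
switch_branch(branches, "does_not_exist")
messages = [record.getMessage() for record in handler.records]
```

The warning changes nothing about the outcome; it only leaves a trace that the zero-active state was reached.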
+def test_active_branch_event_ids_bootstrap_main_returns_no_clamp(tmp_path):
+    """Bootstrap "main" (origin=0, head=0) reads as the no-clamp sentinel.
+
+    Migration 0013 seeds main with both event-id columns at 0; production
+    today never emits ``branch_head_updated`` for main, so head stays at 0
+    even as events accumulate. The helper treats this exact bootstrap
+    state as "all events visible" (lower bound 0, upper bound BIG_INT) so
+    every existing reader stays branch-agnostic until a non-main branch
+    becomes active.
+    """
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        origin, head = active_branch_event_ids(conn)
+        assert origin == 0
+        assert head == _NO_HEAD_CLAMP
+
+
+def test_active_branch_event_ids_no_active_branch_falls_through(tmp_path):
+    """No active branch row at all → defensive ``(0, BIG_INT)``.
+
+    A switch to an unknown branch leaves zero rows with ``is_active=1``;
+    ``active_branch`` returns None. The helper must still hand readers a
+    workable range (the full log) so the read pipeline doesn't crash on
+    an inconsistent metadata state.
+    """
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        # Switching to a nonexistent branch clears is_active flags
+        # without setting any other branch active.
+        append_event(
+            conn,
+            kind="branch_switched",
+            payload={"name": "does_not_exist"},
+        )
+        project(conn)
+        assert active_branch(conn) is None
+
+        origin, head = active_branch_event_ids(conn)
+        assert origin == 0
+        assert head == _NO_HEAD_CLAMP
+
+
+def test_active_branch_event_ids_returns_actual_range_for_non_main(tmp_path):
+    """Non-main branches return their literal ``(origin, head)`` window.
+
+    A branch created at origin=10 + bumped to head=20 must surface as
+    (10, 20) so readers' ``BETWEEN`` clamp scopes to that window.
+    """
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        append_event(
+            conn,
+            kind="branch_created",
+            payload={
+                "name": "experiment",
+                "origin_event_id": 10,
+                "head_event_id": 10,
+                "chat_id": "c1",
+            },
+        )
+        append_event(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "experiment", "head_event_id": 20},
+        )
+        append_event(
+            conn,
+            kind="branch_switched",
+            payload={"name": "experiment"},
+        )
+        project(conn)
+
+        origin, head = active_branch_event_ids(conn)
+        assert origin == 10
+        assert head == 20
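The three `active_branch_event_ids` tests describe a small decision table: no active branch and bootstrap main both read as "no clamp", while any other branch returns its literal window. A minimal sketch of that table as a pure function is shown here. The function name `event_id_range`, the dict shape, and the value chosen for `NO_HEAD_CLAMP` are assumptions; the tests only say the upper bound is some large integer.

```python
from typing import Optional, Tuple

# Assumed value; the real _NO_HEAD_CLAMP constant is not shown in this diff.
NO_HEAD_CLAMP = 2**63 - 1


def event_id_range(branch: Optional[dict]) -> Tuple[int, int]:
    """Return the (origin, head) window readers should clamp to."""
    if branch is None:
        # No active branch at all: defensive default, the full log.
        return (0, NO_HEAD_CLAMP)
    origin = branch["origin_event_id"]
    head = branch["head_event_id"]
    if branch["name"] == "main" and origin == 0 and head == 0:
        # Bootstrap main never gets its head bumped, so treat it as unclamped.
        return (0, NO_HEAD_CLAMP)
    return (origin, head)


bootstrap = event_id_range({"name": "main", "origin_event_id": 0, "head_event_id": 0})
inactive = event_id_range(None)
experiment = event_id_range(
    {"name": "experiment", "origin_event_id": 10, "head_event_id": 20}
)
stub = event_id_range({"name": "stub", "origin_event_id": 0, "head_event_id": 0})
```

Note that the sentinel is keyed on the name as well as the zeros, so a non-main branch parked at `(0, 0)` keeps its literal, empty window.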
@@ -129,3 +129,279 @@ def test_list_branches_with_metadata_includes_event_count(tmp_path):
|
|||||||
assert rows["exp"]["origin_event_id"] == 10
|
assert rows["exp"]["origin_event_id"] == 10
|
||||||
assert rows["exp"]["head_event_id"] == 15
|
assert rows["exp"]["head_event_id"] == 15
|
||||||
assert rows["exp"]["event_count"] == 6
|
assert rows["exp"]["event_count"] == 6
|
+
+
+# ---------------------------------------------------------------------------
+# T113 read-side filter — cross-feature tests.
+# ---------------------------------------------------------------------------
+#
+# These exercise the active-branch event-id clamp through every reader
+# the spec called out: ``read_recent_dialogue`` (turn_common),
+# ``_read_recent_dialogue`` (scene_summarize), and ``search_memories``
+# (memory). They drive the readers via real event-log inserts + branch
+# switches so the integration is end-to-end.
+
+
+def _seed_user_turn(conn, chat_id: str, prose: str) -> int:
+    return append_and_apply(
+        conn,
+        kind="user_turn",
+        payload={"chat_id": chat_id, "prose": prose, "segments": []},
+    )
+
+
+def test_read_recent_dialogue_respects_active_branch_head(tmp_path):
+    """T113 spec test 1: dialogue reader clamps to active branch head.
+
+    Seed 10 user turns; create a branch with origin=1 + head=5 and switch
+    to it; assert ``read_recent_dialogue`` only returns the first 5
+    turns. (The 5 events with id 6..10 fall outside ``[1, 5]``.)
+    """
+    from chat.services.turn_common import read_recent_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(10)]
+        # 5 events visible after the switch.
+        branch_from_event(
+            conn, name="halfway", origin_event_id=ids[0], chat_id="c1"
+        )
+        append_and_apply(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "halfway", "head_event_id": ids[4]},
+        )
+        switch_active_branch(conn, name="halfway")
+
+        rows = read_recent_dialogue(conn, "c1")
+        # The reader returns oldest-first, so the visible-set is the
+        # first 5 turns.
+        assert len(rows) == 5
+        assert [r["text"] for r in rows] == [f"turn {i}" for i in range(5)]
+
+
+def test_search_memories_respects_active_branch_head(tmp_path):
+    """T113 spec test 2: memory search clamps to active branch head via
+    ``memories.event_id``. Memories whose projecting event lands outside
+    the clamp drop out of FTS results."""
+    from chat.eventlog.log import append_and_apply as _aa
+    from chat.state.memory import search_memories
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        # Two memories projected from real events. The projector handler
+        # stamps memories.event_id from the projecting event's id.
+        ev_a = _aa(
+            conn,
+            kind="memory_written",
+            payload={
+                "owner_id": "host_bot",
+                "chat_id": "c1",
+                "scene_id": 1,
+                "pov_summary": "alpha keyword present",
+                "witness_you": 1,
+                "witness_host": 1,
+                "witness_guest": 0,
+            },
+        )
+        ev_b = _aa(
+            conn,
+            kind="memory_written",
+            payload={
+                "owner_id": "host_bot",
+                "chat_id": "c1",
+                "scene_id": 1,
+                "pov_summary": "alpha keyword present too",
+                "witness_you": 1,
+                "witness_host": 1,
+                "witness_guest": 0,
+            },
+        )
+        # Branch clamps to ev_a only (head = ev_a; ev_b sits past head).
+        branch_from_event(
+            conn, name="early", origin_event_id=ev_a, chat_id="c1"
+        )
+        switch_active_branch(conn, name="early")
+
+        results = search_memories(conn, "host_bot", "host", "alpha")
+        # Only the first memory should surface — the second's event_id
+        # exceeds the active branch head.
+        ids = [r["event_id"] for r in results]
+        assert ev_a in ids
+        assert ev_b not in ids
+
+
+def test_branch_switch_changes_visible_events(tmp_path):
+    """T113 spec test 3: switching branches mid-flight changes the read
+    immediately. ``read_recent_dialogue`` re-queries on every call."""
+    from chat.services.turn_common import read_recent_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(6)]
+
+        branch_from_event(
+            conn, name="early", origin_event_id=ids[0], chat_id="c1"
+        )
+        append_and_apply(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "early", "head_event_id": ids[2]},
+        )
+        branch_from_event(
+            conn, name="late", origin_event_id=ids[3], chat_id="c1"
+        )
+        append_and_apply(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "late", "head_event_id": ids[5]},
+        )
+
+        switch_active_branch(conn, name="early")
+        early_rows = [r["text"] for r in read_recent_dialogue(conn, "c1")]
+        assert early_rows == ["turn 0", "turn 1", "turn 2"]
+
+        switch_active_branch(conn, name="late")
+        late_rows = [r["text"] for r in read_recent_dialogue(conn, "c1")]
+        assert late_rows == ["turn 3", "turn 4", "turn 5"]
+
+
+def test_main_branch_with_head_zero_returns_empty(tmp_path):
+    """T113 spec test 4: a non-main branch with head=0 returns empty.
+
+    The bootstrap-main sentinel only fires for ``name=="main", origin=0,
+    head=0``. A different branch parked at ``origin=0, head=0`` is not a
+    sentinel and the ``BETWEEN 0 AND 0`` clamp filters out every real
+    event_log row (rowids start at 1)."""
+    from chat.services.turn_common import read_recent_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        # Need a real event_log row id 1+ so the clamp's "exclude 0" actually
+        # has something to exclude — otherwise we trivially return [].
+        _seed_user_turn(conn, "c1", "turn 0")
+
+        # Force-create a branch at origin=0, head=0 (NOT main). This is an
+        # artificial state — production never produces it — but it's the
+        # cleanest way to drive the documented edge case.
+        append_and_apply(
+            conn,
+            kind="branch_created",
+            payload={
+                "name": "stub",
+                "origin_event_id": 0,
+                "head_event_id": 0,
+                "chat_id": "c1",
+            },
+        )
+        switch_active_branch(conn, name="stub")
+
+        rows = read_recent_dialogue(conn, "c1")
+        assert rows == []
+
+
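The read-side tests in this file all rely on one idea: a reader adds an `id BETWEEN origin AND head` clamp to its query. A self-contained sketch of that clamp against an in-memory SQLite table is given below. The simplified `event_log` schema and the `read_dialogue` helper are assumptions for illustration; the real readers take the window from the active branch rather than as arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT, text TEXT)"
)
# Row ids are assigned 1..6, matching the "rowids start at 1" remark above.
conn.executemany(
    "INSERT INTO event_log (kind, text) VALUES ('user_turn', ?)",
    [(f"turn {i}",) for i in range(6)],
)


def read_dialogue(conn: sqlite3.Connection, origin: int, head: int) -> list[str]:
    """Return turn texts whose event id falls inside [origin, head], oldest first."""
    rows = conn.execute(
        "SELECT text FROM event_log "
        "WHERE kind = 'user_turn' AND id BETWEEN ? AND ? ORDER BY id",
        (origin, head),
    )
    return [row[0] for row in rows]


early = read_dialogue(conn, 1, 3)
late = read_dialogue(conn, 4, 6)
stub = read_dialogue(conn, 0, 0)
unclamped = read_dialogue(conn, 0, 2**63 - 1)
```

With `(0, 0)` the clamp matches nothing, because no real row has id 0; that is the edge case the head-zero test pins.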
+def test_no_active_branch_falls_through_to_all_events(tmp_path):
+    """T113 spec test 5: with no active branch (e.g. a switch to an
+    unknown name cleared all is_active flags), readers see the full log
+    via the ``(0, BIG_INT)`` defensive default."""
+    from chat.services.turn_common import read_recent_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        for i in range(3):
+            _seed_user_turn(conn, "c1", f"turn {i}")
+
+        # Switching to an unknown branch leaves zero rows with is_active=1.
+        append_and_apply(
+            conn,
+            kind="branch_switched",
+            payload={"name": "missing"},
+        )
+        from chat.state.branches import active_branch as _ab
+
+        assert _ab(conn) is None
+
+        rows = read_recent_dialogue(conn, "c1")
+        assert [r["text"] for r in rows] == ["turn 0", "turn 1", "turn 2"]
+
+
+def test_scene_summarize_read_recent_dialogue_respects_branch(tmp_path):
+    """T113: ``scene_summarize._read_recent_dialogue`` (the scene-close
+    summary input) also clamps to the active branch range."""
+    from chat.services.scene_summarize import _read_recent_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        ids = [_seed_user_turn(conn, "c1", f"turn {i}") for i in range(6)]
+
+        branch_from_event(
+            conn, name="early", origin_event_id=ids[0], chat_id="c1"
+        )
+        append_and_apply(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "early", "head_event_id": ids[2]},
+        )
+        switch_active_branch(conn, name="early")
+
+        rows = _read_recent_dialogue(conn, "c1")
+        assert [r["text"] for r in rows] == ["turn 0", "turn 1", "turn 2"]
+
+
+def test_meanwhile_dialogue_reader_respects_branch(tmp_path):
+    """T113: meanwhile prompt-context reader also clamps to the active
+    branch. The meanwhile reader filters by ``meanwhile_scene_id``; the
+    branch filter is composed on top of that filter."""
+    from chat.web.meanwhile import _read_recent_meanwhile_dialogue
+
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        # Seed user turns + meanwhile assistant turns interleaved so the
+        # branch-id clamp lands across both kinds.
+        u1 = _seed_user_turn(conn, "c1", "u1")
+        a1 = append_and_apply(
+            conn,
+            kind="assistant_turn",
+            payload={
+                "chat_id": "c1",
+                "speaker_id": "host",
+                "text": "a1",
+                "meanwhile_scene_id": 7,
+            },
+        )
+        # Past-head turn should NOT appear once we switch to ``early``.
+        a2 = append_and_apply(
+            conn,
+            kind="assistant_turn",
+            payload={
+                "chat_id": "c1",
+                "speaker_id": "guest",
+                "text": "a2",
+                "meanwhile_scene_id": 7,
+            },
+        )
+
+        branch_from_event(
+            conn, name="early", origin_event_id=u1, chat_id="c1"
+        )
+        append_and_apply(
+            conn,
+            kind="branch_head_updated",
+            payload={"name": "early", "head_event_id": a1},
+        )
+        switch_active_branch(conn, name="early")
+
+        rows = _read_recent_meanwhile_dialogue(conn, "c1", scene_id=7)
+        texts = [r["text"] for r in rows]
+        assert "a1" in texts
+        assert "a2" not in texts
+        # Suppress the "unused" linter warning while keeping the binding
+        # readable for the test narrative.
+        _ = a2
@@ -0,0 +1,69 @@
+"""Pin the contract: ``_apply_chat_created`` is NOT replay-safe.
+
+See ``docs/audits/2026-04-27-project-callers.md`` for the full audit.
+The handler at ``chat/state/world.py:_apply_chat_created`` uses raw
+``INSERT INTO chats ...`` and ``INSERT INTO chat_state ...`` with no
+``OR REPLACE``/``OR IGNORE``. Running ``project()`` twice over the same
+``chat_created`` event MUST raise ``sqlite3.IntegrityError`` on the
+second pass — this is the bug that produced the 500 fixed in commit
+``0f8bf94`` (and the latent equivalents fixed in this commit).
+
+Pinning the contract here means any future "make it idempotent" change
+to the handler MUST update this test, which forces a deliberate review
+of the trade-offs: most notably, that ``chat_state`` columns mutated by
+later events (``time_skip_elision`` bumps ``time``;
+``scene_opened``/``scene_closed`` toggle ``active_scene_id``) would be
+silently overwritten by an ``INSERT OR REPLACE`` on every replay. The
+audit explains why we keep the handler raw-INSERT and enforce the rule
+at the call site via ``append_and_apply`` instead.
+"""
+from __future__ import annotations
+
+import sqlite3
+
+import pytest
+
+from chat.db.connection import open_db
+from chat.db.migrate import apply_migrations
+from chat.eventlog.log import append_event
+from chat.eventlog.projector import project
+import chat.state.world  # noqa: F401 — import registers the handler
+
+
+def _chat_payload():
+    return {
+        "id": "chat_bot_a",
+        "host_bot_id": "bot_a",
+        "guest_bot_id": None,
+        "initial_time": "2026-04-27T12:00:00+00:00",
+        "narrative_anchor": "Day 1 noon",
+        "weather": "clear",
+    }
+
+
+def test_chat_created_handler_is_not_replay_safe(tmp_path):
+    """A second projection over an extra ``chat_created`` for the same id raises.
+
+    This is the exact failure shape from incident ``0f8bf94``: a raw
+    INSERT against ``chats.id`` (PK) trips ``UNIQUE constraint failed``
+    on the second pass. If this test ever starts FAILING (i.e. the
+    second project() succeeds), someone has changed the handler to be
+    idempotent — read the audit before approving.
+    """
+    db = tmp_path / "t.db"
+    apply_migrations(db)
+    with open_db(db) as conn:
+        # First chat_created + first project: must succeed.
+        append_event(conn, kind="chat_created", payload=_chat_payload())
+        project(conn)
+
+        # Append a SECOND chat_created with the same id. project() will
+        # walk both, re-INSERT the same chats row, and trip the UNIQUE
+        # constraint on chats.id.
+        append_event(conn, kind="chat_created", payload=_chat_payload())
+        with pytest.raises(sqlite3.IntegrityError) as exc_info:
+            project(conn)
+        # Match on the column to make sure we caught the *intended*
+        # constraint, not some unrelated FK/check failure that happens
+        # to also be an IntegrityError.
+        assert "chats.id" in str(exc_info.value)
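The trade-off the module docstring describes can be reproduced with plain `sqlite3`. In this sketch a single simplified `chats` table stands in for the real `chats` and `chat_state` pair, and `apply_chat_created` is a hypothetical stand-in for the handler, so treat it as an illustration of the failure shape rather than the project's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in: one table holding both the id and a mutable column.
conn.execute("CREATE TABLE chats (id TEXT PRIMARY KEY, time TEXT)")

payload = {"id": "chat_bot_a", "initial_time": "2026-04-27T12:00:00+00:00"}


def apply_chat_created(conn: sqlite3.Connection, payload: dict) -> None:
    """Raw INSERT: the first application succeeds, a replay violates the PK."""
    conn.execute(
        "INSERT INTO chats (id, time) VALUES (?, ?)",
        (payload["id"], payload["initial_time"]),
    )


apply_chat_created(conn, payload)
# A later event (a time skip, say) mutates the row.
conn.execute(
    "UPDATE chats SET time = ? WHERE id = ?",
    ("2026-04-28T12:00:00+00:00", "chat_bot_a"),
)

# Replaying the raw INSERT fails loudly.
try:
    apply_chat_created(conn, payload)
    replay_error = None
except sqlite3.IntegrityError as exc:
    replay_error = str(exc)

# An "idempotent" variant succeeds, but silently resets the mutated column.
conn.execute(
    "INSERT OR REPLACE INTO chats (id, time) VALUES (?, ?)",
    (payload["id"], payload["initial_time"]),
)
time_after_replace = conn.execute("SELECT time FROM chats").fetchone()[0]
```

The loud failure is the safer of the two outcomes here: the replace variant would have discarded the later time update without any signal.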
@@ -24,3 +24,25 @@ def test_chat_db_path_env_overrides_default(tmp_path, monkeypatch):
|
|||||||
(tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
|
(tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
|
||||||
s = load_settings()
|
s = load_settings()
|
||||||
assert s.db_path == tmp_path / "alt.db"
|
assert s.db_path == tmp_path / "alt.db"
|
+
+
+def test_embedding_model_defaults_to_pseudo(tmp_path, monkeypatch):
+    """T112: ``embedding_model`` defaults to the deterministic pseudo
+    so existing zero-config installs keep the Phase 4 behavior."""
+    monkeypatch.setenv("CHAT_CONFIG_PATH", str(tmp_path / "config.toml"))
+    (tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
+    s = load_settings()
+    assert s.embedding_model == "pseudo-sha256-384"
+
+
+def test_embedding_model_overridable_via_toml(tmp_path, monkeypatch):
+    """T112: operators swap the embedding model by editing config.toml.
+    The new value flows through to the embedding worker at startup."""
+    cfg = tmp_path / "config.toml"
+    cfg.write_text(
+        'featherless_api_key = "x"\n'
+        'embedding_model = "bge-small-en-v1.5"\n'
+    )
+    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
+    s = load_settings()
+    assert s.embedding_model == "bge-small-en-v1.5"
@@ -458,6 +458,183 @@ def test_t98_4_delete_invokes_rewind_and_drops_cascade(client, tmp_path):
|
|||||||
assert row is None, f"event {ev_id} should have been deleted"
|
assert row is None, f"event {ev_id} should have been deleted"
|
+
+
+def test_delete_impact_modal_uses_jinja_partial(client, tmp_path):
+    """T110.3: the modal HTML is rendered from a Jinja partial
+    (`_delete_impact_modal.html`) rather than f-string concatenation in
+    Python. Verify the partial-rendered shape: the wrapping
+    ``delete-impact-modal`` div, the cascade list, and the confirm form.
+
+    The partial inherits Jinja2 autoescape so HTML safety follows
+    automatically — the explicit ``html.escape()`` calls from T110.2
+    become redundant once this lands.
+    """
+    db = tmp_path / "test.db"
+    _seed_chat(db)
+    user_id, _bot_id = _seed_turns(db)
+
+    response = client.get(
+        f"/chats/chat_bot_a/drawer/turn/delete-preview/{user_id}"
+    )
+    assert response.status_code == 200
+    body = response.text
+
+    # Markup shape that the partial produces. Double-quoted attributes
+    # signal Jinja rendering (the prior f-string used single quotes).
+    assert '<div class="delete-impact-modal">' in body
+    assert '<ul class="delete-impact-cascade">' in body
+    # The confirm form still posts to the same delete route.
+    assert f"/chats/chat_bot_a/drawer/turn/delete/{user_id}" in body
+    assert "Confirm delete" in body
+
+
+def test_delete_impact_modal_escapes_user_controllable_strings(client, tmp_path):
+    """T110.2: defense-in-depth — fields embedded in the modal HTML come
+    from event payloads (turn prose, scene timestamps, etc.) which are
+    ultimately user-controllable. Wrap them with ``html.escape`` so a
+    payload like ``<script>alert(1)</script>`` renders as inert text and
+    doesn't leak through into the rendered modal as actual markup.
+    """
+    db = tmp_path / "test.db"
+    _seed_chat(db)
+
+    # Seed a user_turn whose prose contains an HTML-script payload. The
+    # modal renders ``description = "turn N (you: <prose excerpt>)"`` so
+    # the prose flows verbatim into the cascade list <li>.
+    with open_db(db) as conn:
+        evil_id = append_and_apply(
+            conn,
+            kind="user_turn",
+            payload={
+                "chat_id": "chat_bot_a",
+                "prose": "<script>alert('xss')</script>",
+                "segments": [],
+            },
+        )
+
+    response = client.get(
+        f"/chats/chat_bot_a/drawer/turn/delete-preview/{evil_id}"
+    )
+    assert response.status_code == 200
+    body = response.text
+
+    # Raw <script> must NOT survive into the rendered HTML. The escaped
+    # form (&lt;script&gt;) is what we want to see instead.
+    assert "<script>alert" not in body
+    assert "&lt;script&gt;alert" in body
+
+
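The escaping test leans on the standard library's `html.escape`. A small sketch of what that call does to the payload used in the test is shown here; the `<li>` wrapper and the turn number are made up for illustration and do not reproduce the real modal markup.

```python
import html

prose = "<script>alert('xss')</script>"

# html.escape replaces &, <, > and, by default, both quote characters.
escaped = html.escape(prose)

# Hypothetical list item, loosely modeled on the cascade list described above.
item = f"<li>turn 3 (you: {escaped})</li>"
```

After escaping, the payload is carried as text: the raw `<script>` tag is gone from `item` and the entity form `&lt;script&gt;` appears in its place, which is exactly the pair of conditions the test asserts on the response body.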
+def test_bulk_significance_re_rate_emits_manual_edit_per_memory(client, tmp_path):
+    """T110.4: bulk significance re-rate fans out into one
+    ``manual_edit`` event per matching memory — preserving the per-row
+    audit trail (and reversibility) instead of collapsing everything
+    into a single bulk event.
+
+    Seed five memories at significance 0, bulk re-rate 0 -> 2, and
+    verify five new ``memory_significance`` ``manual_edit`` rows landed
+    AND every memory now sits at significance 2.
+    """
+    db = tmp_path / "test.db"
+    _seed_chat(db)
+
+    # Five memories at significance 0.
+    with open_db(db) as conn:
+        for i in range(5):
+            append_and_apply(
+                conn,
+                kind="memory_written",
+                payload={
+                    "owner_id": "bot_a",
+                    "chat_id": "chat_bot_a",
+                    "pov_summary": f"low-sig memory {i}",
+                    "witness_you": 1,
+                    "witness_host": 1,
+                    "witness_guest": 0,
+                    "significance": 0,
+                },
+            )
+        # Plus one memory at significance 1 to verify the re-rate is
+        # scoped to ``level_from`` and doesn't sweep the whole chat.
+        append_and_apply(
+            conn,
+            kind="memory_written",
+            payload={
+                "owner_id": "bot_a",
+                "chat_id": "chat_bot_a",
+                "pov_summary": "already-rated memory",
+                "witness_you": 1,
+                "witness_host": 1,
+                "witness_guest": 0,
+                "significance": 1,
+            },
+        )
+        prior_manual_edits = conn.execute(
+            "SELECT COUNT(*) FROM event_log WHERE kind = 'manual_edit'"
+        ).fetchone()[0]
+
+    response = client.post(
+        "/chats/chat_bot_a/drawer/memory/significance/bulk",
+        data={"level_from": "0", "level_to": "2"},
+    )
+    assert response.status_code == 200
+
+    with open_db(db) as conn:
+        # Five new manual_edit rows, one per matching memory.
|
||||||
|
new_manual_edits = conn.execute(
|
||||||
|
"SELECT COUNT(*) FROM event_log WHERE kind = 'manual_edit'"
|
||||||
|
).fetchone()[0]
|
||||||
|
assert new_manual_edits - prior_manual_edits == 5
|
||||||
|
|
||||||
|
# Every emitted edit is a memory_significance edit with prior=0
|
||||||
|
# and new=2.
|
||||||
|
import json as _json
|
||||||
|
|
||||||
|
rows = conn.execute(
|
||||||
|
"SELECT payload_json FROM event_log "
|
||||||
|
"WHERE kind = 'manual_edit' "
|
||||||
|
"ORDER BY id DESC LIMIT 5"
|
||||||
|
).fetchall()
|
||||||
|
for r in rows:
|
||||||
|
payload = _json.loads(r[0])
|
||||||
|
assert payload["target_kind"] == "memory_significance"
|
||||||
|
assert payload["prior_value"] == 0
|
||||||
|
assert payload["new_value"] == 2
|
||||||
|
|
||||||
|
# Projection caught up — five memories at sig=2, the untouched
|
||||||
|
# one stays at sig=1, none remain at sig=0.
|
||||||
|
dist = dict(
|
||||||
|
conn.execute(
|
||||||
|
"SELECT significance, COUNT(*) FROM memories "
|
||||||
|
"WHERE chat_id = 'chat_bot_a' GROUP BY significance"
|
||||||
|
).fetchall()
|
||||||
|
)
|
||||||
|
assert dist.get(0, 0) == 0
|
||||||
|
assert dist.get(1, 0) == 1
|
||||||
|
assert dist.get(2, 0) == 5
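The fan-out this test describes (one `manual_edit` per matching memory rather than a single bulk event) might look roughly like the sketch below. It is an illustration under assumptions: the two-table schema is simplified, `bulk_re_rate` and the `target_id` payload field are hypothetical, and the sketch updates `memories` directly where the real code would go through the projector.

```python
import json
import sqlite3


def bulk_re_rate(conn: sqlite3.Connection, chat_id: str,
                 level_from: int, level_to: int) -> int:
    """Hypothetical fan-out: emit one manual_edit event per matching memory."""
    rows = conn.execute(
        "SELECT id FROM memories WHERE chat_id = ? AND significance = ?",
        (chat_id, level_from),
    ).fetchall()
    for (memory_id,) in rows:
        payload = {
            "target_kind": "memory_significance",
            "target_id": memory_id,  # assumed field name
            "prior_value": level_from,
            "new_value": level_to,
        }
        # One event per row keeps each change individually auditable.
        conn.execute(
            "INSERT INTO event_log (kind, payload_json) VALUES ('manual_edit', ?)",
            (json.dumps(payload),),
        )
        # Stand-in for the projector applying the edit.
        conn.execute(
            "UPDATE memories SET significance = ? WHERE id = ?",
            (level_to, memory_id),
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, chat_id TEXT, significance INTEGER)"
)
conn.execute(
    "CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT, payload_json TEXT)"
)
conn.executemany(
    "INSERT INTO memories (chat_id, significance) VALUES (?, ?)",
    [("chat_bot_a", 0)] * 5 + [("chat_bot_a", 1)],
)
emitted = bulk_re_rate(conn, "chat_bot_a", level_from=0, level_to=2)
dist = dict(
    conn.execute(
        "SELECT significance, COUNT(*) FROM memories GROUP BY significance"
    ).fetchall()
)
```

Because each event records its own `prior_value`, any single row can later be reverted without touching the other four.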
|
||||||
|
|
||||||
|
|
||||||
|
def test_delete_turn_with_event_id_zero_returns_400(client, tmp_path):
|
||||||
|
"""T110.1: ``event_id <= 0`` is an obvious client error and must NOT
|
||||||
|
silently rewind the entire log via ``after_event_id = -1``. The route
|
||||||
|
rejects it with 400 so the audit trail stays intact.
|
||||||
|
"""
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
_seed_chat(db)
|
||||||
|
_seed_turns(db)
|
||||||
|
|
||||||
|
# Sanity: events present before the bad request.
|
||||||
|
with open_db(db) as conn:
|
||||||
|
before = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
|
||||||
|
assert before > 0
|
||||||
|
|
||||||
|
response = client.post("/chats/chat_bot_a/drawer/turn/delete/0")
|
||||||
|
assert response.status_code == 400
|
||||||
|
|
||||||
|
# And the log was NOT truncated.
|
||||||
|
with open_db(db) as conn:
|
||||||
|
after = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
|
||||||
|
assert after == before
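The guard this test exercises can be sketched as a small validation step ahead of the rewind. This is an assumption-laden illustration: the diff does not show how the route derives `after_event_id`, so the `event_id - 1` relation is inferred from the docstring's note that a zero id would yield `-1`, and both function names are hypothetical.

```python
def rewind_cutoff(event_id: int) -> int:
    """Hypothetical helper: map a turn's event id to the rewind cutoff."""
    if event_id <= 0:
        # Without this check, event_id=0 would produce a cutoff of -1,
        # which would select every row in the log for removal.
        raise ValueError(f"event_id must be positive, got {event_id}")
    return event_id - 1  # assumed relation, see lead-in


def handle_delete(event_id: int) -> int:
    """Return an HTTP-style status code for a delete request."""
    try:
        rewind_cutoff(event_id)
    except ValueError:
        return 400
    return 200


status_zero = handle_delete(0)
status_negative = handle_delete(-3)
status_ok = handle_delete(7)
```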
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# T98.5 — remaining v1 edits (chat narrative anchor + weather).
|
# T98.5 — remaining v1 edits (chat narrative anchor + weather).
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -20,6 +20,7 @@ The pseudo path doesn't touch the LLMClient, so we pass an empty
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
import math
|
import math
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
@@ -89,3 +90,81 @@ async def test_generate_embedding_unit_normalized():
|
|||||||
result = await generate_embedding(_client(), text="some non-empty text")
|
result = await generate_embedding(_client(), text="some non-empty text")
|
||||||
norm_sq = sum(x * x for x in result.vector)
|
norm_sq = sum(x * x for x in result.vector)
|
||||||
assert math.isclose(norm_sq, 1.0, abs_tol=1e-6)
|
assert math.isclose(norm_sq, 1.0, abs_tol=1e-6)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_generate_embedding_non_default_model_logs_warning(caplog):
|
||||||
|
"""T107: non-default model falls through to fallback and must warn.
|
||||||
|
|
||||||
|
A Phase 4.5+ caller pointing at a real model that isn't yet wired
|
||||||
|
up would otherwise silently degrade (zero vector → useless cosine).
|
||||||
|
The warning surfaces the misconfiguration in logs.
|
||||||
|
"""
|
||||||
|
caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
|
||||||
|
result = await generate_embedding(_client(), text="hello", model="real-model")
|
||||||
|
|
||||||
|
# Behavior unchanged: still returns the fallback sentinel.
|
||||||
|
assert result.model == FALLBACK_EMBEDDING_MODEL == "fallback"
|
||||||
|
assert all(x == 0.0 for x in result.vector)
|
||||||
|
|
||||||
|
# Warning fired and names the offending model.
|
||||||
|
warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
|
||||||
|
assert any("non-default model" in r.getMessage() for r in warnings)
|
||||||
|
assert any("real-model" in r.getMessage() for r in warnings)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_generate_embedding_default_model_does_not_warn(caplog):
|
||||||
|
"""T107: the silent default path must stay silent."""
|
||||||
|
caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
|
||||||
|
await generate_embedding(_client(), text="hello")
|
||||||
|
warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
|
||||||
|
assert warnings == []
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_embed_routes_to_client_when_non_default_model():
|
||||||
|
"""T112: when a non-default ``model`` is requested, generate_embedding
|
||||||
|
routes through ``client.embed(text, model=...)`` and wraps the
|
||||||
|
returned vector in an EmbeddingResult tagged with the requested
|
||||||
|
model (NOT the fallback sentinel)."""
|
||||||
|
canned = [0.1, 0.2, 0.3, 0.4]
|
||||||
|
client = MockLLMClient(canned=[], canned_embeddings=[canned])
|
||||||
|
|
||||||
|
result = await generate_embedding(
|
||||||
|
client, text="hello world", model="bge-small-en-v1.5"
|
||||||
|
)
|
||||||
|
assert result.vector == canned
|
||||||
|
assert result.model == "bge-small-en-v1.5"
|
||||||
|
assert result.dim == len(canned)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_embed_falls_back_on_client_failure(caplog):
|
||||||
|
"""T112: when ``client.embed`` raises (e.g. NotImplementedError on
|
||||||
|
Featherless, or a transient network error), generate_embedding logs
|
||||||
|
the existing T107 warning and returns the zero-vector fallback so
|
||||||
|
callers detect the sentinel and skip indexing."""
|
||||||
|
|
||||||
|
class _FailingClient:
|
||||||
|
async def generate(self, messages, *, model, **params): # pragma: no cover
|
||||||
|
raise AssertionError("generate must not be called")
|
||||||
|
|
||||||
|
def stream(self, messages, *, model, **params): # pragma: no cover
|
||||||
|
raise AssertionError("stream must not be called")
|
||||||
|
|
||||||
|
async def embed(self, text, *, model):
|
||||||
|
raise NotImplementedError("provider does not expose embeddings")
|
||||||
|
|
||||||
|
caplog.set_level(logging.WARNING, logger="chat.services.embeddings")
|
||||||
|
result = await generate_embedding(
|
||||||
|
_FailingClient(), text="hello", model="bge-small-en-v1.5"
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result.model == FALLBACK_EMBEDDING_MODEL == "fallback"
|
||||||
|
assert len(result.vector) == DEFAULT_EMBEDDING_DIM
|
||||||
|
assert all(x == 0.0 for x in result.vector)
|
||||||
|
|
||||||
|
# Existing T107 warning fires (re-used from the new exception branch).
|
||||||
|
warnings = [r for r in caplog.records if r.levelno == logging.WARNING]
|
||||||
|
assert any("bge-small-en-v1.5" in r.getMessage() for r in warnings)
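Taken together, the T112 tests above describe a route-then-degrade pattern: call `client.embed`, and on any failure log a warning and return a zero vector tagged with the fallback sentinel. The sketch below illustrates that pattern only. `embed_with_fallback`, the `EmbeddingResult` shape, and the dimension of 384 are assumptions for the example, not the module's actual definitions.

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger("embedding_sketch")

FALLBACK_MODEL = "fallback"
ASSUMED_DIM = 384  # assumption; the real DEFAULT_EMBEDDING_DIM is not shown


@dataclass
class EmbeddingResult:
    vector: list
    model: str

    @property
    def dim(self) -> int:
        return len(self.vector)


async def embed_with_fallback(client, *, text: str, model: str) -> EmbeddingResult:
    """Hypothetical routing: try the provider, degrade to a zero vector."""
    try:
        vector = await client.embed(text, model=model)
    except Exception as exc:  # NotImplementedError, network errors, ...
        logger.warning("non-default model %r unavailable (%s); using fallback", model, exc)
        return EmbeddingResult(vector=[0.0] * ASSUMED_DIM, model=FALLBACK_MODEL)
    # Success: tag the result with the requested model, not the sentinel.
    return EmbeddingResult(vector=list(vector), model=model)


class _Canned:
    async def embed(self, text, *, model):
        return [0.1, 0.2, 0.3, 0.4]


class _NoEmbeddings:
    async def embed(self, text, *, model):
        raise NotImplementedError("provider does not expose embeddings")


ok = asyncio.run(embed_with_fallback(_Canned(), text="hello", model="bge-small-en-v1.5"))
degraded = asyncio.run(
    embed_with_fallback(_NoEmbeddings(), text="hello", model="bge-small-en-v1.5")
)
```

Callers can then check `result.model == "fallback"` and skip indexing, which avoids storing zero vectors that would make cosine similarity meaningless.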
|
||||||
|
|||||||
@@ -233,3 +233,91 @@ def test_list_active_events_filters_to_planned_and_active(tmp_path):
|
|||||||
|
|
||||||
cancelled = list_events_in_status(conn, "chat_bot_a", "cancelled")
|
cancelled = list_events_in_status(conn, "chat_bot_a", "cancelled")
|
||||||
assert [e["event_id"] for e in cancelled] == ["evt_canx"]
|
assert [e["event_id"] for e in cancelled] == ["evt_canx"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_event_status_reverted_returns_to_prior_status(tmp_path):
|
||||||
|
"""T114.2: ``event_status_reverted`` rolls a row back to ``prior_status``.
|
||||||
|
|
||||||
|
Unlike the forward transitions, this projector handler is
|
||||||
|
unconditional — its sole purpose is to undo a transition, including
|
||||||
|
reverting from a terminal status (completed/cancelled) back to a
|
||||||
|
non-terminal one.
|
||||||
|
|
||||||
|
Three round-trips covered:
|
||||||
|
- completed → active (rollback of an event_completed)
|
||||||
|
- active → planned (rollback of an event_started)
|
||||||
|
- cancelled → active (rollback of an event_cancelled)
|
||||||
|
"""
|
||||||
|
db = tmp_path / "t.db"
|
||||||
|
apply_migrations(db)
|
||||||
|
with open_db(db) as conn:
|
||||||
|
_seed_chat(conn)
|
||||||
|
append_event(
|
||||||
|
conn,
|
||||||
|
kind="event_planned",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_revert",
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"kind": "date_at_park",
|
||||||
|
"props": {},
|
||||||
|
"planned_for": "2026-04-30T18:00:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_event(
|
||||||
|
conn,
|
||||||
|
kind="event_started",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_revert",
|
||||||
|
"started_at": "2026-04-30T18:01:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_event(
|
||||||
|
conn,
|
||||||
|
kind="event_completed",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_revert",
|
||||||
|
"completed_at": "2026-04-30T20:00:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
project(conn)
|
||||||
|
|
||||||
|
ev = get_event(conn, "evt_revert")
|
||||||
|
assert ev is not None
|
||||||
|
assert ev["status"] == "completed"
|
||||||
|
|
||||||
|
# Revert from completed → active.
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_status_reverted",
|
||||||
|
payload={"event_id": "evt_revert", "prior_status": "active"},
|
||||||
|
)
|
||||||
|
ev = get_event(conn, "evt_revert")
|
||||||
|
assert ev["status"] == "active"
|
||||||
|
|
||||||
|
# Revert from active → planned.
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_status_reverted",
|
||||||
|
payload={"event_id": "evt_revert", "prior_status": "planned"},
|
||||||
|
)
|
||||||
|
ev = get_event(conn, "evt_revert")
|
||||||
|
assert ev["status"] == "planned"
|
||||||
|
|
||||||
|
# Forward to cancelled, then revert from cancelled → active.
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_cancelled",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_revert",
|
||||||
|
"completed_at": "2026-04-30T20:30:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
ev = get_event(conn, "evt_revert")
|
||||||
|
assert ev["status"] == "cancelled"
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_status_reverted",
|
||||||
|
payload={"event_id": "evt_revert", "prior_status": "active"},
|
||||||
|
)
|
||||||
|
ev = get_event(conn, "evt_revert")
|
||||||
|
assert ev["status"] == "active"
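The contrast the docstring draws (guarded forward transitions versus an unconditional revert) can be illustrated with a toy projector over an in-memory dict. This is a sketch under assumptions: the guard conditions on the forward handlers are guesses chosen to match the sequence in the test, and `apply_event` is a hypothetical stand-in for the real handler registry.

```python
def apply_event(status_by_id: dict, kind: str, payload: dict) -> None:
    """Hypothetical projector step over an in-memory status table."""
    event_id = payload["event_id"]
    current = status_by_id.get(event_id)
    if kind == "event_planned":
        status_by_id[event_id] = "planned"
    elif kind == "event_started" and current == "planned":
        status_by_id[event_id] = "active"
    elif kind == "event_completed" and current == "active":
        status_by_id[event_id] = "completed"
    elif kind == "event_cancelled" and current in ("planned", "active"):
        status_by_id[event_id] = "cancelled"
    elif kind == "event_status_reverted":
        # Unconditional: the whole point is to undo a transition, even
        # one that landed in a terminal status.
        status_by_id[event_id] = payload["prior_status"]


statuses = {}
history = []
for kind, payload in [
    ("event_planned", {"event_id": "evt_revert"}),
    ("event_started", {"event_id": "evt_revert"}),
    ("event_completed", {"event_id": "evt_revert"}),
    ("event_status_reverted", {"event_id": "evt_revert", "prior_status": "active"}),
    ("event_status_reverted", {"event_id": "evt_revert", "prior_status": "planned"}),
    ("event_cancelled", {"event_id": "evt_revert"}),
    ("event_status_reverted", {"event_id": "evt_revert", "prior_status": "active"}),
]:
    apply_event(statuses, kind, payload)
    history.append(statuses["evt_revert"])
```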
|
||||||
|
|||||||
@@ -0,0 +1,35 @@
|
|||||||
|
"""Tests for FeatherlessClient (Phase 4.5+).
|
||||||
|
|
||||||
|
Phase 4.5 adds an ``embed()`` method to the LLMClient Protocol (T112).
|
||||||
|
Featherless's OpenAI-compatible surface routes ``/v1/embeddings`` but
|
||||||
|
every request returns HTTP 500 ``{"type": "completions_error"}`` (the
|
||||||
|
router accepts the URL but the backend has no embedding handler), and
|
||||||
|
``/v1/models`` lists no embedding-class models. The implementation
|
||||||
|
raises ``NotImplementedError`` rather than ship a request that always
|
||||||
|
errors; ``generate_embedding`` catches it and degrades to the
|
||||||
|
zero-vector fallback (the existing T107 warning path).
|
||||||
|
|
||||||
|
If/when Featherless ships embeddings, swap the body for a real call to
|
||||||
|
``/v1/embeddings`` and update this test to mock the HTTP layer.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from chat.llm.featherless import FeatherlessClient
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_featherless_embed_raises_not_implemented():
|
||||||
|
"""Featherless's ``/v1/embeddings`` always 500s with
|
||||||
|
``"completions_error"`` and its model catalog has no embedding
|
||||||
|
class — embed() must raise ``NotImplementedError`` so callers
|
||||||
|
(``generate_embedding``) can degrade to the fallback zero vector
|
||||||
|
+ warning rather than silently producing useless output."""
|
||||||
|
client = FeatherlessClient(api_key="test-key")
|
||||||
|
with pytest.raises(NotImplementedError) as excinfo:
|
||||||
|
await client.embed("hello world", model="bge-small-en-v1.5")
|
||||||
|
# Message should hint at the cause so operators see why their
|
||||||
|
# real-model swap fell back.
|
||||||
|
assert "embeddings" in str(excinfo.value).lower()
|
||||||
@@ -0,0 +1,140 @@
|
|||||||
|
"""Sanity tests for :mod:`tests.fixtures` — the structured CannedQueue
|
||||||
|
builder for ``MockLLMClient`` (T116).
|
||||||
|
|
||||||
|
The builder is a thin shaping layer over JSON dicts; these tests pin
|
||||||
|
the JSON shapes and the ``MockLLMClient`` round-trip so nothing
|
||||||
|
silently regresses if a default field name or shape gets renamed.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from chat.llm.mock import MockLLMClient
|
||||||
|
from tests.fixtures import CannedQueue
|
||||||
|
|
||||||
|
|
||||||
|
def test_canned_queue_build_emits_expected_shapes():
|
||||||
|
"""Each builder method emits the JSON shape its classifier consumer
|
||||||
|
expects. The narrative slot is a bare string (stream).
|
||||||
|
"""
|
||||||
|
canned = (
|
||||||
|
CannedQueue()
|
||||||
|
.parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
|
||||||
|
.detect_addressee(addressee_id="bot_a", reason="host")
|
||||||
|
.narrative("Hi there.")
|
||||||
|
.state_update()
|
||||||
|
.state_update(affinity_delta=1, trust_delta=2)
|
||||||
|
.detect_interjection(should_interject=False, reason="calm")
|
||||||
|
.detect_event_transitions(
|
||||||
|
[{"event_id": "evt_1", "new_status": "active", "reason": "they arrived"}]
|
||||||
|
)
|
||||||
|
.detect_scene_close(should_close=False, reason="no signal")
|
||||||
|
.summarize_scene_pov(summary="BotA noticed the day winding down.")
|
||||||
|
.detect_threads(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"action": "open",
|
||||||
|
"title": "Maya's job hunt",
|
||||||
|
"summary": "Maya is looking for a new job",
|
||||||
|
"existing_thread_id": None,
|
||||||
|
}
|
||||||
|
]
|
||||||
|
)
|
||||||
|
.build()
|
||||||
|
)
|
||||||
|
|
||||||
|
# All slots are strings (the MockLLMClient pops strings).
|
||||||
|
assert all(isinstance(slot, str) for slot in canned)
|
||||||
|
assert len(canned) == 10
|
||||||
|
|
||||||
|
# Slot 0: parse_turn — defaults intent="narrative".
|
||||||
|
parse = json.loads(canned[0])
|
||||||
|
assert parse["segments"] == [{"kind": "dialogue", "text": "hello"}]
|
||||||
|
assert parse["intent"] == "narrative"
|
||||||
|
assert parse["landing_state_hint"] == ""
|
||||||
|
|
||||||
|
# Slot 1: detect_addressee.
|
||||||
|
addr = json.loads(canned[1])
|
||||||
|
assert addr["addressee_id"] == "bot_a"
|
||||||
|
assert addr["confidence"] == "medium"
|
||||||
|
assert addr["reason"] == "host"
|
||||||
|
|
||||||
|
# Slot 2: narrative — bare string, NOT JSON.
|
||||||
|
assert canned[2] == "Hi there."
|
||||||
|
with pytest.raises(json.JSONDecodeError):
|
||||||
|
json.loads(canned[2])
|
||||||
|
|
||||||
|
# Slot 3: state_update with all defaults — zero deltas, no facts.
|
||||||
|
su0 = json.loads(canned[3])
|
||||||
|
assert su0 == {"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||||
|
|
||||||
|
# Slot 4: state_update with custom deltas.
|
||||||
|
su1 = json.loads(canned[4])
|
||||||
|
assert su1["affinity_delta"] == 1
|
||||||
|
assert su1["trust_delta"] == 2
|
||||||
|
assert su1["knowledge_facts"] == []
|
||||||
|
|
||||||
|
# Slot 5: detect_interjection.
|
||||||
|
interj = json.loads(canned[5])
|
||||||
|
assert interj == {"should_interject": False, "reason": "calm"}
|
||||||
|
|
||||||
|
# Slot 6: detect_event_transitions.
|
||||||
|
transitions = json.loads(canned[6])
|
||||||
|
assert transitions["transitions"][0]["event_id"] == "evt_1"
|
||||||
|
assert transitions["transitions"][0]["new_status"] == "active"
|
||||||
|
|
||||||
|
# Slot 7: detect_scene_close.
|
||||||
|
close = json.loads(canned[7])
|
||||||
|
assert close == {"should_close": False, "reason": "no signal"}
|
||||||
|
|
||||||
|
# Slot 8: summarize_scene_pov.
|
||||||
|
pov = json.loads(canned[8])
|
||||||
|
assert pov["summary"] == "BotA noticed the day winding down."
|
||||||
|
assert pov["knowledge_facts"] == []
|
||||||
|
assert pov["relationship_summary"] == ""
|
||||||
|
|
||||||
|
# Slot 9: detect_threads.
|
||||||
|
threads = json.loads(canned[9])
|
||||||
|
assert threads["candidates"][0]["action"] == "open"
|
||||||
|
assert threads["candidates"][0]["title"] == "Maya's job hunt"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_canned_queue_round_trips_through_mock_llm_client():
|
||||||
|
"""Building a queue and feeding it to ``MockLLMClient`` produces the
|
||||||
|
same items back via ``generate`` (in order). This is the contract
|
||||||
|
every migrated test relies on.
|
||||||
|
"""
|
||||||
|
canned = (
|
||||||
|
CannedQueue()
|
||||||
|
.parse_turn(segments=[{"kind": "dialogue", "text": "hi"}])
|
||||||
|
.narrative("Hello back.")
|
||||||
|
.state_update()
|
||||||
|
.build()
|
||||||
|
)
|
||||||
|
mock = MockLLMClient(canned=canned)
|
||||||
|
|
||||||
|
# generate() pops from the front.
|
||||||
|
parse_str = await mock.generate([], model="x")
|
||||||
|
assert json.loads(parse_str)["segments"] == [
|
||||||
|
{"kind": "dialogue", "text": "hi"}
|
||||||
|
]
|
||||||
|
|
||||||
|
# The narrative slot is a raw string — generate returns it as-is.
|
||||||
|
narr_str = await mock.generate([], model="x")
|
||||||
|
assert narr_str == "Hello back."
|
||||||
|
|
||||||
|
# The state_update slot has zero-delta defaults.
|
||||||
|
su_str = await mock.generate([], model="x")
|
||||||
|
assert json.loads(su_str) == {
|
||||||
|
"affinity_delta": 0,
|
||||||
|
"trust_delta": 0,
|
||||||
|
"knowledge_facts": [],
|
||||||
|
}
|
||||||
|
|
||||||
|
# Queue fully drained.
|
||||||
|
with pytest.raises(IndexError):
|
||||||
|
await mock.generate([], model="x")
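For readers without `tests/fixtures` at hand, the builder pattern these tests pin can be approximated as below. This is a reduced sketch, not the real `CannedQueue`: it covers three of the builder methods, and the class name `CannedQueueSketch` is made up, while the default field values are taken from the assertions above.

```python
import json


class CannedQueueSketch:
    """Hypothetical fluent builder producing a list of canned LLM replies."""

    def __init__(self) -> None:
        self._slots = []

    def parse_turn(self, segments, intent="narrative", landing_state_hint=""):
        self._slots.append(json.dumps({
            "segments": segments,
            "intent": intent,
            "landing_state_hint": landing_state_hint,
        }))
        return self  # returning self is what enables chaining

    def narrative(self, text):
        # Narrative replies are streamed prose, so they stay bare strings.
        self._slots.append(text)
        return self

    def state_update(self, affinity_delta=0, trust_delta=0, knowledge_facts=None):
        self._slots.append(json.dumps({
            "affinity_delta": affinity_delta,
            "trust_delta": trust_delta,
            "knowledge_facts": list(knowledge_facts or []),
        }))
        return self

    def build(self):
        # Copy so later builder calls cannot mutate a queue already handed out.
        return list(self._slots)


slots = (
    CannedQueueSketch()
    .parse_turn(segments=[{"kind": "dialogue", "text": "hi"}])
    .narrative("Hello back.")
    .state_update()
    .build()
)
```

Naming each slot by its consumer makes the ordering of a canned queue readable at the call site, which is harder to see with a raw list of JSON strings.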
|
||||||
@@ -19,3 +19,28 @@ async def test_mock_streams_tokens():
|
|||||||
async for chunk in client.stream(msgs, model="any"):
|
async for chunk in client.stream(msgs, model="any"):
|
||||||
chunks.append(chunk)
|
chunks.append(chunk)
|
||||||
assert "".join(chunks) == "abcd"
|
assert "".join(chunks) == "abcd"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_mock_llm_client_embed_pops_canned():
|
||||||
|
"""T112: MockLLMClient.embed() pops a canned vector from the front
|
||||||
|
of ``canned_embeddings`` (mirrors the existing ``canned`` queue
|
||||||
|
pattern for generate/stream)."""
|
||||||
|
v1 = [0.1, 0.2, 0.3]
|
||||||
|
v2 = [0.4, 0.5, 0.6]
|
||||||
|
client = MockLLMClient(canned=[], canned_embeddings=[v1, v2])
|
||||||
|
|
||||||
|
out1 = await client.embed("first", model="bge-small-en-v1.5")
|
||||||
|
out2 = await client.embed("second", model="bge-small-en-v1.5")
|
||||||
|
assert out1 == v1
|
||||||
|
assert out2 == v2
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_mock_llm_client_embed_empty_queue_raises():
|
||||||
|
"""When the canned_embeddings queue is empty, ``embed`` must raise
|
||||||
|
a clear failure (IndexError) so misconfigured tests don't silently
|
||||||
|
return None or hang."""
|
||||||
|
client = MockLLMClient(canned=[])
|
||||||
|
with pytest.raises(IndexError):
|
||||||
|
await client.embed("text", model="any")
|
||||||
|
|||||||
@@ -0,0 +1,84 @@
|
|||||||
|
"""Tests for LocalMLXClient (Phase 4.5+).
|
||||||
|
|
||||||
|
Talks to a local mlx-omni-server over the OpenAI-compatible surface.
|
||||||
|
We don't spin up a real server in tests — instead we monkey-patch the
|
||||||
|
underlying ``AsyncOpenAI`` instance to assert on the request shape and
|
||||||
|
return canned responses. The semaphore behavior is shared with
|
||||||
|
FeatherlessClient (same pattern), so we don't re-test that here.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from types import SimpleNamespace
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from chat.llm.client import Message
|
||||||
|
from chat.llm.local_mlx import LocalMLXClient
|
||||||
|
|
||||||
|
|
||||||
|
class _FakeChatCompletions:
|
||||||
|
def __init__(self, response):
|
||||||
|
self.response = response
|
||||||
|
self.calls = []
|
||||||
|
|
||||||
|
async def create(self, **kw):
|
||||||
|
self.calls.append(kw)
|
||||||
|
return self.response
|
||||||
|
|
||||||
|
|
||||||
|
class _FakeEmbeddings:
|
||||||
|
def __init__(self, vector):
|
||||||
|
self.vector = vector
|
||||||
|
self.calls = []
|
||||||
|
|
||||||
|
async def create(self, **kw):
|
||||||
|
self.calls.append(kw)
|
||||||
|
return SimpleNamespace(data=[SimpleNamespace(embedding=self.vector)])
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_local_mlx_client_generate_calls_chat_completions():
|
||||||
|
client = LocalMLXClient(base_url="http://localhost:10240/v1")
|
||||||
|
fake_response = SimpleNamespace(
|
||||||
|
choices=[SimpleNamespace(message=SimpleNamespace(content="hello"))]
|
||||||
|
)
|
||||||
|
fake_chat = _FakeChatCompletions(fake_response)
|
||||||
|
client._client.chat = SimpleNamespace(completions=fake_chat)
|
||||||
|
|
||||||
|
out = await client.generate(
|
||||||
|
[Message(role="user", content="hi")],
|
||||||
|
model="mlx-community/Hermes-3-Llama-3.1-8B-8bit",
|
||||||
|
)
|
||||||
|
|
||||||
|
assert out == "hello"
|
||||||
|
assert len(fake_chat.calls) == 1
|
||||||
|
assert fake_chat.calls[0]["model"] == "mlx-community/Hermes-3-Llama-3.1-8B-8bit"
|
||||||
|
assert fake_chat.calls[0]["messages"] == [{"role": "user", "content": "hi"}]
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_local_mlx_client_embed_returns_vector():
|
||||||
|
"""``embed()`` actually works on this client (unlike FeatherlessClient
|
||||||
|
which raises NotImplementedError) — the local MLX server has a real
|
||||||
|
``/v1/embeddings`` endpoint backed by an MLX-quantized model.
|
||||||
|
"""
|
||||||
|
client = LocalMLXClient()
|
||||||
|
canned = [0.1, 0.2, 0.3, 0.4]
|
||||||
|
fake_embeddings = _FakeEmbeddings(canned)
|
||||||
|
client._client.embeddings = fake_embeddings
|
||||||
|
|
||||||
|
out = await client.embed("hello", model="mlx-community/bge-small-en-v1.5-bf16")
|
||||||
|
|
||||||
|
assert out == canned
|
||||||
|
assert fake_embeddings.calls[0]["model"] == "mlx-community/bge-small-en-v1.5-bf16"
|
||||||
|
assert fake_embeddings.calls[0]["input"] == "hello"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_local_mlx_client_default_base_url():
|
||||||
|
"""Default base_url targets ``mlx-omni-server`` on its standard port."""
|
||||||
|
client = LocalMLXClient()
|
||||||
|
# AsyncOpenAI normalizes trailing-slash differences; just check the
|
||||||
|
# configured host:port appears in the underlying client config.
|
||||||
|
assert "127.0.0.1:10240" in str(client._client.base_url)
|
||||||
@@ -586,3 +586,59 @@ def test_record_turn_memory_enqueues_embedding_job(tmp_path):
|
|||||||
assert {job.memory_id for job in captured} == expected_ids
|
assert {job.memory_id for job in captured} == expected_ids
|
||||||
for job in captured:
|
for job in captured:
|
||||||
assert job.text == "Both bots witness this beat."
|
assert job.text == "Both bots witness this beat."
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# T109: memories.event_id deep-link column populated by the projector.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_memory_written_populates_event_id(tmp_path):
|
||||||
|
"""Schema 0014 added ``memories.event_id`` referencing ``event_log.id``.
|
||||||
|
|
||||||
|
The ``memory_written`` projector handler must populate the column with
|
||||||
|
the projecting event's id so T111 can deep-link cross-chat search hits
|
||||||
|
back to the originating turn.
|
||||||
|
"""
|
||||||
|
db = tmp_path / "t.db"
|
||||||
|
apply_migrations(db)
|
||||||
|
_seed_minimal(db)
|
||||||
|
with open_db(db) as conn:
|
||||||
|
result = record_turn_memory_for_present(
|
||||||
|
conn,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
host_bot_id="bot_a",
|
||||||
|
guest_bot_id=None,
|
||||||
|
narrative_text="BotA shrugs.",
|
||||||
|
)
|
||||||
|
eid, mid = result["bot_a"]
|
||||||
|
assert eid > 0 and mid is not None
|
||||||
|
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT event_id FROM memories WHERE id = ?", (mid,)
|
||||||
|
).fetchone()
|
||||||
|
assert row is not None
|
||||||
|
assert row[0] == eid
|
||||||
|
|
||||||
|
|
||||||
|
def test_memory_event_id_column_is_nullable_for_backfill(tmp_path):
|
||||||
|
"""Backward compat: the ``event_id`` column is nullable so historical
|
||||||
|
memory rows projected before 0014 ran (or rows synthesised by tests
|
||||||
|
that bypass the projector) don't break the schema. A direct INSERT
|
||||||
|
omitting the column must succeed and read back NULL."""
|
||||||
|
db = tmp_path / "t.db"
|
||||||
|
apply_migrations(db)
|
||||||
|
_seed_minimal(db)
|
||||||
|
with open_db(db) as conn:
|
||||||
|
conn.execute(
|
||||||
|
"INSERT INTO memories ("
|
||||||
|
"owner_id, chat_id, pov_summary, "
|
||||||
|
"witness_you, witness_host, witness_guest"
|
||||||
|
") VALUES (?, ?, ?, ?, ?, ?)",
|
||||||
|
("bot_a", "chat_bot_a", "legacy row", 1, 1, 0),
|
||||||
|
)
|
||||||
|
row = conn.execute(
|
||||||
|
"SELECT event_id FROM memories WHERE pov_summary = 'legacy row'"
|
||||||
|
).fetchone()
|
||||||
|
assert row is not None
|
||||||
|
assert row[0] is None
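The backward-compatibility property tested here follows from how SQLite adds columns: existing rows read back NULL for a column added without a default. The sketch below demonstrates this with simplified stand-in tables. The migration statement is an assumption about what schema 0014 might contain, since the migration file itself is not shown in this part of the diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, pov_summary TEXT)")

# A row written before the migration ran.
conn.execute("INSERT INTO memories (pov_summary) VALUES ('legacy row')")

# Assumed shape of the migration: a nullable column referencing event_log.
conn.execute(
    "ALTER TABLE memories ADD COLUMN event_id INTEGER REFERENCES event_log(id)"
)

# A row written after the migration, carrying its originating event id.
event_id = conn.execute(
    "INSERT INTO event_log (kind) VALUES ('memory_written')"
).lastrowid
conn.execute(
    "INSERT INTO memories (pov_summary, event_id) VALUES ('new row', ?)",
    (event_id,),
)

legacy = conn.execute(
    "SELECT event_id FROM memories WHERE pov_summary = 'legacy row'"
).fetchone()[0]
fresh = conn.execute(
    "SELECT event_id FROM memories WHERE pov_summary = 'new row'"
).fetchone()[0]
```

Any reader of `memories.event_id`, such as a deep-link renderer, therefore has to treat NULL as "no link available" rather than as an error.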
|
||||||
|
|||||||
@@ -0,0 +1,767 @@
|
|||||||
|
"""Phase 4.5 cross-feature integration tests (T117).
|
||||||
|
|
||||||
|
End-to-end multi-feature flows specific to the Phase 4.5 changes
|
||||||
|
(T103-T114). Mirrors :mod:`tests.test_phase4_integration` in shape:
|
||||||
|
each test drives multiple Phase 4.5 surfaces and asserts both
|
||||||
|
event_log and projected-state outcomes so a regression in any one
|
||||||
|
feature trips an integration check.
|
||||||
|
|
||||||
|
Test inventory:
|
||||||
|
|
||||||
|
1. ``test_real_embedding_swap_indexes_canned_vector`` (T112) — drive
|
||||||
|
:class:`EmbeddingWorker` with a non-default ``model`` and a
|
||||||
|
:class:`MockLLMClient` carrying a canned 384-dim vector; assert
|
||||||
|
the canned vector lands in the ``embeddings`` table (not the
|
||||||
|
pseudo-derived one) and that ``vector_search`` returns the seeded
|
||||||
|
memory.
|
||||||
|
2. ``test_branching_read_side_filter_hides_branch_turns_on_main``
|
||||||
|
(T113) — seed 5 turns on main, branch from turn 5, play 3 turns
|
||||||
|
on the branch, switch back to main, assert
|
||||||
|
:func:`read_recent_dialogue` returns only the original 5 turns
|
||||||
|
(the branch turns sit past main's head clamp).
|
||||||
|
3. ``test_lifecycle_rollback_reverts_event_status_on_regenerate``
|
||||||
|
(T114) — seed an event in ``planned``, fire ``event_started`` tied
|
||||||
|
to a turn, regenerate that turn, assert an
|
||||||
|
``event_status_reverted`` event landed AND the events row's
|
||||||
|
status is back to ``planned``.
|
||||||
|
4. ``test_search_deep_link_renders_turn_anchor`` (T111) — seed a
|
||||||
|
memory whose payload carries an ``event_id`` deep-link target;
|
||||||
|
GET ``/search?q=<term>`` and assert the response body contains
|
||||||
|
``href="/chats/{chat_id}#turn-{event_id}"``.
|
||||||
|
5. ``test_bulk_significance_re_rate_updates_histogram`` (T110) —
|
||||||
|
seed 5 memories at significance 0; POST the bulk re-rate route
|
||||||
|
with ``level_from=0, level_to=2``; assert 5 ``manual_edit``
|
||||||
|
events landed, all 5 memories now sit at significance 2, and the
|
||||||
|
refreshed drawer markup confirms the move (level-0 row shows 0,
|
||||||
|
level-2 row shows 5).
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
from pathlib import Path
|
||||||
|
from types import SimpleNamespace
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from fastapi.testclient import TestClient
|
||||||
|
|
||||||
|
from chat.app import app
|
||||||
|
from chat.db.connection import open_db
|
||||||
|
from chat.db.migrate import apply_migrations
|
||||||
|
from chat.eventlog.log import append_and_apply, append_event
|
||||||
|
from chat.eventlog.projector import project
|
||||||
|
from chat.llm.mock import MockLLMClient
|
||||||
|
|
||||||
|
# Trigger projector handler registration. Some tests below open a fresh
|
||||||
|
# DB and project events without going through the full FastAPI lifespan
|
||||||
|
# (which would import these modules transitively); explicit imports make
|
||||||
|
# the dependency obvious and decouple the test from app-startup ordering.
|
||||||
|
import chat.state.branches # noqa: F401
|
||||||
|
import chat.state.embeddings # noqa: F401
|
||||||
|
import chat.state.entities # noqa: F401
|
||||||
|
import chat.state.events # noqa: F401
|
||||||
|
import chat.state.manual_edit # noqa: F401
|
||||||
|
import chat.state.memory # noqa: F401
|
||||||
|
import chat.state.world # noqa: F401
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Shared fixtures + seed helpers (mirroring test_phase4_integration.py).
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
def app_state_setup(tmp_path, monkeypatch):
    """TestClient against the live FastAPI app with a tmp DB.

    Identical shape to :mod:`tests.test_phase4_integration` so the
    Phase 4.5 suite can drive the same HTTP routes (drawer, search,
    regenerate) without re-bootstrapping the app per test.
    """
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    db = tmp_path / "test.db"
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    with TestClient(app) as c:
        # Disable the canned-response background worker so the only
        # consumer of MockLLMClient queues is the request path we drive.
        app.state.background_worker.enabled = False
        yield c
    app.dependency_overrides.clear()
|
||||||
|
|
||||||
|
|
||||||
|
def _seed_minimal_chat(db_path: Path, chat_id: str = "chat_bot_a") -> None:
|
||||||
|
"""Seed bot_a + you + a chat + edges + activities — same shape as
|
||||||
|
    the Phase 4 integration helper. Uses ``append_and_apply`` so successive
    calls don't re-project the cumulative log.
|
||||||
|
"""
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
existing_bot = conn.execute(
|
||||||
|
"SELECT 1 FROM bots WHERE id = 'bot_a'"
|
||||||
|
).fetchone()
|
||||||
|
if existing_bot is None:
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="bot_authored",
|
||||||
|
payload={
|
||||||
|
"id": "bot_a",
|
||||||
|
"name": "BotA",
|
||||||
|
"persona": "thoughtful",
|
||||||
|
"voice_samples": [],
|
||||||
|
"traits": [],
|
||||||
|
"backstory": "",
|
||||||
|
"initial_relationship_to_you": "",
|
||||||
|
"kickoff_prose": "...",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="you_authored",
|
||||||
|
payload={
|
||||||
|
"name": "Me",
|
||||||
|
"pronouns": "they/them",
|
||||||
|
"persona": "",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="chat_created",
|
||||||
|
payload={
|
||||||
|
"id": chat_id,
|
||||||
|
"host_bot_id": "bot_a",
|
||||||
|
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||||
|
"narrative_anchor": "Day 1",
|
||||||
|
"weather": "",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="edge_update",
|
||||||
|
payload={
|
||||||
|
"source_id": "bot_a",
|
||||||
|
"target_id": "you",
|
||||||
|
"chat_id": chat_id,
|
||||||
|
"knowledge_facts": [],
|
||||||
|
},
|
||||||
|
)
|
||||||
|
if existing_bot is None:
|
||||||
|
for entity_id, verb in [
|
||||||
|
("you", "talking"),
|
||||||
|
("bot_a", "listening"),
|
||||||
|
]:
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="activity_change",
|
||||||
|
payload={
|
||||||
|
"entity_id": entity_id,
|
||||||
|
"posture": "sitting",
|
||||||
|
"action": {
|
||||||
|
"verb": verb,
|
||||||
|
"interruptible": True,
|
||||||
|
"required_attention": "low",
|
||||||
|
"expected_duration": "ongoing",
|
||||||
|
},
|
||||||
|
"attention": "",
|
||||||
|
"holding": [],
|
||||||
|
"status": {},
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# 1. Real embedding swap (T112) — non-default model routes through
|
||||||
|
# ``client.embed`` and the canned vector lands in the embeddings table.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_real_embedding_swap_indexes_canned_vector(tmp_path):
|
||||||
|
"""T112: swapping ``model`` from the pseudo default to a real model
|
||||||
|
routes the embedding generation through ``client.embed`` instead of
|
||||||
|
the local hash-derived path.
|
||||||
|
|
||||||
|
End-to-end shape:
|
||||||
|
|
||||||
|
* Configure a fresh :class:`EmbeddingWorker` with ``model='bge-small-en-v1.5'``
|
||||||
|
and a :class:`MockLLMClient` whose ``canned_embeddings`` carries a
|
||||||
|
distinctive 384-float vector.
|
||||||
|
* Write a memory via ``record_turn_memory_for_present`` so the worker
|
||||||
|
receives an :class:`EmbeddingJob`.
|
||||||
|
* Drain the worker (sentinel-based stop).
|
||||||
|
* Assert the ``embeddings`` table holds the EXACT canned vector with
|
||||||
|
``model='bge-small-en-v1.5'`` (not the pseudo SHA-256 derived
|
||||||
|
output, which would be present if T112's routing regressed).
|
||||||
|
* Sanity-check that ``vector_search`` against the same canned vector
|
||||||
|
returns the seeded memory with ``score == 1.0`` (cosine self-match).
|
||||||
|
|
||||||
|
Why no FastAPI lifespan: the live ``app.state.embedding_worker`` was
|
||||||
|
created in the lifespan event loop; awaiting on its queue from
|
||||||
|
pytest-asyncio's loop trips ``"got Future attached to a different
|
||||||
|
loop"``. Mirrors the pattern in
|
||||||
|
``tests/test_phase4_integration.py::test_vector_retrieval_feedback_loop``.
|
||||||
|
"""
|
||||||
|
from chat.services.embedding_worker import EmbeddingWorker
|
||||||
|
from chat.services.memory_write import record_turn_memory_for_present
|
||||||
|
from chat.services.vector_search import vector_search
|
||||||
|
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
apply_migrations(db)
|
||||||
|
_seed_minimal_chat(db)
|
||||||
|
|
||||||
|
# 384-float canned vector — distinctive linear ramp so a comparison
|
||||||
|
# against the pseudo-derived vector fails loudly if T112's routing
|
||||||
|
# regresses (the pseudo path is normalized so its values look nothing
|
||||||
|
# like a 0.000..0.383 ramp).
|
||||||
|
canned_vector = [i / 1000.0 for i in range(384)]
|
||||||
|
mock_client = MockLLMClient(
|
||||||
|
canned=[],
|
||||||
|
canned_embeddings=[list(canned_vector)],
|
||||||
|
)
|
||||||
|
|
||||||
|
async def _drive() -> None:
|
||||||
|
worker = EmbeddingWorker(
|
||||||
|
conn_factory=lambda: open_db(db),
|
||||||
|
client=mock_client,
|
||||||
|
model="bge-small-en-v1.5", # T112: non-default routes via embed()
|
||||||
|
dim=384,
|
||||||
|
)
|
||||||
|
await worker.start()
|
||||||
|
fake_app = SimpleNamespace(
|
||||||
|
state=SimpleNamespace(embedding_worker=worker)
|
||||||
|
)
|
||||||
|
with open_db(db) as conn:
|
||||||
|
record_turn_memory_for_present(
|
||||||
|
conn,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
host_bot_id="bot_a",
|
||||||
|
guest_bot_id=None,
|
||||||
|
narrative_text=(
|
||||||
|
"Maya watched the gondola lights drift across the lagoon."
|
||||||
|
),
|
||||||
|
app=fake_app,
|
||||||
|
)
|
||||||
|
await worker.stop()
|
||||||
|
|
||||||
|
asyncio.run(_drive())
|
||||||
|
|
||||||
|
with open_db(db) as conn:
|
||||||
|
emb_rows = conn.execute(
|
||||||
|
"SELECT memory_id, vector_json, model, dim FROM embeddings"
|
||||||
|
).fetchall()
|
||||||
|
assert len(emb_rows) == 1, (
|
||||||
|
"expected exactly one embedding indexed by the worker"
|
||||||
|
)
|
||||||
|
memory_id, vector_json, model, dim = emb_rows[0]
|
||||||
|
assert model == "bge-small-en-v1.5", (
|
||||||
|
f"expected non-default model tag, got {model!r}"
|
||||||
|
)
|
||||||
|
assert dim == 384
|
||||||
|
stored_vector = json.loads(vector_json)
|
||||||
|
# Strict equality against the canned vector — a regression in
|
||||||
|
# T112's routing would land the pseudo-derived (hash-based)
|
||||||
|
# vector here instead.
|
||||||
|
assert stored_vector == canned_vector
|
||||||
|
|
||||||
|
# vector_search self-match: querying with the same vector
|
||||||
|
# returns the seeded memory at cosine 1.0.
|
||||||
|
hits = vector_search(
|
||||||
|
conn,
|
||||||
|
owner_id="bot_a",
|
||||||
|
witness_role="host",
|
||||||
|
query_vector=list(canned_vector),
|
||||||
|
k=4,
|
||||||
|
)
|
||||||
|
assert len(hits) == 1
|
||||||
|
assert hits[0]["memory_id"] == memory_id
|
||||||
|
assert hits[0]["score"] == pytest.approx(1.0, abs=1e-9)
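The self-match assertion above relies on the cosine similarity of a vector with itself being 1.0 up to floating-point rounding. A minimal sketch of that property, assuming a plain-Python cosine; the helper name `cosine_similarity` is hypothetical and is not the project's `vector_search` implementation:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector is treated as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Same linear ramp the test uses as its canned embedding.
canned_vector = [i / 1000.0 for i in range(384)]
self_score = cosine_similarity(canned_vector, list(canned_vector))
assert math.isclose(self_score, 1.0, abs_tol=1e-9)
```

The tolerance mirrors the `pytest.approx(1.0, abs=1e-9)` check in the test; exact equality with `1.0` is not guaranteed once the norms go through `sqrt`.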
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# 2. Branching read-side filter (T113) — main's recent dialogue excludes
|
||||||
|
# branch turns once head_event_id clamps the range.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_branching_read_side_filter_hides_branch_turns_on_main(
|
||||||
|
app_state_setup, tmp_path
|
||||||
|
):
|
||||||
|
"""T113: switching the active branch changes what
|
||||||
|
:func:`read_recent_dialogue` sees.
|
||||||
|
|
||||||
|
Setup:
|
||||||
|
|
||||||
|
* Seed 5 turns on main. Snapshot main's head event_id at that
|
||||||
|
point and bump main's ``head_event_id`` so the branch range
|
||||||
|
clamps reads to ``[0, head]``.
|
||||||
|
* Branch from turn 5; switch to the experiment branch; play 3
|
||||||
|
turns on it.
|
||||||
|
* Switch back to main.
|
||||||
|
|
||||||
|
Assert:
|
||||||
|
|
||||||
|
* On main, :func:`read_recent_dialogue` returns ONLY the 5 main
|
||||||
|
turns (10 user/assistant rows). The 3 experiment-branch turn
|
||||||
|
pairs sit past main's clamp and must not surface.
|
||||||
|
* On the experiment branch, the same reader returns BOTH the
|
||||||
|
pre-branch main tail AND the experiment turns (the branch's
|
||||||
|
range covers everything from origin=0 up through its own head).
|
||||||
|
|
||||||
|
Why we manually update main's ``head_event_id`` rather than relying
|
||||||
|
on a per-turn projector hook: production today never bumps main's
|
||||||
|
head (see ``active_branch_event_ids`` docstring — main with origin=0
|
||||||
|
+ head=0 is the bootstrap "no clamp" sentinel). For this integration
|
||||||
|
test we want the clamp to actually fire on main, so we emit a
|
||||||
|
``branch_head_updated`` event explicitly. This mirrors what a
|
||||||
|
future "main head tracker" would do.
|
||||||
|
"""
|
||||||
|
from chat.services.branching import (
|
||||||
|
branch_from_event,
|
||||||
|
switch_active_branch,
|
||||||
|
)
|
||||||
|
from chat.services.turn_common import read_recent_dialogue
|
||||||
|
from chat.state.branches import active_branch
|
||||||
|
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
_seed_minimal_chat(db)
|
||||||
|
|
||||||
|
main_assistant_ids: list[int] = []
|
||||||
|
with open_db(db) as conn:
|
||||||
|
for i in range(1, 6):
|
||||||
|
user_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="user_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"prose": f"main turn {i}",
|
||||||
|
"segments": [],
|
||||||
|
},
|
||||||
|
)
|
||||||
|
asst_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="assistant_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"speaker_id": "bot_a",
|
||||||
|
"text": f"main reply {i}",
|
||||||
|
"truncated": False,
|
||||||
|
"user_turn_id": user_id,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
main_assistant_ids.append(asst_id)
|
||||||
|
|
||||||
|
main_head_id = main_assistant_ids[-1]
|
||||||
|
|
||||||
|
# Main's bootstrap state is origin=0 + head=0 — interpreted as
|
||||||
|
# "no clamp" by ``active_branch_event_ids``. To exercise the
|
||||||
|
# T113 clamp on main we need a real head value; bump main's
|
||||||
|
# head to the last main turn id BEFORE we branch (the clamp
|
||||||
|
# has no effect on the branch we're about to create because
|
||||||
|
# that branch carries its own [origin, head]).
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="branch_head_updated",
|
||||||
|
payload={"name": "main", "head_event_id": main_head_id},
|
||||||
|
)
|
||||||
|
|
||||||
|
# Fork point: turn 5's assistant_turn id.
|
||||||
|
branch_from_event(
|
||||||
|
conn,
|
||||||
|
name="experiment",
|
||||||
|
origin_event_id=main_head_id,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
)
|
||||||
|
switch_active_branch(conn, name="experiment")
|
||||||
|
|
||||||
|
# Play 3 turns on the experiment branch and bump its head so
|
||||||
|
# branch reads see them.
|
||||||
|
experiment_assistant_ids: list[int] = []
|
||||||
|
for i in range(1, 4):
|
||||||
|
user_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="user_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"prose": f"experiment turn {i}",
|
||||||
|
"segments": [],
|
||||||
|
},
|
||||||
|
)
|
||||||
|
asst_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="assistant_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"speaker_id": "bot_a",
|
||||||
|
"text": f"experiment reply {i}",
|
||||||
|
"truncated": False,
|
||||||
|
"user_turn_id": user_id,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
experiment_assistant_ids.append(asst_id)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="branch_head_updated",
|
||||||
|
payload={
|
||||||
|
"name": "experiment",
|
||||||
|
"head_event_id": experiment_assistant_ids[-1],
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
# Branch reader: covers origin..head, so it sees BOTH main's
|
||||||
|
# pre-fork tail and the experiment turns.
|
||||||
|
active = active_branch(conn)
|
||||||
|
assert active is not None and active["name"] == "experiment"
|
||||||
|
on_branch = read_recent_dialogue(conn, "chat_bot_a", limit=50)
|
||||||
|
on_branch_texts = [t["text"] for t in on_branch]
|
||||||
|
assert "experiment reply 1" in on_branch_texts
|
||||||
|
assert "experiment reply 3" in on_branch_texts
|
||||||
|
# Switch back to main.
|
||||||
|
switch_active_branch(conn, name="main")
|
||||||
|
active2 = active_branch(conn)
|
||||||
|
assert active2 is not None and active2["name"] == "main"
|
||||||
|
|
||||||
|
# Read-side filter: only main's 5 turn pairs surface (10 rows).
|
||||||
|
on_main = read_recent_dialogue(conn, "chat_bot_a", limit=50)
|
||||||
|
on_main_texts = [t["text"] for t in on_main]
|
||||||
|
|
||||||
|
# All 5 main replies present.
|
||||||
|
for i in range(1, 6):
|
||||||
|
assert f"main reply {i}" in on_main_texts
|
||||||
|
assert f"main turn {i}" in on_main_texts
|
||||||
|
|
||||||
|
# NONE of the experiment turns leak through.
|
||||||
|
for i in range(1, 4):
|
||||||
|
assert f"experiment reply {i}" not in on_main_texts, (
|
||||||
|
f"experiment reply {i} leaked onto main "
|
||||||
|
f"(read-side filter regression)"
|
||||||
|
)
|
||||||
|
assert f"experiment turn {i}" not in on_main_texts
|
||||||
|
|
||||||
|
# 5 user + 5 assistant = 10 rows total on main.
|
||||||
|
assert len(on_main) == 10
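The clamp behaviour this test exercises can be sketched in a few lines. This is a simplified model under stated assumptions, not the real `active_branch_event_ids`: the function name `visible_event_ids` is hypothetical, event ids are assumed to be monotonically increasing integers, and `origin == 0` with `head == 0` is taken to be the "no clamp" bootstrap sentinel described in the docstring.

```python
def visible_event_ids(event_ids: list[int], origin: int, head: int) -> list[int]:
    """Return the ids a branch with range [origin, head] may read.

    Assumption: origin == 0 and head == 0 together mean "no clamp".
    """
    if origin == 0 and head == 0:
        return list(event_ids)
    return [eid for eid in event_ids if origin <= eid <= head]


# Hypothetical ids: 10 rows for five main turn pairs, 6 for three branch pairs.
main_ids = list(range(1, 11))
experiment_ids = list(range(11, 17))
all_ids = main_ids + experiment_ids

on_main = visible_event_ids(all_ids, origin=0, head=main_ids[-1])
on_branch = visible_event_ids(all_ids, origin=0, head=experiment_ids[-1])
unclamped = visible_event_ids(all_ids, origin=0, head=0)

assert on_main == main_ids        # branch turns sit past main's head
assert on_branch == all_ids       # branch sees main's tail plus its own turns
assert unclamped == all_ids       # bootstrap sentinel: nothing is filtered
```

In this model, bumping main's head before branching is what turns the filter on for main; without it, main would stay in the sentinel state and see every row.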
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# 3. Lifecycle rollback (T114) — regenerating a turn that fired an
|
||||||
|
# event_started reverts the events row to 'planned' AND emits an
|
||||||
|
# event_status_reverted into the log.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_lifecycle_rollback_reverts_event_status_on_regenerate(
|
||||||
|
tmp_path, monkeypatch
|
||||||
|
):
|
||||||
|
"""T114: when the superseded turn fired ``event_started`` (with the
|
||||||
|
T114.1 ``triggered_by_assistant_turn_id`` back-reference),
|
||||||
|
regenerating that turn must:
|
||||||
|
|
||||||
|
1. Append an ``event_status_reverted`` event with ``prior_status='planned'``.
|
||||||
|
2. Project the events row's status back to ``planned``.
|
||||||
|
|
||||||
|
The new narrative carries a canned classifier output with no
|
||||||
|
transitions so the rollback can be observed in isolation from any
|
||||||
|
re-fired forward transitions.
|
||||||
|
|
||||||
|
Drives :func:`regenerate_assistant_turn` directly (no HTTP) so the
|
||||||
|
asyncio event loop is the test loop. Mirrors the unit-test
|
||||||
|
pattern in :mod:`tests.test_regenerate`.
|
||||||
|
"""
|
||||||
|
from chat.config import Settings
|
||||||
|
from chat.services.regenerate import regenerate_assistant_turn
|
||||||
|
|
||||||
|
cfg = tmp_path / "config.toml"
|
||||||
|
cfg.write_text('featherless_api_key = "test"\n')
|
||||||
|
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
monkeypatch.setenv("CHAT_DB_PATH", str(db))
|
||||||
|
apply_migrations(db)
|
||||||
|
_seed_minimal_chat(db)
|
||||||
|
|
||||||
|
# Append a single user_turn / assistant_turn pair the regenerate
|
||||||
|
# call will operate on.
|
||||||
|
with open_db(db) as conn:
|
||||||
|
user_turn_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="user_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"prose": "lights up",
|
||||||
|
"segments": [],
|
||||||
|
},
|
||||||
|
)
|
||||||
|
assistant_turn_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="assistant_turn",
|
||||||
|
payload={
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"speaker_id": "bot_a",
|
||||||
|
"text": "Maya nods.",
|
||||||
|
"truncated": False,
|
||||||
|
"user_turn_id": user_turn_id,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
# Seed a planned event, then transition it to active with the
|
||||||
|
# T114.1 back-reference pointing at the assistant_turn we'll
|
||||||
|
# regenerate.
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_planned",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_party",
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"kind": "story_event",
|
||||||
|
"props": {},
|
||||||
|
"planned_for": "2026-04-30T18:00:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_started",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_party",
|
||||||
|
"started_at": "2026-04-30T19:00:00+00:00",
|
||||||
|
"triggered_by_assistant_turn_id": assistant_turn_id,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
# Sanity: the events row is currently 'active'.
|
||||||
|
status_before = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?",
|
||||||
|
("evt_party",),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status_before == "active"
|
||||||
|
|
||||||
|
# Canned LLM output: narrative + 2 state-updates + lifecycle
|
||||||
|
# classifier (no transitions). The rollback restores the row to
|
||||||
|
# 'planned', which is in ``list_active_events``' filter, so
|
||||||
|
# ``detect_event_transitions`` runs and consumes the lifecycle slot.
|
||||||
|
state_canned = json.dumps(
|
||||||
|
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||||
|
)
|
||||||
|
no_transitions = json.dumps({"transitions": []})
|
||||||
|
mock_client = MockLLMClient(
|
||||||
|
canned=[
|
||||||
|
"Maya gestures.", # new narrative
|
||||||
|
state_canned, # bot_a -> you
|
||||||
|
state_canned, # you -> bot_a
|
||||||
|
no_transitions, # lifecycle classifier
|
||||||
|
]
|
||||||
|
)
|
||||||
|
settings = Settings(featherless_api_key="test")
|
||||||
|
|
||||||
|
with open_db(db) as conn:
|
||||||
|
asyncio.run(
|
||||||
|
regenerate_assistant_turn(
|
||||||
|
conn,
|
||||||
|
mock_client,
|
||||||
|
settings=settings,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
original_assistant_event_id=assistant_turn_id,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with open_db(db) as conn:
|
||||||
|
# 1. The event_status_reverted event lands with prior_status='planned'.
|
||||||
|
rev_rows = conn.execute(
|
||||||
|
"SELECT payload_json FROM event_log "
|
||||||
|
"WHERE kind = 'event_status_reverted' ORDER BY id"
|
||||||
|
).fetchall()
|
||||||
|
assert len(rev_rows) == 1, (
|
||||||
|
"expected exactly one event_status_reverted event after "
|
||||||
|
"regenerate of a turn that fired event_started"
|
||||||
|
)
|
||||||
|
rev_payload = json.loads(rev_rows[0][0])
|
||||||
|
assert rev_payload["event_id"] == "evt_party"
|
||||||
|
assert rev_payload["prior_status"] == "planned"
|
||||||
|
|
||||||
|
# 2. The events row is back to 'planned' (rolled back from 'active').
|
||||||
|
status_after = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?",
|
||||||
|
("evt_party",),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status_after == "planned"
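The two assertions above amount to folding the event log into a status value. A minimal sketch of such a fold, assuming only the three event kinds and payload keys used in this test; the function `project_status` is hypothetical and is not the project's projector:

```python
from __future__ import annotations


def project_status(log: list[tuple[str, dict]], event_id: str) -> str | None:
    """Replay (kind, payload) pairs and return the final status for event_id."""
    status = None
    for kind, payload in log:
        if payload.get("event_id") != event_id:
            continue
        if kind == "event_planned":
            status = "planned"
        elif kind == "event_started":
            status = "active"
        elif kind == "event_status_reverted":
            # The revert event carries the status to restore.
            status = payload["prior_status"]
    return status


log = [
    ("event_planned", {"event_id": "evt_party"}),
    ("event_started", {"event_id": "evt_party"}),
]
assert project_status(log, "evt_party") == "active"

log.append(
    ("event_status_reverted", {"event_id": "evt_party", "prior_status": "planned"})
)
assert project_status(log, "evt_party") == "planned"
```

Recording the rollback as its own event, rather than deleting `event_started`, keeps the log append-only while still letting a replay arrive at `planned`.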
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# 4. Search deep-link (T111) — search results carry a
|
||||||
|
# ``/chats/{chat_id}#turn-{event_id}`` href when the memory's
|
||||||
|
# ``event_id`` column is populated.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_deep_link_renders_turn_anchor(app_state_setup, tmp_path):
|
||||||
|
"""T111.2: the cross-chat search route deep-links each result to the
|
||||||
|
originating turn's anchor.
|
||||||
|
|
||||||
|
Cross-feature: T109 added ``memories.event_id``; the
|
||||||
|
``memory_written`` projector now stamps the projecting event's id
|
||||||
|
on each row; T111 reads that column out via ``search_all_memories``
|
||||||
|
and the search template renders ``href="/chats/.../#turn-..."``.
|
||||||
|
|
||||||
|
Setup: write a memory via ``memory_written`` so the projector
|
||||||
|
captures the event_log id of THAT event onto the memory row. Then
|
||||||
|
GET ``/search?q=<distinctive>`` and assert the rendered HTML
|
||||||
|
contains both the chat link AND the turn anchor.
|
||||||
|
"""
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
_seed_minimal_chat(db)
|
||||||
|
|
||||||
|
distinctive = "wisteriablossom"
|
||||||
|
with open_db(db) as conn:
|
||||||
|
memory_event_id = append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="memory_written",
|
||||||
|
payload={
|
||||||
|
"owner_id": "bot_a",
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"pov_summary": (
|
||||||
|
f"the {distinctive} bloomed by the gate"
|
||||||
|
),
|
||||||
|
"witness_you": 1,
|
||||||
|
"witness_host": 1,
|
||||||
|
"witness_guest": 0,
|
||||||
|
"source": "direct",
|
||||||
|
"reliability": 1.0,
|
||||||
|
"significance": 1,
|
||||||
|
"pinned": 0,
|
||||||
|
"auto_pinned": 0,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
# Sanity: the projector stamped the event_log id on the row.
|
||||||
|
stored_event_id = conn.execute(
|
||||||
|
"SELECT event_id FROM memories WHERE chat_id = ? "
|
||||||
|
"AND pov_summary LIKE ?",
|
||||||
|
("chat_bot_a", f"%{distinctive}%"),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert stored_event_id == memory_event_id, (
|
||||||
|
"memory row missing the T109 event_id back-reference"
|
||||||
|
)
|
||||||
|
|
||||||
|
response = app_state_setup.get(f"/search?q={distinctive}")
|
||||||
|
assert response.status_code == 200
|
||||||
|
body = response.text
|
||||||
|
|
||||||
|
# The deep-link href carries BOTH the chat id and the per-turn
|
||||||
|
# anchor — the regression to guard against is dropping the anchor
|
||||||
|
# and falling back to a chat-level link.
|
||||||
|
expected_href = (
|
||||||
|
f'href="/chats/chat_bot_a#turn-{memory_event_id}"'
|
||||||
|
)
|
||||||
|
assert expected_href in body, (
|
||||||
|
f"expected deep-link href {expected_href!r} in search response; "
|
||||||
|
f"body contained: {body!r}"
|
||||||
|
)
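Why the assertion matches on the full `href="/chats/{chat_id}#turn-` prefix rather than a bare chat path can be shown with plain string checks. The markup below is a hypothetical sample, not output captured from the real search template, and `has_turn_link` is an illustrative helper:

```python
def has_turn_link(body: str, chat_id: str) -> bool:
    """True when body holds a per-turn deep link for exactly this chat_id."""
    return f'href="/chats/{chat_id}#turn-' in body


# Hypothetical rendered rows: one deep link, one chat-level link.
deep_link_row = '<a class="search-result-link" href="/chats/chat_bot_a_2#turn-17">hit</a>'
chat_level_row = '<a class="search-result-link" href="/chats/chat_bot_a">hit</a>'

# A bare prefix is too lax: it also matches the longer chat id.
assert '"/chats/chat_bot_a' in deep_link_row

# The anchored prefix distinguishes chat_bot_a from chat_bot_a_2 ...
assert not has_turn_link(deep_link_row, "chat_bot_a")
assert has_turn_link(deep_link_row, "chat_bot_a_2")

# ... and fails when the template falls back to a chat-level link.
assert not has_turn_link(chat_level_row, "chat_bot_a")
```

Including the `#turn-` suffix in the needle is what catches the regression the test guards against, where the anchor is dropped but the chat link still renders.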
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# 5. Bulk significance re-rate (T110.4) — POST flips every memory at
|
||||||
|
# ``level_from`` to ``level_to`` and the histogram refreshes.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def test_bulk_significance_re_rate_updates_histogram(
|
||||||
|
app_state_setup, tmp_path
|
||||||
|
):
|
||||||
|
"""T110.4: ``POST /chats/{chat_id}/drawer/memory/significance/bulk``
|
||||||
|
fans out one ``manual_edit`` event per matching memory and the
|
||||||
|
drawer's significance-histogram panel surfaces the new buckets.
|
||||||
|
|
||||||
|
Setup: seed 5 memories at significance=0 in the same chat. Sanity-
|
||||||
|
check the baseline histogram (level 0 = 5, level 2 = 0).
|
||||||
|
|
||||||
|
Action: POST ``level_from=0, level_to=2``.
|
||||||
|
|
||||||
|
Assert:
|
||||||
|
|
||||||
|
* Response 200 (the route returns the refreshed drawer partial).
|
||||||
|
* 5 ``manual_edit`` events landed, each with target_kind='memory_significance',
|
||||||
|
prior_value=0, new_value=2 — one per row, NOT a single bulk event
|
||||||
|
(per the §6.4 audit-trail design).
|
||||||
|
* All 5 memories in the database now sit at significance=2.
|
||||||
|
* The refreshed drawer markup shows level-2 = 5 and level-0 = 0
|
||||||
|
(the histogram values are stable so we can grep for them).
|
||||||
|
"""
|
||||||
|
db = tmp_path / "test.db"
|
||||||
|
_seed_minimal_chat(db)
|
||||||
|
|
||||||
|
# Seed 5 memories at significance=0.
|
||||||
|
with open_db(db) as conn:
|
||||||
|
for idx in range(5):
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="memory_written",
|
||||||
|
payload={
|
||||||
|
"owner_id": "bot_a",
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"pov_summary": f"baseline memory {idx}",
|
||||||
|
"witness_you": 1,
|
||||||
|
"witness_host": 1,
|
||||||
|
"witness_guest": 0,
|
||||||
|
"source": "direct",
|
||||||
|
"reliability": 1.0,
|
||||||
|
"significance": 0, # all start at 0 for the bulk move.
|
||||||
|
"pinned": 0,
|
||||||
|
"auto_pinned": 0,
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
# Sanity: 5 rows at level 0 going in.
|
||||||
|
baseline = conn.execute(
|
||||||
|
"SELECT significance, COUNT(*) FROM memories "
|
||||||
|
"WHERE chat_id = ? GROUP BY significance",
|
||||||
|
("chat_bot_a",),
|
||||||
|
).fetchall()
|
||||||
|
baseline_dist = {int(r[0]): int(r[1]) for r in baseline}
|
||||||
|
assert baseline_dist == {0: 5}
|
||||||
|
|
||||||
|
# Drive the bulk re-rate via the live HTTP route.
|
||||||
|
response = app_state_setup.post(
|
||||||
|
"/chats/chat_bot_a/drawer/memory/significance/bulk",
|
||||||
|
data={"level_from": "0", "level_to": "2"},
|
||||||
|
)
|
||||||
|
assert response.status_code == 200
|
||||||
|
body = response.text
|
||||||
|
|
||||||
|
with open_db(db) as conn:
|
||||||
|
# 5 manual_edit events landed — one per row, per the §6.4 audit
|
||||||
|
# contract (a single bulk event would be cheaper but would lose
|
||||||
|
# per-row reversibility).
|
||||||
|
edit_rows = conn.execute(
|
||||||
|
"SELECT payload_json FROM event_log "
|
||||||
|
"WHERE kind = 'manual_edit' "
|
||||||
|
" AND json_extract(payload_json, '$.target_kind') = "
|
||||||
|
" 'memory_significance' "
|
||||||
|
"ORDER BY id"
|
||||||
|
).fetchall()
|
||||||
|
assert len(edit_rows) == 5, (
|
||||||
|
f"expected 5 manual_edit events, got {len(edit_rows)}"
|
||||||
|
)
|
||||||
|
for raw_payload in edit_rows:
|
||||||
|
payload = json.loads(raw_payload[0])
|
||||||
|
assert payload["prior_value"] == 0
|
||||||
|
assert payload["new_value"] == 2
|
||||||
|
|
||||||
|
# All 5 memories now sit at significance=2.
|
||||||
|
post_dist = {
|
||||||
|
int(r[0]): int(r[1])
|
||||||
|
for r in conn.execute(
|
||||||
|
"SELECT significance, COUNT(*) FROM memories "
|
||||||
|
"WHERE chat_id = ? GROUP BY significance",
|
||||||
|
("chat_bot_a",),
|
||||||
|
).fetchall()
|
||||||
|
}
|
||||||
|
assert post_dist == {2: 5}, (
|
||||||
|
f"expected all rows at level 2 after bulk re-rate, got {post_dist}"
|
||||||
|
)
|
||||||
|
|
||||||
|
    # The route returns the refreshed drawer partial, whose template
    # renders ``significance_distribution`` keys 0..3 unconditionally.
    # A bare ``5`` is too lax to grep for (it can match other numerics
    # on the page), so the per-bucket counts are verified against the
    # database above and the markup check here is limited to a
    # non-empty body.
    assert body, "drawer route returned empty body"
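The test above stops at checking that the body is non-empty. If the drawer partial exposed each bucket in a machine-readable way, the per-level counts could be asserted directly. The sketch below assumes a purely hypothetical markup shape (`data-level` attributes on list items) and a hypothetical helper `histogram_counts`; the real template may look nothing like this.

```python
import re


def histogram_counts(markup: str) -> dict[int, int]:
    """Extract {level: count} pairs from the assumed data-level markup."""
    pattern = re.compile(r'data-level="(\d+)"[^>]*>\s*(\d+)\s*<')
    return {int(level): int(count) for level, count in pattern.findall(markup)}


# Hypothetical drawer fragment after the bulk re-rate from level 0 to level 2.
sample = (
    '<ul class="significance-histogram">'
    '<li data-level="0">0</li>'
    '<li data-level="1">0</li>'
    '<li data-level="2">5</li>'
    '<li data-level="3">0</li>'
    "</ul>"
)

counts = histogram_counts(sample)
assert counts == {0: 0, 1: 0, 2: 5, 3: 0}
```

Keying the check on the level label avoids the false positives that grepping for a bare `5` would invite.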
|
||||||
@@ -867,12 +867,14 @@ def test_cross_chat_search_surfaces_memories_in_three_chats(
     assert response.status_code == 200
     body = response.text
 
-    # Each chat_id appears in a result link href, e.g.
-    # ``href="/chats/chat_bot_a"``. The template renders one
-    # ``<a class="search-result-link" href="/chats/{chat_id}">`` per
-    # row, so a substring match per chat is sufficient.
+    # Each chat_id appears in a result link href. T111.2 deep-links to
+    # the originating turn so the href is now
+    # ``href="/chats/{chat_id}#turn-{event_id}"``; we assert on the
+    # ``"/chats/{chat_id}#turn-`` prefix so the per-chat link is
+    # uniquely matched (a bare ``"/chats/chat_bot_a`` substring would
+    # also match ``chat_bot_a_2`` / ``chat_bot_a_3``).
     for chat_id in chat_ids:
-        assert f'href="/chats/{chat_id}"' in body, (
+        assert f'href="/chats/{chat_id}#turn-' in body, (
             f"chat {chat_id} missing from /search results: {body!r}"
         )
     # The owner display name (BotA) renders for each row — verify >= 3
@@ -888,4 +890,4 @@ def test_cross_chat_search_surfaces_memories_in_three_chats(
     # The "no matches" empty-state copy fires.
     assert "No matches" in distractor_body
     for chat_id in chat_ids:
-        assert f'href="/chats/{chat_id}"' not in distractor_body
+        assert f'href="/chats/{chat_id}#turn-' not in distractor_body
|
||||||
|
|||||||
+64
-8
@@ -21,7 +21,11 @@ import chat.state.world # noqa: F401
|
|||||||
import chat.state.events # noqa: F401
|
import chat.state.events # noqa: F401
|
||||||
import chat.state.threads # noqa: F401
|
import chat.state.threads # noqa: F401
|
||||||
from chat.llm.client import Message
|
from chat.llm.client import Message
|
||||||
from chat.services.prompt import _witness_role_for, assemble_narrative_prompt
|
from chat.services.prompt import (
|
||||||
|
_witness_role_for,
|
||||||
|
assemble_narrative_prompt,
|
||||||
|
trim_to_max_beats,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def _seed_basic(conn) -> None:
|
def _seed_basic(conn) -> None:
|
||||||
@@ -565,8 +569,12 @@ def test_tight_budget_drops_guest_activity_bullet_first(tmp_path):
|
|||||||
speaker_bot_id="bot_a",
|
speaker_bot_id="bot_a",
|
||||||
recent_dialogue=dialogue,
|
recent_dialogue=dialogue,
|
||||||
retrieved_memory_summaries=[],
|
retrieved_memory_summaries=[],
|
||||||
budget_soft=250,
|
# Closing instruction grew with the asterisk-format spec
|
||||||
budget_hard=340,
|
# (Phase 4.6 narrative-style fix). Budget bumped enough to
|
||||||
|
# accommodate the larger MUST floor while still exercising
|
||||||
|
# the SHOULD-tier trim path.
|
||||||
|
budget_soft=480,
|
||||||
|
budget_hard=510,
|
||||||
)
|
)
|
||||||
body = msgs[0].content
|
body = msgs[0].content
|
||||||
# Speaker bullet survives (MUST-tier floor).
|
# Speaker bullet survives (MUST-tier floor).
|
||||||
@@ -696,13 +704,15 @@ def test_nice_trim_order_documented(tmp_path):
|
|||||||
# Soft tuned so the all-NICE config (with the heavy previous
|
# Soft tuned so the all-NICE config (with the heavy previous
|
||||||
# scene summary) overflows, but dropping just previous-scene
|
# scene summary) overflows, but dropping just previous-scene
|
||||||
# fits comfortably. Hard set high so SHOULD-tier never trims.
|
# fits comfortably. Hard set high so SHOULD-tier never trims.
|
||||||
|
# Soft bumped (was 400) to make room for the larger closing
|
||||||
|
# instruction shipped with the asterisk-format spec.
|
||||||
msgs = assemble_narrative_prompt(
|
msgs = assemble_narrative_prompt(
|
||||||
conn,
|
conn,
|
||||||
chat_id="chat_bot_a",
|
chat_id="chat_bot_a",
|
||||||
speaker_bot_id="bot_a",
|
speaker_bot_id="bot_a",
|
||||||
recent_dialogue=dialogue,
|
recent_dialogue=dialogue,
|
||||||
retrieved_memory_summaries=memories,
|
retrieved_memory_summaries=memories,
|
||||||
budget_soft=400,
|
budget_soft=540,
|
||||||
budget_hard=8000,
|
budget_hard=8000,
|
||||||
)
|
)
|
||||||
body = msgs[0].content
|
body = msgs[0].content
|
||||||
@@ -748,8 +758,12 @@ def test_assemble_with_tight_budget_drops_guest_activity_first(tmp_path):
|
|||||||
# group node + other edges) push it well over 380. budget_hard
|
# group node + other edges) push it well over 380. budget_hard
|
||||||
# is set just above MUST core so SHOULD-tier blocks must be
|
# is set just above MUST core so SHOULD-tier blocks must be
|
||||||
# trimmed away.
|
# trimmed away.
|
||||||
budget_soft=250,
|
# Closing instruction grew with the asterisk-format spec
|
||||||
budget_hard=340,
|
# (Phase 4.6 narrative-style fix). Budget bumped enough to
|
||||||
|
# accommodate the larger MUST floor while still exercising
|
||||||
|
# the SHOULD-tier trim path.
|
||||||
|
budget_soft=480,
|
||||||
|
budget_hard=510,
|
||||||
)
|
)
|
||||||
body = msgs[0].content
|
body = msgs[0].content
|
||||||
# MUST: speaker identity, edge to addressee, last 4 dialogue turns.
|
# MUST: speaker identity, edge to addressee, last 4 dialogue turns.
|
||||||
@@ -759,10 +773,11 @@ def test_assemble_with_tight_budget_drops_guest_activity_first(tmp_path):
|
|||||||
assert f"line-{i:02d}" in body
|
assert f"line-{i:02d}" in body
|
||||||
# Guest activity (SHOULD-tier) must be dropped under tight budget.
|
# Guest activity (SHOULD-tier) must be dropped under tight budget.
|
||||||
assert "smirking-distinctively" not in body
|
assert "smirking-distinctively" not in body
|
||||||
# Token budget honoured.
|
# Token budget honoured. Bumped (was 340) for the larger closing
|
||||||
|
# instruction that ships the asterisk-format spec.
|
||||||
import tiktoken
|
import tiktoken
|
||||||
enc = tiktoken.get_encoding("cl100k_base")
|
enc = tiktoken.get_encoding("cl100k_base")
|
||||||
assert len(enc.encode(body)) <= 340
|
assert len(enc.encode(body)) <= 510
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
@@ -859,3 +874,44 @@ def test_witness_role_for_none_host_returns_host():
|
|||||||
# Sanity check: existing semantics preserved.
|
# Sanity check: existing semantics preserved.
|
||||||
assert _witness_role_for("bot_a", "bot_a") == "host"
|
assert _witness_role_for("bot_a", "bot_a") == "host"
|
||||||
assert _witness_role_for("bot_a", "bot_b") == "guest"
|
assert _witness_role_for("bot_a", "bot_b") == "guest"
|
||||||


# ---------------------------------------------------------------------------
# trim_to_max_beats — caps verbose narrative output to N beats
# ---------------------------------------------------------------------------


def test_trim_to_max_beats_passthrough_when_under_cap():
    assert trim_to_max_beats("", 3) == ""
    assert trim_to_max_beats("plain text", 3) == "plain text"
    two = "*She nods* okay. *She turns* see you."
    assert trim_to_max_beats(two, 3) == two


def test_trim_to_max_beats_passthrough_at_exactly_cap():
    three = "*A* one. *B* two. *C* three."
    assert trim_to_max_beats(three, 3) == three


def test_trim_to_max_beats_cuts_at_fourth_beat():
    """Cydonia-style 4-beat output trimmed at the start of the 4th
    asterisk action; trailing whitespace stripped."""
    four = "*A* one. *B* two. *C* three. *D* four."
    assert trim_to_max_beats(four, 3) == "*A* one. *B* two. *C* three."


def test_trim_to_max_beats_handles_runaway_six_beats():
    """The exact failure mode that motivated this — verbose narrator
    rambling for 6 beats when the prompt asked for 2-3."""
    six = "*A* 1 *B* 2 *C* 3 *D* 4 *E* 5 *F* 6"
    assert trim_to_max_beats(six, 3) == "*A* 1 *B* 2 *C* 3"


def test_trim_to_max_beats_respects_lower_cap():
    four = "*A* one. *B* two. *C* three. *D* four."
    assert trim_to_max_beats(four, 2) == "*A* one. *B* two."
    assert trim_to_max_beats(four, 1) == "*A* one."


def test_trim_to_max_beats_zero_returns_empty():
    assert trim_to_max_beats("*A* one. *B* two.", 0) == ""
|
|||||||
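The `trim_to_max_beats` tests above pin the function's observable behaviour but the diff does not show its body. A minimal sketch that is consistent with those assertions, assuming a beat starts at each `*...*` action span. The name `trim_to_max_beats_sketch` and the regex are illustrative assumptions, not the code shipped in `chat.services.prompt`:

```python
import re

# Assumption: a "beat" begins at each *asterisk-wrapped* action span.
_ACTION = re.compile(r"\*[^*]+\*")


def trim_to_max_beats_sketch(text: str, max_beats: int) -> str:
    """Keep at most ``max_beats`` beats, cutting where the next action starts."""
    max_beats = max(max_beats, 0)  # treat negative caps like zero
    starts = [m.start() for m in _ACTION.finditer(text)]
    if len(starts) <= max_beats:
        # Under or exactly at the cap (or no actions at all): pass through.
        return text
    # Cut at the first action beyond the cap and drop trailing whitespace.
    return text[: starts[max_beats]].rstrip()
```

Cutting at the start offset of the first surplus action, rather than splitting and re-joining, leaves the kept prefix byte-for-byte identical to the model output, which is what the passthrough tests rely on.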
@@ -1022,3 +1022,346 @@ def test_regenerate_registers_task_in_in_flight_tasks(tmp_path, monkeypatch):
|
|||||||
assert isinstance(in_flight_snapshot.get("task"), asyncio.Task)
|
assert isinstance(in_flight_snapshot.get("task"), asyncio.Task)
|
||||||
# Post-flight: the entry has been cleaned up.
|
# Post-flight: the entry has been cleaned up.
|
||||||
assert "chat_bot_a" not in _in_flight_tasks
|
assert "chat_bot_a" not in _in_flight_tasks
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# T114: lifecycle rollback. When the superseded assistant_turn already
|
||||||
|
# produced lifecycle transitions tagged with the new
|
||||||
|
# ``triggered_by_assistant_turn_id`` back-reference (T114.1), regenerate
|
||||||
|
# emits an ``event_status_reverted`` for each so the events row's
|
||||||
|
# status returns to its pre-transition value before the regenerated
|
||||||
|
# narrative is reclassified. Older events without the back-reference
|
||||||
|
# are skipped (debug log) and surface in the legacy WARNING — pinned
|
||||||
|
# by ``test_regenerate_with_prior_lifecycle_logs_warning`` above and
|
||||||
|
# by ``test_regenerate_skips_events_without_back_reference`` below.
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
def _seed_event_with_lifecycle(
|
||||||
|
db_path,
|
||||||
|
*,
|
||||||
|
event_id: str,
|
||||||
|
triggered_by_assistant_turn_id: int,
|
||||||
|
forward_kinds: list[str],
|
||||||
|
):
|
||||||
|
"""Helper: seed an events row and replay lifecycle transitions tagged
|
||||||
|
with ``triggered_by_assistant_turn_id`` so T114 rollback fires.
|
||||||
|
|
||||||
|
``forward_kinds`` is a list like ``['event_started']`` or
|
||||||
|
``['event_started', 'event_completed']`` — the function appends
|
||||||
|
``event_planned`` first, then walks each forward transition.
|
||||||
|
"""
|
||||||
|
from chat.eventlog.log import append_and_apply
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_planned",
|
||||||
|
payload={
|
||||||
|
"event_id": event_id,
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"kind": "story_event",
|
||||||
|
"props": {},
|
||||||
|
"planned_for": "2026-04-30T18:00:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
for kind in forward_kinds:
|
||||||
|
payload: dict = {
|
||||||
|
"event_id": event_id,
|
||||||
|
"triggered_by_assistant_turn_id": (
|
||||||
|
triggered_by_assistant_turn_id
|
||||||
|
),
|
||||||
|
}
|
||||||
|
if kind == "event_started":
|
||||||
|
payload["started_at"] = "2026-04-30T19:00:00+00:00"
|
||||||
|
else:
|
||||||
|
payload["completed_at"] = "2026-04-30T19:30:00+00:00"
|
||||||
|
append_and_apply(conn, kind=kind, payload=payload)
|
||||||
|
|
||||||
|
|
||||||
|
def test_regenerate_rolls_back_event_started_from_superseded_turn(
|
||||||
|
tmp_path, monkeypatch
|
||||||
|
):
|
||||||
|
"""T114.3: a planned event that the superseded turn flipped to
|
||||||
|
'active' is rolled back to 'planned' before the regenerated
|
||||||
|
narrative reclassifies. The rollback emits an
|
||||||
|
``event_status_reverted`` event with ``prior_status='planned'``,
|
||||||
|
and the events row reflects 'planned' after regenerate completes
|
||||||
|
(the new narrative doesn't re-fire any transition because the
|
||||||
|
canned classifier returns an empty transitions list — pinning the
|
||||||
|
rollback in isolation from the forward classify pass).
|
||||||
|
"""
|
||||||
|
import asyncio
|
||||||
|
|
||||||
|
from chat.config import Settings
|
||||||
|
from chat.db.migrate import apply_migrations
|
||||||
|
from chat.services.regenerate import regenerate_assistant_turn
|
||||||
|
|
||||||
|
db_path = tmp_path / "test.db"
|
||||||
|
cfg = tmp_path / "config.toml"
|
||||||
|
cfg.write_text('featherless_api_key = "test"\n')
|
||||||
|
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
|
||||||
|
monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
|
||||||
|
apply_migrations(db_path)
|
||||||
|
|
||||||
|
_ut_id, at_id = _seed_with_one_turn(db_path)
|
||||||
|
_seed_event_with_lifecycle(
|
||||||
|
db_path,
|
||||||
|
event_id="evt_started",
|
||||||
|
triggered_by_assistant_turn_id=at_id,
|
||||||
|
forward_kinds=["event_started"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# Sanity: events row is currently 'active'.
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
status = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?", ("evt_started",)
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status == "active"
|
||||||
|
|
||||||
|
# Canned: narrative + 2 state-updates + lifecycle classifier (no
|
||||||
|
# transitions). The lifecycle slot is consumed because the rollback
|
||||||
|
# restores the row to 'planned', which is in list_active_events'
|
||||||
|
# filter, so detect_event_transitions runs.
|
||||||
|
state_canned = json.dumps(
|
||||||
|
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||||
|
)
|
||||||
|
no_transitions = json.dumps({"transitions": []})
|
||||||
|
mock_client = MockLLMClient(
|
||||||
|
canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
|
||||||
|
)
|
||||||
|
settings = Settings(featherless_api_key="test")
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
asyncio.run(
|
||||||
|
regenerate_assistant_turn(
|
||||||
|
conn,
|
||||||
|
mock_client,
|
||||||
|
settings=settings,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
original_assistant_event_id=at_id,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
# An event_status_reverted lands with prior_status='planned'.
|
||||||
|
rev_rows = conn.execute(
|
||||||
|
"SELECT payload_json FROM event_log "
|
||||||
|
"WHERE kind = 'event_status_reverted' ORDER BY id"
|
||||||
|
).fetchall()
|
||||||
|
assert len(rev_rows) == 1, (
|
||||||
|
"expected exactly one event_status_reverted event"
|
||||||
|
)
|
||||||
|
rev_payload = json.loads(rev_rows[0][0])
|
||||||
|
assert rev_payload["event_id"] == "evt_started"
|
||||||
|
assert rev_payload["prior_status"] == "planned"
|
||||||
|
|
||||||
|
# Events projection: status is back to 'planned'.
|
||||||
|
status = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?",
|
||||||
|
("evt_started",),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status == "planned"
|
||||||
|
|
||||||
|
|
||||||
|
def test_regenerate_rolls_back_event_completed_to_active(tmp_path, monkeypatch):
|
||||||
|
"""T114.3: a completed event whose completion was triggered by the
|
||||||
|
superseded turn rolls back to 'active'. Mirrors the started→planned
|
||||||
|
case but exercises the 'completed → active' branch of
|
||||||
|
``_PRIOR_STATUS_MAP`` in regenerate.
|
||||||
|
"""
|
||||||
|
import asyncio
|
||||||
|
|
||||||
|
from chat.config import Settings
|
||||||
|
from chat.db.migrate import apply_migrations
|
||||||
|
from chat.services.regenerate import regenerate_assistant_turn
|
||||||
|
|
||||||
|
db_path = tmp_path / "test.db"
|
||||||
|
cfg = tmp_path / "config.toml"
|
||||||
|
cfg.write_text('featherless_api_key = "test"\n')
|
||||||
|
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
|
||||||
|
monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
|
||||||
|
apply_migrations(db_path)
|
||||||
|
|
||||||
|
_ut_id, at_id = _seed_with_one_turn(db_path)
|
||||||
|
# The forward sequence here pretends the prior turn ALSO authored
|
||||||
|
# the start (which is realistic — a single turn flow could go
|
||||||
|
# planned → active → completed across multiple events). Tagging
|
||||||
|
# both with the same back-reference exercises the multi-rollback
|
||||||
|
# loop (one per affected lifecycle row).
|
||||||
|
_seed_event_with_lifecycle(
|
||||||
|
db_path,
|
||||||
|
event_id="evt_completed",
|
||||||
|
triggered_by_assistant_turn_id=at_id,
|
||||||
|
forward_kinds=["event_started", "event_completed"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# Sanity: events row is 'completed'.
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
status = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?", ("evt_completed",)
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status == "completed"
|
||||||
|
|
||||||
|
state_canned = json.dumps(
|
||||||
|
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||||
|
)
|
||||||
|
no_transitions = json.dumps({"transitions": []})
|
||||||
|
mock_client = MockLLMClient(
|
||||||
|
canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
|
||||||
|
)
|
||||||
|
settings = Settings(featherless_api_key="test")
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
asyncio.run(
|
||||||
|
regenerate_assistant_turn(
|
||||||
|
conn,
|
||||||
|
mock_client,
|
||||||
|
settings=settings,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
original_assistant_event_id=at_id,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
# Two event_status_reverted rows land — one per forward
|
||||||
|
# transition that carried the back-reference. Both target the
|
||||||
|
# same event_id but with different prior_status values
|
||||||
|
# (in event_log id order: started→planned, completed→active).
|
||||||
|
rev_rows = conn.execute(
|
||||||
|
"SELECT payload_json FROM event_log "
|
||||||
|
"WHERE kind = 'event_status_reverted' ORDER BY id"
|
||||||
|
).fetchall()
|
||||||
|
assert len(rev_rows) == 2
|
||||||
|
rev_payloads = [json.loads(r[0]) for r in rev_rows]
|
||||||
|
assert rev_payloads[0] == {
|
||||||
|
"event_id": "evt_completed",
|
||||||
|
"prior_status": "planned",
|
||||||
|
}
|
||||||
|
assert rev_payloads[1] == {
|
||||||
|
"event_id": "evt_completed",
|
||||||
|
"prior_status": "active",
|
||||||
|
}
|
||||||
|
|
||||||
|
# Events projection: the LAST applied event_status_reverted
|
||||||
|
# wins (active). That's the desired final state for a turn
|
||||||
|
# that was originally a started+completed double-step.
|
||||||
|
status = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?",
|
||||||
|
("evt_completed",),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status == "active"
|
||||||
|
|
||||||
|
|
||||||
|
def test_regenerate_skips_events_without_back_reference(
|
||||||
|
tmp_path, monkeypatch, caplog
|
||||||
|
):
|
||||||
|
"""T114.3 backward compatibility: lifecycle events authored before
|
||||||
|
T114.1 lack the ``triggered_by_assistant_turn_id`` payload field.
|
||||||
|
Regenerate must NOT emit ``event_status_reverted`` for such rows —
|
||||||
|
they're skipped (with a DEBUG log). The legacy T83.4 WARNING about
|
||||||
|
un-rolled-back transitions still fires for visibility.
|
||||||
|
"""
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
|
||||||
|
from chat.config import Settings
|
||||||
|
from chat.db.migrate import apply_migrations
|
||||||
|
from chat.eventlog.log import append_and_apply
|
||||||
|
from chat.services.regenerate import regenerate_assistant_turn
|
||||||
|
|
||||||
|
db_path = tmp_path / "test.db"
|
||||||
|
cfg = tmp_path / "config.toml"
|
||||||
|
cfg.write_text('featherless_api_key = "test"\n')
|
||||||
|
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
|
||||||
|
monkeypatch.setenv("CHAT_DB_PATH", str(db_path))
|
||||||
|
apply_migrations(db_path)
|
||||||
|
|
||||||
|
_ut_id, at_id = _seed_with_one_turn(db_path)
|
||||||
|
|
||||||
|
# Seed a lifecycle transition WITHOUT the back-reference field —
|
||||||
|
# mimicking pre-T114.1 event_log rows.
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_planned",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_legacy",
|
||||||
|
"chat_id": "chat_bot_a",
|
||||||
|
"kind": "story_event",
|
||||||
|
"props": {},
|
||||||
|
"planned_for": "2026-04-30T18:00:00+00:00",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
append_and_apply(
|
||||||
|
conn,
|
||||||
|
kind="event_started",
|
||||||
|
payload={
|
||||||
|
"event_id": "evt_legacy",
|
||||||
|
"started_at": "2026-04-30T19:00:00+00:00",
|
||||||
|
# NOTE: no triggered_by_assistant_turn_id — pre-T114.1
|
||||||
|
# legacy row.
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
state_canned = json.dumps(
|
||||||
|
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||||
|
)
|
||||||
|
no_transitions = json.dumps({"transitions": []})
|
||||||
|
mock_client = MockLLMClient(
|
||||||
|
canned=["Refreshed reply.", state_canned, state_canned, no_transitions]
|
||||||
|
)
|
||||||
|
settings = Settings(featherless_api_key="test")
|
||||||
|
|
||||||
|
caplog.set_level(logging.DEBUG, logger="chat.services.regenerate")
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
asyncio.run(
|
||||||
|
regenerate_assistant_turn(
|
||||||
|
conn,
|
||||||
|
mock_client,
|
||||||
|
settings=settings,
|
||||||
|
chat_id="chat_bot_a",
|
||||||
|
original_assistant_event_id=at_id,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
with open_db(db_path) as conn:
|
||||||
|
# No event_status_reverted was emitted for the legacy row.
|
||||||
|
rev_count = conn.execute(
|
||||||
|
"SELECT COUNT(*) FROM event_log "
|
||||||
|
"WHERE kind = 'event_status_reverted'"
|
||||||
|
).fetchone()[0]
|
||||||
|
assert rev_count == 0
|
||||||
|
|
||||||
|
# Events row is still 'active' — the legacy transition stands.
|
||||||
|
status = conn.execute(
|
||||||
|
"SELECT status FROM events WHERE event_id = ?",
|
||||||
|
("evt_legacy",),
|
||||||
|
).fetchone()[0]
|
||||||
|
assert status == "active"
|
||||||
|
|
||||||
|
# Debug log surfaces the skipped row.
|
||||||
|
debugs = [
|
||||||
|
r.getMessage()
|
||||||
|
for r in caplog.records
|
||||||
|
if r.levelname == "DEBUG"
|
||||||
|
]
|
||||||
|
assert any(
|
||||||
|
"skipping rollback for lifecycle event_log" in m for m in debugs
|
||||||
|
), f"expected DEBUG about skipped legacy row; got: {debugs}"
|
||||||
|
|
||||||
|
# Legacy WARNING still fires so operators see un-rolled-back rows.
|
||||||
|
warnings = [
|
||||||
|
r.getMessage()
|
||||||
|
for r in caplog.records
|
||||||
|
if r.levelname == "WARNING"
|
||||||
|
and "lifecycle transition" in r.getMessage()
|
||||||
|
]
|
||||||
|
assert warnings, (
|
||||||
|
"expected WARNING about un-rolled-back legacy lifecycle "
|
||||||
|
f"transitions; got records: "
|
||||||
|
f"{[r.getMessage() for r in caplog.records]}"
|
||||||
|
)
|
||||||
|
# The new wording references the missing back-reference field.
|
||||||
|
assert "triggered_by_assistant_turn_id" in warnings[0]
|
||||||
|
|||||||
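The T114 tests above describe the rollback rule only through its effects: each forward transition tagged with the superseded turn's id yields one `event_status_reverted` payload, in `event_log` id order, and rows without the back-reference are skipped. A minimal sketch of that selection step, assuming in-memory dict rows. `PRIOR_STATUS` and `plan_reverts` are hypothetical names; the real `_PRIOR_STATUS_MAP` and its caller in `chat.services.regenerate` are not shown in this diff:

```python
from __future__ import annotations

from typing import Any

# Assumption: mirrors the two branches the tests name (started -> planned,
# completed -> active). The real map may contain more entries.
PRIOR_STATUS = {"event_started": "planned", "event_completed": "active"}


def plan_reverts(
    log_rows: list[dict[str, Any]], superseded_turn_id: int
) -> tuple[list[dict[str, str]], list[int]]:
    """Return (revert payloads in log-id order, ids of skipped legacy rows)."""
    reverts: list[dict[str, str]] = []
    skipped: list[int] = []
    for row in sorted(log_rows, key=lambda r: r["id"]):
        if row["kind"] not in PRIOR_STATUS:
            continue  # not a forward lifecycle transition
        turn_id = row["payload"].get("triggered_by_assistant_turn_id")
        if turn_id is None:
            # Legacy row without the back-reference: skip, report to caller.
            skipped.append(row["id"])
        elif turn_id == superseded_turn_id:
            reverts.append(
                {
                    "event_id": row["payload"]["event_id"],
                    "prior_status": PRIOR_STATUS[row["kind"]],
                }
            )
    return reverts, skipped
```

Applying the returned payloads in order reproduces the final state the second test asserts: for a started-then-completed pair the last revert sets the row to `active`.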
@@ -0,0 +1,122 @@
|
|||||||
"""Tests for RoutedLLMClient (Phase 4.5+).

Splits traffic across two underlying clients based on the ``model``
kwarg. We use simple stub clients to assert the router picks the
correct backend for each call.
"""

from __future__ import annotations

from typing import AsyncIterator, Sequence

import pytest

from chat.llm.client import Message
from chat.llm.router import RoutedLLMClient


class _StubClient:
    def __init__(self, name: str):
        self.name = name
        self.generate_calls: list[str] = []
        self.stream_calls: list[str] = []
        self.embed_calls: list[str] = []

    async def generate(self, messages, *, model, **params) -> str:
        self.generate_calls.append(model)
        return f"{self.name}:{model}"

    async def stream(self, messages, *, model, **params) -> AsyncIterator[str]:
        self.stream_calls.append(model)
        yield f"{self.name}:{model}"

    async def embed(self, text, *, model) -> list[float]:
        self.embed_calls.append(model)
        return [1.0, 2.0]


@pytest.mark.asyncio
async def test_router_generate_routes_remote_model_to_remote_backend():
    """Any model id NOT starting with a local prefix goes to the remote
    backend — narrative model, remote classifiers, anything else."""
    narrative = _StubClient("narrative")
    local = _StubClient("local")
    router = RoutedLLMClient(
        narrative=narrative,
        local=local,
        narrative_model="provider/big-model",
        local_prefixes=("mlx-community/",),
    )

    out = await router.generate(
        [Message(role="user", content="hi")], model="provider/big-model"
    )

    assert out == "narrative:provider/big-model"
    assert narrative.generate_calls == ["provider/big-model"]
    assert local.generate_calls == []


@pytest.mark.asyncio
async def test_router_generate_routes_local_prefix_to_local_backend():
    """Models prefixed with a local prefix (e.g. ``mlx-community/``)
    go to the local MLX backend regardless of whether the rest of the
    path looks like a remote provider id."""
    narrative = _StubClient("narrative")
    local = _StubClient("local")
    router = RoutedLLMClient(
        narrative=narrative,
        local=local,
        narrative_model="provider/big-model",
        local_prefixes=("mlx-community/",),
    )

    out = await router.generate(
        [Message(role="user", content="hi")],
        model="mlx-community/Hermes-3-Llama-3.1-8B-8bit",
    )

    assert out == "local:mlx-community/Hermes-3-Llama-3.1-8B-8bit"
    assert local.generate_calls == ["mlx-community/Hermes-3-Llama-3.1-8B-8bit"]
    assert narrative.generate_calls == []


@pytest.mark.asyncio
async def test_router_stream_dispatches_by_prefix():
    narrative = _StubClient("narrative")
    local = _StubClient("local")
    router = RoutedLLMClient(
        narrative=narrative,
        local=local,
        narrative_model="provider/big-model",
        local_prefixes=("mlx-community/",),
    )

    chunks_remote = [c async for c in router.stream(
        [Message(role="user", content="hi")], model="provider/big-model"
    )]
    chunks_local = [c async for c in router.stream(
        [Message(role="user", content="hi")],
        model="mlx-community/Hermes-3-Llama-3.1-8B-8bit",
    )]

    assert chunks_remote == ["narrative:provider/big-model"]
    assert chunks_local == ["local:mlx-community/Hermes-3-Llama-3.1-8B-8bit"]


@pytest.mark.asyncio
async def test_router_embed_always_routes_to_local():
    """Embeddings always run locally — the remote provider doesn't
    expose a working ``/v1/embeddings``, so the router never sends
    embed calls there even if the model name happens to look 'remote'."""
    narrative = _StubClient("narrative")
    local = _StubClient("local")
    router = RoutedLLMClient(
        narrative=narrative, local=local, narrative_model="big-model"
    )

    out = await router.embed("hello", model="any-embedding-model")

    assert out == [1.0, 2.0]
    assert local.embed_calls == ["any-embedding-model"]
    assert narrative.embed_calls == []
||||||
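The router tests above fix three rules: model ids with a local prefix go to the local backend, everything else goes to the remote one, and `embed` always goes local. A minimal sketch of a client that would satisfy them, offered as an illustration only. `PrefixRouterSketch` and `EchoBackend` are hypothetical names, and the default value for `local_prefixes` is an assumption (the tests show a default exists but not what it is):

```python
from __future__ import annotations

import asyncio
from typing import Any, AsyncIterator, Sequence


class PrefixRouterSketch:
    """Hypothetical stand-in for the routing rule the tests describe."""

    def __init__(
        self,
        *,
        narrative: Any,
        local: Any,
        narrative_model: str,
        local_prefixes: Sequence[str] = ("mlx-community/",),  # assumed default
    ) -> None:
        self.narrative = narrative
        self.local = local
        self.narrative_model = narrative_model  # kept for parity; unused here
        self.local_prefixes = tuple(local_prefixes)

    def _pick(self, model: str) -> Any:
        # str.startswith accepts a tuple and matches any of its prefixes.
        if model.startswith(self.local_prefixes):
            return self.local
        return self.narrative

    async def generate(self, messages: Any, *, model: str, **params: Any) -> str:
        return await self._pick(model).generate(messages, model=model, **params)

    async def stream(
        self, messages: Any, *, model: str, **params: Any
    ) -> AsyncIterator[str]:
        # Re-yield chunks so callers can iterate the router directly.
        async for chunk in self._pick(model).stream(messages, model=model, **params):
            yield chunk

    async def embed(self, text: str, *, model: str) -> list[float]:
        # Embeddings never leave the local backend, whatever the model id.
        return await self.local.embed(text, model=model)


class EchoBackend:
    """Tiny stub backend that labels replies with its own name."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def generate(self, messages: Any, *, model: str, **params: Any) -> str:
        return f"{self.name}:{model}"

    async def stream(
        self, messages: Any, *, model: str, **params: Any
    ) -> AsyncIterator[str]:
        yield f"{self.name}:{model}"

    async def embed(self, text: str, *, model: str) -> list[float]:
        return [1.0, 2.0]
```

Routing on a prefix rather than an exact model list means a newly downloaded local model needs no router change, at the cost of misrouting any remote id that happens to share the prefix.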
+70
-4
@@ -16,6 +16,7 @@ Verifies the FastAPI ``/search`` route that wraps T93's
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
from unittest.mock import patch
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
from fastapi.testclient import TestClient
|
from fastapi.testclient import TestClient
|
||||||
@@ -126,10 +127,75 @@ def test_empty_query_renders_placeholder_not_results(client, tmp_path):
|
|||||||
|
|
||||||
def test_result_links_navigate_to_chat(client, tmp_path):
|
def test_result_links_navigate_to_chat(client, tmp_path):
|
||||||
"""Each result links back to its originating chat so the user can
|
"""Each result links back to its originating chat so the user can
|
||||||
reopen the thread where the memory was first witnessed."""
|
reopen the thread where the memory was first witnessed.
|
||||||
|
|
||||||
|
Post-T111.2: the link now includes a turn anchor when the memory
|
||||||
|
row carries an ``event_id`` (T109's nullable column is populated for
|
||||||
|
rows projected after migration 0014 ran). We assert on the chat-id
|
||||||
|
portion of the href because the exact event id is autoincrement and
|
||||||
|
depends on seed order; the dedicated
|
||||||
|
``test_search_result_link_includes_turn_anchor`` test below pins the
|
||||||
|
anchor format itself."""
|
||||||
_seed_two_chats_with_memories(tmp_path / "test.db")
|
_seed_two_chats_with_memories(tmp_path / "test.db")
|
||||||
resp = client.get("/search?q=rabbit")
|
resp = client.get("/search?q=rabbit")
|
||||||
assert resp.status_code == 200
|
assert resp.status_code == 200
|
||||||
# The link target is chat-level (memories don't carry an event_id
|
assert 'href="/chats/chat_a' in resp.text
|
||||||
# column today, so we don't deep-link to a specific turn).
|
|
||||||
assert 'href="/chats/chat_a"' in resp.text
|
|
||||||
|
def test_search_results_include_fts_snippet_with_highlight(client, tmp_path):
|
||||||
|
"""T111.1: FTS snippet() wraps each match in ``<mark>...</mark>`` so
|
||||||
|
the result row visually highlights the term that matched.
|
||||||
|
|
||||||
|
The seeded ``pov_summary`` is ``the rabbit darted across chat_a``;
|
||||||
|
SQLite's ``snippet()`` returns the column text with each match token
|
||||||
|
wrapped — searching for ``rabbit`` yields a snippet containing
|
||||||
|
``<mark>rabbit</mark>``. Assertion is just that the marker appears
|
||||||
|
(the snippet may be truncated with an ellipsis when the indexed text
|
||||||
|
runs longer than the configured token window)."""
|
||||||
|
_seed_two_chats_with_memories(tmp_path / "test.db")
|
||||||
|
resp = client.get("/search?q=rabbit")
|
||||||
|
assert resp.status_code == 200
|
||||||
|
assert "<mark>rabbit</mark>" in resp.text
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_result_link_includes_turn_anchor(client, tmp_path):
|
||||||
|
"""T111.2: result links deep-link to the originating turn via the
|
||||||
|
chat-page anchor stamped by Phase 3.5 T86 (``id="turn-{event_id}"``).
|
||||||
|
|
||||||
|
The seeded ``memory_written`` events are projected with
|
||||||
|
``memories.event_id`` populated (T109); the route exposes that id and
|
||||||
|
the template builds the link as ``/chats/{chat_id}#turn-{event_id}``.
|
||||||
|
We don't assert a specific event id (it's an autoincrement that
|
||||||
|
depends on seed order), only that *some* turn anchor is present for
|
||||||
|
the chat link the user is about to click."""
|
||||||
|
_seed_two_chats_with_memories(tmp_path / "test.db")
|
||||||
|
resp = client.get("/search?q=rabbit")
|
||||||
|
assert resp.status_code == 200
|
||||||
|
assert "/chats/chat_a#turn-" in resp.text
|
||||||
|
|
||||||
|
|
||||||
|
def test_search_results_use_batched_lookups(client, tmp_path):
|
||||||
|
"""T106: hydration must not fan out to per-row ``get_bot``/
|
||||||
|
``get_chat``/``get_scene`` calls.
|
||||||
|
|
||||||
|
The previous implementation called each helper once per result row
|
||||||
|
(worst case 50 rows x 3 helpers = 150 individual queries). The
|
||||||
|
batched implementation collects distinct ids and issues at most one
|
||||||
|
query per entity kind via ``WHERE id IN (...)``, so the per-row
|
||||||
|
helpers should not be invoked at all when there are matches.
|
||||||
|
|
||||||
|
We seed two chats (so both ``get_bot`` and ``get_chat`` would have
|
||||||
|
been hit pre-T106) and assert each helper sees zero per-row calls.
|
||||||
|
"""
|
||||||
|
_seed_two_chats_with_memories(tmp_path / "test.db")
|
||||||
|
with (
|
||||||
|
patch("chat.web.search.get_bot") as mock_get_bot,
|
||||||
|
patch("chat.web.search.get_chat") as mock_get_chat,
|
||||||
|
patch("chat.web.search.get_scene") as mock_get_scene,
|
||||||
|
):
|
||||||
|
resp = client.get("/search?q=rabbit")
|
||||||
|
assert resp.status_code == 200
|
||||||
|
# Batched IN-list queries replace the per-row helpers entirely.
|
||||||
|
assert mock_get_bot.call_count == 0
|
||||||
|
assert mock_get_chat.call_count == 0
|
||||||
|
assert mock_get_scene.call_count == 0
|
||||||
|
|||||||
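The T106 test above asserts that search hydration issues one `WHERE id IN (...)` query per entity kind instead of one helper call per result row. A minimal sketch of that pattern with the standard-library `sqlite3` module. The function name, the `id`/`name` columns, and the table allowlist are illustrative assumptions, not the schema or code in `chat.web.search`:

```python
import sqlite3

# Assumption: table and column names are illustrative only.
_ALLOWED_TABLES = {"bots", "chats", "scenes"}


def fetch_names_by_id(conn: sqlite3.Connection, table: str, ids) -> dict:
    """Resolve many ids with a single IN-list query; returns {id: name}."""
    if table not in _ALLOWED_TABLES:
        # The table name is interpolated into SQL, so it must be a fixed
        # known identifier and never user input.
        raise ValueError(f"unexpected table: {table!r}")
    distinct = sorted(set(ids))  # de-duplicate so each id is bound once
    if not distinct:
        return {}  # "IN ()" is not valid SQL; skip the query entirely
    placeholders = ", ".join("?" for _ in distinct)
    rows = conn.execute(
        f"SELECT id, name FROM {table} WHERE id IN ({placeholders})",
        distinct,
    ).fetchall()
    return {row[0]: row[1] for row in rows}
```

SQLite limits the number of bound parameters per statement, so a result page much larger than the 50-row worst case mentioned in the test would need the id list split into chunks.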
@@ -156,6 +156,28 @@ def test_restore_snapshot_wrong_confirm_400(client, tmp_path):
|
|||||||
assert response.status_code == 400
|
assert response.status_code == 400
|
||||||


def test_restore_without_kind_returns_400(client, tmp_path):
    """T105: Missing or empty ``kind`` must be rejected with 400.

    Previously ``kind`` defaulted to ``"periodic"``, which silently 404'd
    when the caller meant a rewind snapshot. Tighten the contract so the
    client must always pass an explicit, valid ``kind``.
    """
    db_path = tmp_path / "test.db"
    _seed_bot(db_path, "bot_a", "BotA")
    snapshot_path = _take_snapshot_via_service(
        db_path, tmp_path, kind="periodic"
    )
    snapshot_id = snapshot_path.stem

    response = client.post(
        f"/snapshots/restore/{snapshot_id}",
        data={"confirm_id": snapshot_id},  # no `kind`
        follow_redirects=False,
    )
    assert response.status_code == 400


def test_preview_renders_metadata(client, tmp_path):
|
def test_preview_renders_metadata(client, tmp_path):
|
||||||
db_path = tmp_path / "test.db"
|
db_path = tmp_path / "test.db"
|
||||||
_seed_bot(db_path, "bot_a", "BotA")
|
_seed_bot(db_path, "bot_a", "BotA")
|
||||||
|
|||||||
+86
-32
@@ -22,6 +22,7 @@ from chat.db.connection import open_db
|
|||||||
from chat.eventlog.log import append_and_apply, append_event
|
from chat.eventlog.log import append_and_apply, append_event
|
||||||
from chat.eventlog.projector import project
|
from chat.eventlog.projector import project
|
||||||
from chat.llm.mock import MockLLMClient
|
from chat.llm.mock import MockLLMClient
|
||||||
|
from tests.fixtures import CannedQueue
|
||||||
|
|
||||||
|
|
||||||
@pytest.fixture
|
@pytest.fixture
|
||||||
@@ -362,14 +363,20 @@ def test_single_bot_turn_no_guest_regression(app_state_setup, tmp_path):
|
|||||||
the chat has no guest, so ``detect_interjection`` is NOT invoked.
|
the chat has no guest, so ``detect_interjection`` is NOT invoked.
|
||||||
Ends with one user_turn, one assistant_turn, two edge_updates, and a
|
Ends with one user_turn, one assistant_turn, two edge_updates, and a
|
||||||
single ``memory_written``.
|
single ``memory_written``.
|
||||||
|
|
||||||
|
T116: migrated to :class:`tests.fixtures.CannedQueue` as a proof of
|
||||||
|
concept for the structured canned-queue builder.
|
||||||
"""
|
"""
|
||||||
_seed(tmp_path / "test.db")
|
_seed(tmp_path / "test.db")
|
||||||
canned_parse = json.dumps(
|
canned = (
|
||||||
{"segments": [{"kind": "dialogue", "text": "hello"}]}
|
CannedQueue()
|
||||||
)
|
.parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
|
||||||
mock = _override_llm(
|
.narrative("Hi there.")
|
||||||
[canned_parse, "Hi there.", _zero_state(), _zero_state()]
|
.state_update()
|
||||||
|
.state_update()
|
||||||
|
.build()
|
||||||
)
|
)
|
||||||
|
mock = _override_llm(canned)
|
||||||
try:
|
try:
|
||||||
response = app_state_setup.post(
|
response = app_state_setup.post(
|
||||||
"/chats/chat_bot_a/turns", data={"prose": "hello"}
|
"/chats/chat_bot_a/turns", data={"prose": "hello"}
|
||||||
@@ -734,6 +741,19 @@ def test_cancelled_turn_still_closes_scene_when_user_prose_signals_close(
|
|||||||
that as an exception, so we drive the request inside ``with
|
that as an exception, so we drive the request inside ``with
|
||||||
pytest.raises``. Despite the exception, the scene_closed event
|
pytest.raises``. Despite the exception, the scene_closed event
|
||||||
must land in the event_log.
|
must land in the event_log.
|
||||||
|
|
||||||
|
T108 NOTE — this test does NOT actually exercise the cancel path.
|
||||||
|
``_CancelOnStreamMock.stream`` writes ``raise asyncio.CancelledError``
|
||||||
|
but ``asyncio`` is not imported at module scope, so the first
|
||||||
|
iteration raises ``NameError`` (caught by ``except Exception:`` in
|
||||||
|
post_turn, which sets ``primary_truncated=True`` but leaves
|
||||||
|
``cancelled=False``). The function therefore returns 204 normally,
|
||||||
|
the dependency-managed connection commits, and ``scene_closed``
|
||||||
|
lands. Importing asyncio so the real CancelledError fires reveals
|
||||||
|
a transactional bug: ``post_turn``'s end-of-function re-raise
|
||||||
|
causes ``open_db``'s dependency teardown to skip ``conn.commit()``,
|
||||||
|
rolling back ALL post-cancel writes (user_turn, assistant_turn,
|
||||||
|
edge_updates, scene_closed). Deferred for triage — see T108 report.
|
||||||
"""
|
"""
|
||||||
from typing import AsyncIterator, Sequence
|
from typing import AsyncIterator, Sequence
|
||||||
|
|
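Review note on the T108 docstring: the two failure modes it describes differ because of the exception hierarchy. `NameError` derives from `Exception`, while `asyncio.CancelledError` has derived from `BaseException` since Python 3.8, so a broad `except Exception:` swallows the first and lets the second propagate. The sketch below is independent of `post_turn`. The helper names and the undefined `asyncio_never_imported` name are hypothetical and exist only for illustration.

```python
import asyncio


def which_handler(raiser):
    """Call ``raiser`` and report which except clause saw its exception."""
    try:
        raiser()
    except Exception as exc:
        # Same shape as the broad handler the docstring describes.
        return f"except Exception caught {type(exc).__name__}"
    except BaseException as exc:
        return f"escaped except Exception as {type(exc).__name__}"
    return "no exception"


def raise_via_missing_import():
    # ``asyncio_never_imported`` is deliberately undefined: the name lookup
    # fails with NameError before any CancelledError is ever created.
    raise asyncio_never_imported.CancelledError  # noqa: F821


def raise_real_cancel():
    # With the module imported, the real cancellation exception is raised.
    raise asyncio.CancelledError


print(which_handler(raise_via_missing_import))
print(which_handler(raise_real_cancel))
```

On Python 3.8 or later the first call is absorbed by `except Exception` and the second escapes it, which matches the docstring's account of why the test only reaches the real cancel path once `asyncio` is imported.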
||||||
@@ -828,12 +848,33 @@ def test_cancelled_turn_still_closes_scene_when_user_prose_signals_close(
|
|||||||
"SELECT payload_json FROM event_log "
|
"SELECT payload_json FROM event_log "
|
||||||
"WHERE kind = 'assistant_turn' ORDER BY id"
|
"WHERE kind = 'assistant_turn' ORDER BY id"
|
||||||
).fetchall()
|
).fetchall()
|
||||||
|
# T108: pin the ordering — user_turn must commit before
|
||||||
|
# scene_closed (close detection runs on prose that is already
|
||||||
|
# in the event_log) and any assistant_turn the cancel produced
|
||||||
|
# must land between the two (truncated record written before the close).
|
||||||
|
ordered = conn.execute(
|
||||||
|
"SELECT id, kind FROM event_log "
|
||||||
|
"WHERE kind IN ('user_turn', 'scene_closed', 'assistant_turn') "
|
||||||
|
"ORDER BY id"
|
||||||
|
).fetchall()
|
||||||
|
|
||||||
# Scene close lands despite the cancel.
|
# Scene close lands despite the cancel.
|
||||||
assert scene_close_count == 1
|
assert scene_close_count == 1
|
||||||
# The cancelled assistant_turn was still recorded (truncated=True).
|
# The cancelled assistant_turn was still recorded (truncated=True).
|
||||||
assert len(assistant_payload) == 1
|
assert len(assistant_payload) == 1
|
||||||
assert json.loads(assistant_payload[0][0])["truncated"] is True
|
assert json.loads(assistant_payload[0][0])["truncated"] is True
|
||||||
|
# T108 ordering pin: user_turn lands first, the truncated
|
||||||
|
# assistant_turn (if any) is committed BEFORE the scene_close
|
||||||
|
# decision fires, and scene_closed lands last. Close detection
|
||||||
|
# relies on user prose being committed to the event_log BEFORE
|
||||||
|
# the close decision runs — and the cancelled assistant beat is
|
||||||
|
# recorded as a partial before close-detection too.
|
||||||
|
kinds_in_order = [row[1] for row in ordered]
|
||||||
|
user_idx = kinds_in_order.index("user_turn")
|
||||||
|
close_idx = kinds_in_order.index("scene_closed")
|
||||||
|
assert user_idx < close_idx
|
||||||
|
if "assistant_turn" in kinds_in_order:
|
||||||
|
assert user_idx < kinds_in_order.index("assistant_turn") < close_idx
|
||||||
|
|
||||||
|
|
||||||
def test_interjection_enqueues_significance_job(app_state_setup, tmp_path):
|
def test_interjection_enqueues_significance_job(app_state_setup, tmp_path):
|
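Review note: the T108 ordering pin can be tried in isolation against an in-memory SQLite database. This is a minimal sketch. The `event_log` schema here is an assumption reduced to the columns the queries in this diff touch, and `kinds_in_commit_order` and `ordering_holds` are hypothetical helper names, not functions from the repository.

```python
import sqlite3


def kinds_in_commit_order(conn):
    """Return the kinds of the three pinned event types, ordered by id."""
    rows = conn.execute(
        "SELECT id, kind FROM event_log "
        "WHERE kind IN ('user_turn', 'scene_closed', 'assistant_turn') "
        "ORDER BY id"
    ).fetchall()
    return [row[1] for row in rows]


def ordering_holds(kinds):
    """user_turn first, scene_closed last, assistant_turn (if any) between."""
    user_idx = kinds.index("user_turn")
    close_idx = kinds.index("scene_closed")
    if "assistant_turn" in kinds:
        return user_idx < kinds.index("assistant_turn") < close_idx
    return user_idx < close_idx


conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real table likely has more columns.
conn.execute(
    "CREATE TABLE event_log ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "kind TEXT NOT NULL, "
    "payload_json TEXT NOT NULL)"
)
for kind in ("user_turn", "edge_update", "assistant_turn", "edge_update", "scene_closed"):
    conn.execute(
        "INSERT INTO event_log (kind, payload_json) VALUES (?, '{}')", (kind,)
    )

print(kinds_in_commit_order(conn))
print(ordering_holds(kinds_in_commit_order(conn)))
```

Relying on the autoincrementing `id` as the commit order mirrors the `ORDER BY id` queries in the test. The check passes with or without an `assistant_turn` row, as the `if "assistant_turn" in kinds_in_order` guard in the diff does.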
||||||
@@ -945,29 +986,25 @@ def test_turn_with_event_transition_appends_started_event(
|
|||||||
},
|
},
|
||||||
)
|
)
|
||||||
|
|
||||||
canned_parse = json.dumps(
|
# T116: migrated to :class:`tests.fixtures.CannedQueue`.
|
||||||
{"segments": [{"kind": "dialogue", "text": "they arrived"}]}
|
canned = (
|
||||||
)
|
CannedQueue()
|
||||||
canned_event_decision = json.dumps(
|
.parse_turn(segments=[{"kind": "dialogue", "text": "they arrived"}])
|
||||||
{
|
.narrative("They walk in.")
|
||||||
"transitions": [
|
.state_update()
|
||||||
{
|
.state_update()
|
||||||
"event_id": "evt_1",
|
.detect_event_transitions(
|
||||||
"new_status": "active",
|
[
|
||||||
"reason": "they arrived",
|
{
|
||||||
}
|
"event_id": "evt_1",
|
||||||
]
|
"new_status": "active",
|
||||||
}
|
"reason": "they arrived",
|
||||||
)
|
}
|
||||||
mock = _override_llm(
|
]
|
||||||
[
|
)
|
||||||
canned_parse,
|
.build()
|
||||||
"They walk in.",
|
|
||||||
_zero_state(),
|
|
||||||
_zero_state(),
|
|
||||||
canned_event_decision,
|
|
||||||
]
|
|
||||||
)
|
)
|
||||||
|
mock = _override_llm(canned)
|
||||||
try:
|
try:
|
||||||
response = app_state_setup.post(
|
response = app_state_setup.post(
|
||||||
"/chats/chat_bot_a/turns", data={"prose": "they arrived"}
|
"/chats/chat_bot_a/turns", data={"prose": "they arrived"}
|
||||||
@@ -989,6 +1026,18 @@ def test_turn_with_event_transition_appends_started_event(
|
|||||||
assert started_payload["event_id"] == "evt_1"
|
assert started_payload["event_id"] == "evt_1"
|
||||||
assert started_payload["started_at"] == "2026-04-26T20:00:00+00:00"
|
assert started_payload["started_at"] == "2026-04-26T20:00:00+00:00"
|
||||||
|
|
||||||
|
# T114.1: payload carries the back-reference to the assistant_turn
|
||||||
|
# that triggered the transition. The assistant_turn lands in
|
||||||
|
# event_log immediately before the event_started, so its id is
|
||||||
|
# the largest assistant_turn id in the chat at this point.
|
||||||
|
at_id = conn.execute(
|
||||||
|
"SELECT id FROM event_log "
|
||||||
|
"WHERE kind = 'assistant_turn' "
|
||||||
|
" AND json_extract(payload_json, '$.chat_id') = 'chat_bot_a' "
|
||||||
|
"ORDER BY id DESC LIMIT 1"
|
||||||
|
).fetchone()[0]
|
||||||
|
assert started_payload["triggered_by_assistant_turn_id"] == at_id
|
||||||
|
|
||||||
# The events projection row reflects the active status.
|
# The events projection row reflects the active status.
|
||||||
ev_row = conn.execute(
|
ev_row = conn.execute(
|
||||||
"SELECT status, started_at FROM events WHERE event_id = ?",
|
"SELECT status, started_at FROM events WHERE event_id = ?",
|
||||||
@@ -1109,18 +1158,23 @@ def test_turn_with_no_active_events_skips_classifier(app_state_setup, tmp_path):
|
|||||||
short-circuits without an LLM call (per T52). The canned queue must
|
short-circuits without an LLM call (per T52). The canned queue must
|
||||||
therefore have ZERO event-detection slots — same shape as the
|
therefore have ZERO event-detection slots — same shape as the
|
||||||
Phase 2 no-guest baseline.
|
Phase 2 no-guest baseline.
|
||||||
|
|
||||||
|
T116: migrated to :class:`tests.fixtures.CannedQueue`.
|
||||||
"""
|
"""
|
||||||
_seed(tmp_path / "test.db")
|
_seed(tmp_path / "test.db")
|
||||||
|
|
||||||
canned_parse = json.dumps(
|
|
||||||
{"segments": [{"kind": "dialogue", "text": "hello"}]}
|
|
||||||
)
|
|
||||||
# Only 4 slots: parse + narrative + 2 state-updates. NO extra slot for
|
# Only 4 slots: parse + narrative + 2 state-updates. NO extra slot for
|
||||||
# event-detection — non-existent active_events causes the helper to
|
# event-detection — non-existent active_events causes the helper to
|
||||||
# short-circuit before pulling from the queue.
|
# short-circuit before pulling from the queue.
|
||||||
mock = _override_llm(
|
canned = (
|
||||||
[canned_parse, "Hi there.", _zero_state(), _zero_state()]
|
CannedQueue()
|
||||||
|
.parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
|
||||||
|
.narrative("Hi there.")
|
||||||
|
.state_update()
|
||||||
|
.state_update()
|
||||||
|
.build()
|
||||||
)
|
)
|
||||||
|
mock = _override_llm(canned)
|
||||||
try:
|
try:
|
||||||
response = app_state_setup.post(
|
response = app_state_setup.post(
|
||||||
"/chats/chat_bot_a/turns", data={"prose": "hello"}
|
"/chats/chat_bot_a/turns", data={"prose": "hello"}
|
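Review note: the T116 migrations in this file replace hand-ordered lists of canned strings with a fluent builder. The real `tests.fixtures.CannedQueue` is not shown in this diff, so the class below is only a sketch of how such a builder could work. The name `CannedQueueSketch`, the list return type of `build()`, and the empty-object state payload are all assumptions (the real zero-state comes from `_zero_state()`, whose contents are not visible here).

```python
import json


class CannedQueueSketch:
    """Hypothetical stand-in for tests.fixtures.CannedQueue."""

    def __init__(self):
        self._slots = []

    def parse_turn(self, segments):
        # Mirrors the old ``json.dumps({"segments": [...]})`` slot.
        self._slots.append(json.dumps({"segments": segments}))
        return self

    def narrative(self, text):
        # Narrative slots were plain strings in the pre-migration lists.
        self._slots.append(text)
        return self

    def state_update(self, payload=None):
        # Placeholder payload: the real zero-state shape is an assumption.
        self._slots.append(json.dumps(payload if payload is not None else {}))
        return self

    def detect_event_transitions(self, transitions):
        # Mirrors the old ``json.dumps({"transitions": [...]})`` slot.
        self._slots.append(json.dumps({"transitions": transitions}))
        return self

    def build(self):
        # Return a copy so later builder calls cannot mutate a built queue.
        return list(self._slots)


canned = (
    CannedQueueSketch()
    .parse_turn(segments=[{"kind": "dialogue", "text": "hello"}])
    .narrative("Hi there.")
    .state_update()
    .state_update()
    .build()
)
print(len(canned))
```

Naming each slot makes it harder to put responses in the wrong order or to add a slot the flow never consumes, which is the failure mode the "only 4 slots" comment in `test_turn_with_no_active_events_skips_classifier` guards against.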
||||||
|
|||||||
@@ -73,11 +73,25 @@ async def test_parse_turn_empty_prose_short_circuits_without_classifier_call():
|
|||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
async def test_parse_turn_raises_when_classifier_fails_twice():
|
async def test_parse_turn_falls_back_to_whole_prose_when_classifier_fails():
|
||||||
|
"""A flapping classifier (3 invalid responses) no longer 500s the
|
||||||
|
request. ``parse_turn`` returns the original prose as a single
|
||||||
|
``dialogue`` segment so the turn flow can keep moving — the
|
||||||
|
narrative will still fire on the prose, just without finer-grained
|
||||||
|
segment classification.
|
||||||
|
|
||||||
|
The old contract was ``RuntimeError`` (no default), but in
|
||||||
|
production that took down the whole turn endpoint with a 500 the
|
||||||
|
moment any classifier provider hiccuped — particularly painful in
|
||||||
|
multi-bot scenes where every user turn pays the parse_turn cost.
|
||||||
|
"""
|
||||||
mock = MockLLMClient(canned=["nope", "still nope", "nope3"])
|
mock = MockLLMClient(canned=["nope", "still nope", "nope3"])
|
||||||
with pytest.raises(RuntimeError):
|
result = await parse_turn(
|
||||||
await parse_turn(
|
mock,
|
||||||
mock,
|
model="m",
|
||||||
model="m",
|
prose='*shrugs* "whatever"',
|
||||||
prose='*shrugs* "whatever"',
|
)
|
||||||
)
|
assert len(result.segments) == 1
|
||||||
|
assert result.segments[0].kind == "dialogue"
|
||||||
|
assert result.segments[0].text == '*shrugs* "whatever"'
|
||||||
|
assert result.intent == "narrative"
|
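Review note: the new contract pinned by `test_parse_turn_falls_back_to_whole_prose_when_classifier_fails` is "retry, then degrade to one `dialogue` segment". The real `parse_turn` is async and takes an LLM client, neither of which appears in this diff. The synchronous sketch below only illustrates the fallback shape. `Segment`, `ParsedTurn`, `parse_with_fallback`, and the `max_attempts=3` default are hypothetical names and values chosen to match the three canned responses in the test.

```python
import json
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str
    text: str


@dataclass
class ParsedTurn:
    segments: list
    intent: str = "narrative"


def parse_with_fallback(responses, prose, max_attempts=3):
    """Try each raw classifier response; fall back to the whole prose."""
    for raw in responses[:max_attempts]:
        try:
            data = json.loads(raw)
            segments = [Segment(s["kind"], s["text"]) for s in data["segments"]]
        except (ValueError, KeyError, TypeError):
            # Invalid JSON or wrong shape: move on to the next attempt.
            continue
        return ParsedTurn(segments=segments, intent=data.get("intent", "narrative"))
    # Every attempt failed: keep the turn moving with one dialogue segment.
    return ParsedTurn(segments=[Segment("dialogue", prose)])


result = parse_with_fallback(["nope", "still nope", "nope3"], '*shrugs* "whatever"')
print(result)
```

Returning a degraded result instead of raising trades classification detail for availability, which is the rationale the test docstring gives for dropping the old `RuntimeError` contract.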
||||||
|
|||||||
+2
-2
@@ -324,11 +324,11 @@ def test_get_scene_returns_none_for_missing(tmp_path):
|
|||||||
assert active_scene(conn, "chat_missing") is None
|
assert active_scene(conn, "chat_missing") is None
|
||||||
|
|
||||||
|
|
||||||
def test_schema_version_after_migration_is_13(tmp_path):
|
def test_schema_version_after_migration_is_14(tmp_path):
|
||||||
db = tmp_path / "t.db"
|
db = tmp_path / "t.db"
|
||||||
apply_migrations(db)
|
apply_migrations(db)
|
||||||
with open_db(db) as conn:
|
with open_db(db) as conn:
|
||||||
row = conn.execute(
|
row = conn.execute(
|
||||||
"SELECT value FROM meta WHERE key = 'schema_version'"
|
"SELECT value FROM meta WHERE key = 'schema_version'"
|
||||||
).fetchone()
|
).fetchone()
|
||||||
assert int(row[0]) == 13
|
assert int(row[0]) == 14
|
||||||
|
|||||||