# Roleplay Engine: Phase 1 Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

**Goal:** Build the v1 (Phase 1) roleplay engine end-to-end: a local-first FastAPI + HTMX app with single-bot chats, persistent bot-owned memory, per-chat clocks, an event-sourced SQLite backend, multi-tab SSE streaming, drawer state surface, rewind / regenerate / reset, and Featherless inference (narrative + classifier models).

**Architecture:** Python 3.11+ FastAPI server, SQLite (single file, WAL mode) projected from an append-only event log. Featherless OpenAI-compatible client behind an `LLMClient` interface. Per-chat in-process pub/sub queue broadcasts state changes over SSE to all subscribed browser tabs. State changes always go through events; the projector applies them. TDD: every task starts with a failing test.

**Tech Stack:**

- Python 3.11+, FastAPI, Uvicorn, HTMX (CDN), Jinja2 templates, vanilla CSS.
- SQLite (stdlib `sqlite3`), `aiosqlite` for async paths where useful.
- `pydantic` for state schemas, `pydantic-settings` for config.
- `instructor` (or Featherless-native JSON-mode) for classifier-constrained output via `openai` SDK pointed at `https://api.featherless.ai/v1`.
- `tiktoken` for token accounting.
- `pytest`, `pytest-asyncio`, `httpx` (for FastAPI TestClient), `freezegun` for time tests.

**Source-of-truth references:**

- Requirements: [2026-04-26-v1-requirements-design.md](2026-04-26-v1-requirements-design.md)
- Architecture: [../../rp-engine-design.md](../../rp-engine-design.md)
- Conventions: [../../CLAUDE.md](../../CLAUDE.md)

When a task says "see §X", that's the requirements doc unless stated otherwise.

---

## Pre-flight

**Worktree:** This is a greenfield repo on `main`. Branch off into `phase-1` before starting:

```bash
git checkout -b phase-1
```

**Python env:** Use a project-local venv (`.venv/`). Add `.venv/` and `__pycache__/` to `.gitignore` in T0.

**Featherless API key:** Stored in `data/config.toml` (gitignored). The plan creates an example file in T1; you copy it and paste in your real key locally.

**TDD discipline:** Every task starts with a failing test. Don't skip step 2 ("run to verify it fails"). If the test passes before implementation, the test is wrong; fix the test first.

**Commit cadence:** One commit per task. Commit messages use `feat:`, `chore:`, `test:`, `docs:` prefixes.

**Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion`: run the test command and read its actual output. Do not claim a task complete on hope.

---

## Phase 1A: Foundation

### Task 0: Project skeleton

**Files:**

- Create: `pyproject.toml`
- Create: `.python-version`
- Create: `chat/__init__.py`
- Create: `chat/app.py`
- Create: `tests/__init__.py`
- Create: `tests/test_health.py`
- Modify: `.gitignore` (add `.venv/`, `__pycache__/`, `*.pyc`, `.pytest_cache/`, `data/config.toml`)

**Step 1: Write the failing test**

```python
# tests/test_health.py
from fastapi.testclient import TestClient

from chat.app import app


def test_health_endpoint_returns_ok():
    client = TestClient(app)
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}
```

**Step 2: Run test to verify it fails**

```bash
python -m venv .venv && source .venv/bin/activate
pip install fastapi "uvicorn[standard]" httpx pytest pytest-asyncio
pytest tests/test_health.py -v
```

Expected: ImportError on `chat.app` (module doesn't exist).

**Step 3: Write minimal implementation**

```python
# chat/app.py
from fastapi import FastAPI

app = FastAPI(title="chat")


@app.get("/health")
def health():
    return {"status": "ok"}
```

`pyproject.toml` minimum:

```toml
[project]
name = "chat"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.110",
    "uvicorn[standard]>=0.30",
    "httpx>=0.27",
    "pydantic>=2.6",
    "pydantic-settings>=2.2",
    "openai>=1.30",
    "instructor>=1.3",
    "tiktoken>=0.7",
    "jinja2>=3.1",
    "aiosqlite>=0.20",
]

[project.optional-dependencies]
dev = ["pytest>=8", "pytest-asyncio>=0.23", "freezegun>=1.4"]

[tool.pytest.ini_options]
pythonpath = ["."]
asyncio_mode = "auto"
```

**Step 4: Run test to verify it passes**

```bash
pip install -e ".[dev]"
pytest tests/test_health.py -v
```

Expected: 1 passed.

**Step 5: Commit**

```bash
git add pyproject.toml .python-version chat/ tests/ .gitignore
git commit -m "feat: project skeleton with health endpoint"
```

---

### Task 1: Config loading

Loads `data/config.toml` (path overridable via `CHAT_CONFIG_PATH`), honors the `CHAT_DB_PATH` env var override, and exposes a `Settings` pydantic model. See requirements §3 / §12.

**Files:**

- Create: `chat/config.py`
- Create: `data/config.example.toml`
- Create: `tests/test_config.py`

**Step 1: Write the failing test**

```python
# tests/test_config.py
import os
from pathlib import Path

import pytest

from chat.config import load_settings


def test_load_settings_reads_toml(tmp_path, monkeypatch):
    cfg = tmp_path / "config.toml"
    cfg.write_text("""
featherless_api_key = "sk-test"
narrative_model = "dphn/Dolphin-Mistral-24B-Venice-Edition"
classifier_model = "NousResearch/Hermes-3-Llama-3.1-8B"
ooc_marker = "(("
retrieval_k = 4
""")
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    s = load_settings()
    assert s.featherless_api_key == "sk-test"
    assert s.narrative_model.startswith("dphn/")
    assert s.retrieval_k == 4


def test_chat_db_path_env_overrides_default(tmp_path, monkeypatch):
    monkeypatch.setenv("CHAT_DB_PATH", str(tmp_path / "alt.db"))
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(tmp_path / "config.toml"))
    (tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
    s = load_settings()
    assert s.db_path == tmp_path / "alt.db"
```

**Step 2: Run test to verify it fails**

```bash
pytest tests/test_config.py -v
```

Expected: ImportError or AttributeError.

**Step 3: Write minimal implementation**

```python
# chat/config.py
from __future__ import annotations

import os
import tomllib
from pathlib import Path

from pydantic import BaseModel, Field

REPO_ROOT = Path(__file__).resolve().parent.parent
DEFAULT_CONFIG = REPO_ROOT / "data" / "config.toml"
DEFAULT_DB = REPO_ROOT / "data" / "chat.db"


class Settings(BaseModel):
    featherless_api_key: str
    featherless_base_url: str = "https://api.featherless.ai/v1"
    narrative_model: str = "dphn/Dolphin-Mistral-24B-Venice-Edition"
    classifier_model: str = "NousResearch/Hermes-3-Llama-3.1-8B"
    classifier_fallbacks: list[str] = Field(
        default_factory=lambda: [
            "cognitivecomputations/dolphin-2.9.4-llama3-8b",
            "mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated",
        ]
    )
    ooc_marker: str = "(("
    retrieval_k: int = 4
    narrative_budget_hard: int = 8000
    narrative_budget_soft: int = 6000
    classifier_budget_hard: int = 4000
    classifier_timeout_s: float = 10.0
    db_path: Path = DEFAULT_DB
    data_dir: Path = REPO_ROOT / "data"
    bind_host: str = "127.0.0.1"
    bind_port: int = 8000


def load_settings() -> Settings:
    config_path = Path(os.environ.get("CHAT_CONFIG_PATH", DEFAULT_CONFIG))
    raw: dict = {}
    if config_path.exists():
        raw = tomllib.loads(config_path.read_text())
    if "CHAT_DB_PATH" in os.environ:
        raw["db_path"] = Path(os.environ["CHAT_DB_PATH"])
    return Settings(**raw)
```

`data/config.example.toml`:

```toml
# Copy this file to data/config.toml and fill in your API key.
featherless_api_key = "REPLACE_ME"
narrative_model = "dphn/Dolphin-Mistral-24B-Venice-Edition"
classifier_model = "NousResearch/Hermes-3-Llama-3.1-8B"
ooc_marker = "(("
retrieval_k = 4
```

**Step 4: Run test to verify it passes**

```bash
pytest tests/test_config.py -v
```

Expected: 2 passed.

**Step 5: Commit**

```bash
git add chat/config.py data/config.example.toml tests/test_config.py
git commit -m "feat: config loader with toml + env override"
```

---

### Task 2: SQLite migrations framework

Establishes a forward-only migration runner reading SQL files from `chat/db/migrations/`, tracked in a `meta` table (key/value).

**Files:**

- Create: `chat/db/__init__.py`
- Create: `chat/db/connection.py`
- Create: `chat/db/migrate.py`
- Create: `chat/db/migrations/0001_init_meta.sql`
- Create: `tests/test_migrate.py`

**Step 1: Write the failing test**

```python
# tests/test_migrate.py
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations


def test_apply_migrations_creates_meta_table(tmp_path):
    db = tmp_path / "test.db"
    apply_migrations(db)
    with open_db(db) as conn:
        row = conn.execute(
            "SELECT value FROM meta WHERE key = 'schema_version'"
        ).fetchone()
    assert row is not None
    assert int(row[0]) >= 1


def test_apply_migrations_idempotent(tmp_path):
    db = tmp_path / "test.db"
    apply_migrations(db)
    apply_migrations(db)  # second call must be a no-op
    with open_db(db) as conn:
        count = conn.execute("SELECT COUNT(*) FROM meta").fetchone()[0]
    assert count == 1
```

**Step 2: Run test to verify it fails**

```bash
pytest tests/test_migrate.py -v
```

Expected: ImportError.

**Step 3: Write minimal implementation**

```python
# chat/db/connection.py
from __future__ import annotations

import sqlite3
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def open_db(path: Path):
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    try:
        yield conn
        conn.commit()
    finally:
        conn.close()
```

```python
# chat/db/migrate.py
from __future__ import annotations

from pathlib import Path

from chat.db.connection import open_db

MIGRATIONS_DIR = Path(__file__).parent / "migrations"


def apply_migrations(db_path: Path) -> None:
    with open_db(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)"
        )
        cur = conn.execute("SELECT value FROM meta WHERE key = 'schema_version'")
        row = cur.fetchone()
        current = int(row[0]) if row else 0
        for path in sorted(MIGRATIONS_DIR.glob("*.sql")):
            version = int(path.stem.split("_", 1)[0])
            if version <= current:
                continue
            sql = path.read_text()
            conn.executescript(sql)
            conn.execute(
                "INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', ?)",
                (str(version),),
            )
```

```sql
-- chat/db/migrations/0001_init_meta.sql
-- meta table is created by the migrate runner; this migration is a marker.
SELECT 1;
```

**Step 4: Run test to verify it passes**

```bash
pytest tests/test_migrate.py -v
```

Expected: 2 passed.

**Step 5: Commit**

```bash
git add chat/db/ tests/test_migrate.py
git commit -m "feat: sqlite migration runner with meta version table"
```

---

### Task 3: Featherless client with mock

Defines the `LLMClient` protocol with `generate(messages, *, model, **params)` and `stream(messages, *, model, **params)`; structured output is layered on top by the classifier wrapper in Task 4. Implementations: `FeatherlessClient` (real), `MockLLMClient` (test).

**Files:**

- Create: `chat/llm/__init__.py`
- Create: `chat/llm/client.py`
- Create: `chat/llm/featherless.py`
- Create: `chat/llm/mock.py`
- Create: `tests/test_llm_mock.py`

**Step 1: Write the failing test**

```python
# tests/test_llm_mock.py
import pytest

from chat.llm.mock import MockLLMClient
from chat.llm.client import Message


@pytest.mark.asyncio
async def test_mock_returns_canned_response():
    client = MockLLMClient(canned=["Hello, world."])
    msgs = [Message(role="user", content="hi")]
    out = await client.generate(msgs, model="any")
    assert out == "Hello, world."


@pytest.mark.asyncio
async def test_mock_streams_tokens():
    client = MockLLMClient(canned=["abcd"])
    msgs = [Message(role="user", content="hi")]
    chunks = []
    async for chunk in client.stream(msgs, model="any"):
        chunks.append(chunk)
    assert "".join(chunks) == "abcd"
```

**Step 2: Run test to verify it fails**

```bash
pytest tests/test_llm_mock.py -v
```

Expected: ImportError.

**Step 3: Write minimal implementation**

```python
# chat/llm/client.py
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, AsyncIterator, Sequence


@dataclass
class Message:
    role: str  # "system" | "user" | "assistant"
    content: str


class LLMClient(Protocol):
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str: ...

    # Plain `def`: implementations are async generators, which return an
    # AsyncIterator when called (no `await` needed before `async for`).
    def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]: ...
```

```python
# chat/llm/mock.py
from __future__ import annotations

from typing import AsyncIterator, Sequence

from .client import Message


class MockLLMClient:
    def __init__(self, canned: list[str]):
        self._canned = list(canned)

    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
        return self._canned.pop(0)

    async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
        text = self._canned.pop(0)
        for ch in text:
            yield ch
```

```python
# chat/llm/featherless.py
from __future__ import annotations

from typing import AsyncIterator, Sequence

from openai import AsyncOpenAI

from .client import Message


class FeatherlessClient:
    def __init__(self, api_key: str, base_url: str = "https://api.featherless.ai/v1"):
        self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)

    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
        resp = await self._client.chat.completions.create(
            model=model,
            messages=[{"role": m.role, "content": m.content} for m in messages],
            **params,
        )
        return resp.choices[0].message.content or ""

    async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
        stream = await self._client.chat.completions.create(
            model=model,
            messages=[{"role": m.role, "content": m.content} for m in messages],
            stream=True,
            **params,
        )
        async for chunk in stream:
            # Some chunks carry no choices; skip them rather than index into an empty list.
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta.content or ""
            if delta:
                yield delta
```

**Step 4: Run test to verify it passes**

```bash
pytest tests/test_llm_mock.py -v
```

Expected: 2 passed.

**Step 5: Commit**

```bash
git add chat/llm/ tests/test_llm_mock.py
git commit -m "feat: LLMClient protocol with Featherless and mock implementations"
```

---

### Task 4: Classifier service wrapper

Wraps the classifier model with retry, timeout, and Pydantic-constrained output (per requirements §3.3). Falls back to schema-default on persistent failure. Logs failures to `classifier_failures` table.

**Files:**

- Create: `chat/db/migrations/0002_classifier_failures.sql`
- Create: `chat/llm/classify.py`
- Create: `tests/test_classify.py`

**Step 1: Write the failing test**

```python
# tests/test_classify.py
import pytest
from pydantic import BaseModel

from chat.llm.mock import MockLLMClient
from chat.llm.classify import classify


class Verdict(BaseModel):
    score: int
    reason: str


@pytest.mark.asyncio
async def test_classify_parses_valid_json():
    mock = MockLLMClient(canned=['{"score": 2, "reason": "notable"}'])
    result = await classify(mock, model="m", system="x", user="y", schema=Verdict)
    assert result.score == 2


@pytest.mark.asyncio
async def test_classify_falls_back_on_unparseable_after_retry():
    mock = MockLLMClient(canned=["nope", "still nope"])
    default = Verdict(score=1, reason="fallback")
    result = await classify(mock, model="m", system="x", user="y", schema=Verdict, default=default)
    assert result.reason == "fallback"
```

**Step 2: Run test to verify it fails**

```bash
pytest tests/test_classify.py -v
```

Expected: ImportError.

**Step 3: Write minimal implementation**

`chat/db/migrations/0002_classifier_failures.sql`:

```sql
CREATE TABLE classifier_failures (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    model TEXT NOT NULL,
    raw_text TEXT,
    attempt_count INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```

`chat/llm/classify.py`:

```python
from __future__ import annotations

import asyncio
import json
from typing import TypeVar

from pydantic import BaseModel, ValidationError

from .client import LLMClient, Message

T = TypeVar("T", bound=BaseModel)

REFUSAL_PATTERNS = ("i can't", "i cannot", "i'm sorry, but", "as an ai")


async def classify(
    client: LLMClient,
    *,
    model: str,
    system: str,
    user: str,
    schema: type[T],
    default: T | None = None,
    timeout_s: float = 10.0,
) -> T:
    msgs = [
        Message(role="system", content=system + "\n\nRespond with JSON only matching the schema."),
        Message(role="user", content=user),
    ]
    # Two attempts: the original prompt, then a stricter JSON-only retry.
    for _attempt in range(2):
        try:
            text = await asyncio.wait_for(
                client.generate(msgs, model=model, response_format={"type": "json_object"}),
                timeout=timeout_s,
            )
            # Refusal-shaped prose (not JSON) counts as a failed attempt.
            if any(p in text.lower()[:80] for p in REFUSAL_PATTERNS) and not text.strip().startswith("{"):
                raise ValueError("refusal-shaped response")
            return schema.model_validate_json(text)
        except (ValidationError, ValueError, json.JSONDecodeError, asyncio.TimeoutError):
            msgs[0] = Message(role="system", content=system + "\n\nRespond with valid JSON ONLY. No prose.")
            continue
    # Both attempts failed: fall back to the caller's default, if one was given.
    if default is None:
        raise RuntimeError(f"classify failed for schema {schema.__name__} with no default")
    return default
```

**Step 4: Run test to verify it passes**

```bash
pytest tests/test_classify.py -v
```

Expected: 2 passed.

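
The task description says failures are logged to `classifier_failures`, but `classify` as written only returns the default. A minimal sketch of the logging step, assuming a hypothetical helper `log_classifier_failure` that the caller invokes once both attempts are exhausted; the helper name and the `kind` value are illustrative and not part of the plan:

```python
import sqlite3

# Same columns as migration 0002_classifier_failures.sql.
SCHEMA = """
CREATE TABLE classifier_failures (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    model TEXT NOT NULL,
    raw_text TEXT,
    attempt_count INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
"""


def log_classifier_failure(conn, *, kind, model, raw_text, attempt_count):
    """Hypothetical helper: record one exhausted classify() call."""
    cur = conn.execute(
        "INSERT INTO classifier_failures (kind, model, raw_text, attempt_count) "
        "VALUES (?, ?, ?, ?)",
        (kind, model, raw_text, attempt_count),
    )
    return cur.lastrowid


# Demo against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
failure_id = log_classifier_failure(
    conn, kind="unparseable", model="m", raw_text="still nope", attempt_count=2
)
row = conn.execute(
    "SELECT kind, model, raw_text, attempt_count FROM classifier_failures WHERE id = ?",
    (failure_id,),
).fetchone()
```

Keeping the write outside `classify` leaves the wrapper free of database concerns; passing a connection into `classify` would work as well.
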
**Step 5: Commit**

```bash
git add chat/llm/classify.py chat/db/migrations/0002_classifier_failures.sql tests/test_classify.py
git commit -m "feat: classifier wrapper with retry, timeout, schema-default fallback"
```

---

## Phase 1B: Event log & state machine

### Task 5: Event log + projector skeleton

Append-only event log with one row per event (`id`, `branch_id`, `ts`, `kind`, `payload_json`). Projector framework that dispatches per-kind handlers; initial registry is empty. State changes ALWAYS go through `append_event`.

**Files:**

- Create: `chat/db/migrations/0003_event_log.sql`
- Create: `chat/eventlog/__init__.py`
- Create: `chat/eventlog/log.py`
- Create: `chat/eventlog/projector.py`
- Create: `tests/test_eventlog.py`

**Step 1: Write the failing test**

```python
# tests/test_eventlog.py
from chat.db.migrate import apply_migrations
from chat.db.connection import open_db
from chat.eventlog.log import append_event, read_events


def test_append_and_read(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        eid = append_event(conn, kind="test_kind", payload={"a": 1})
        assert eid > 0
        rows = list(read_events(conn))
    assert len(rows) == 1
    assert rows[0].kind == "test_kind"
    assert rows[0].payload["a"] == 1
```

**Step 2: Run test to verify it fails**

Expected: missing migration / module.

**Step 3: Write minimal implementation**

`chat/db/migrations/0003_event_log.sql`:

```sql
CREATE TABLE event_log (
    id INTEGER PRIMARY KEY,
    branch_id INTEGER NOT NULL DEFAULT 1,
    ts TEXT NOT NULL DEFAULT (datetime('now')),
    kind TEXT NOT NULL,
    payload_json TEXT NOT NULL,
    superseded_by INTEGER REFERENCES event_log(id),
    hidden INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_event_log_branch_kind ON event_log(branch_id, kind);
```

`chat/eventlog/log.py`:

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Iterator
from sqlite3 import Connection


@dataclass
class Event:
    id: int
    branch_id: int
    ts: str
    kind: str
    payload: dict[str, Any]
    superseded_by: int | None
    hidden: bool


def append_event(conn: Connection, *, kind: str, payload: dict[str, Any], branch_id: int = 1) -> int:
    cur = conn.execute(
        "INSERT INTO event_log (branch_id, kind, payload_json) VALUES (?, ?, ?)",
        (branch_id, kind, json.dumps(payload)),
    )
    return cur.lastrowid


def read_events(conn: Connection, branch_id: int = 1, after_id: int = 0) -> Iterator[Event]:
    # Hidden and superseded events are skipped so replay sees only live history.
    cur = conn.execute(
        "SELECT id, branch_id, ts, kind, payload_json, superseded_by, hidden "
        "FROM event_log WHERE branch_id = ? AND id > ? AND hidden = 0 "
        "AND superseded_by IS NULL ORDER BY id",
        (branch_id, after_id),
    )
    for row in cur:
        yield Event(
            id=row[0],
            branch_id=row[1],
            ts=row[2],
            kind=row[3],
            payload=json.loads(row[4]),
            superseded_by=row[5],
            hidden=bool(row[6]),
        )
```

`chat/eventlog/projector.py`:

```python
from __future__ import annotations

from collections.abc import Callable
from sqlite3 import Connection

from .log import Event, read_events

Handler = Callable[[Connection, Event], None]

_REGISTRY: dict[str, Handler] = {}


def on(kind: str):
    def deco(fn: Handler) -> Handler:
        _REGISTRY[kind] = fn
        return fn
    return deco


def apply_event(conn: Connection, event: Event) -> None:
    h = _REGISTRY.get(event.kind)
    if h:
        h(conn, event)


def project(conn: Connection, branch_id: int = 1) -> None:
    for event in read_events(conn, branch_id=branch_id):
        apply_event(conn, event)
```

**Step 4: Run test to verify it passes**

```bash
pytest tests/test_eventlog.py -v
```

Expected: 1 passed.

**Step 5: Commit**

```bash
git add chat/eventlog/ chat/db/migrations/0003_event_log.sql tests/test_eventlog.py
git commit -m "feat: append-only event log with projector skeleton"
```

---

### Task 6: Bot + You entity schemas and events

Adds `bots` and `you_entity` projected tables, `bot_authored` and `you_authored` event kinds. Identity is immutable per session; re-authoring writes a new event.

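
One way to read "re-authoring writes a new event": the log keeps every `bot_authored` event, and the projected row reflects the latest one after replay. A self-contained sketch of that behaviour, assuming an upsert-style handler and using trimmed-down `event_log` and `bots` tables plus inline helpers rather than the real modules (all simplifications for illustration):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_log (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    payload_json TEXT NOT NULL
);
CREATE TABLE bots (id TEXT PRIMARY KEY, name TEXT NOT NULL, persona TEXT NOT NULL);
""")


def append_event(conn, kind, payload):
    """Append-only write: events are never updated in place."""
    cur = conn.execute(
        "INSERT INTO event_log (kind, payload_json) VALUES (?, ?)",
        (kind, json.dumps(payload)),
    )
    return cur.lastrowid


def project(conn):
    """Replay the log in id order; a later event overwrites the earlier projection."""
    rows = conn.execute(
        "SELECT kind, payload_json FROM event_log ORDER BY id"
    ).fetchall()
    for kind, payload_json in rows:
        if kind == "bot_authored":
            p = json.loads(payload_json)
            conn.execute(
                "INSERT OR REPLACE INTO bots (id, name, persona) VALUES (?, ?, ?)",
                (p["id"], p["name"], p["persona"]),
            )


# Author the bot, then re-author it with a different persona.
append_event(conn, "bot_authored", {"id": "bot_a", "name": "BotA", "persona": "shy"})
append_event(conn, "bot_authored", {"id": "bot_a", "name": "BotA", "persona": "bold"})
project(conn)

event_count = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
persona = conn.execute("SELECT persona FROM bots WHERE id = 'bot_a'").fetchone()[0]
```

Both events remain in the log, so an earlier identity can still be recovered by replaying up to a chosen event id.
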
**Files:**

- Create: `chat/db/migrations/0004_entities.sql`
- Create: `chat/state/__init__.py`
- Create: `chat/state/entities.py`
- Modify: `chat/eventlog/projector.py` (import handlers)
- Create: `tests/test_entities.py`

**Step 1: Write the failing test**

```python
# tests/test_entities.py
from chat.db.migrate import apply_migrations
from chat.db.connection import open_db
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.state.entities import get_bot, list_bots, get_you
import chat.state.entities  # registers handlers


def test_bot_authored_creates_bot_row(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        append_event(conn, kind="bot_authored", payload={
            "id": "bot_a",
            "name": "BotA",
            "persona": "...",
            "voice_samples": ["sample"],
            "traits": ["shy"],
            "backstory": "...",
            "initial_relationship_to_you": "coworker",
            "kickoff_prose": "you stay late",
        })
        project(conn)
        bot = get_bot(conn, "bot_a")
        assert bot is not None
        assert bot["name"] == "BotA"
        assert bot["traits"] == ["shy"]
        assert "bot_a" in [b["id"] for b in list_bots(conn)]


def test_you_authored_creates_you_singleton(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        append_event(conn, kind="you_authored", payload={
            "name": "Me",
            "pronouns": "they/them",
            "persona": "engineer",
        })
        project(conn)
        you = get_you(conn)
        assert you is not None
        assert you["name"] == "Me"
```

**Step 2: Run, verify fail.**

**Step 3: Implementation.**

`chat/db/migrations/0004_entities.sql`:

```sql
CREATE TABLE bots (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    persona TEXT NOT NULL,
    voice_samples_json TEXT NOT NULL DEFAULT '[]',
    traits_json TEXT NOT NULL DEFAULT '[]',
    backstory TEXT NOT NULL DEFAULT '',
    initial_relationship_to_you TEXT NOT NULL DEFAULT '',
    kickoff_prose TEXT NOT NULL DEFAULT '',
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE you_entity (
    id INTEGER PRIMARY KEY CHECK (id = 1),
    name TEXT NOT NULL,
    pronouns TEXT NOT NULL DEFAULT '',
    persona TEXT NOT NULL DEFAULT ''
);
```

`chat/state/entities.py`:

```python
from __future__ import annotations

import json
from sqlite3 import Connection

from chat.eventlog.projector import on
from chat.eventlog.log import Event


@on("bot_authored")
def _apply_bot_authored(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "INSERT OR REPLACE INTO bots "
        "(id, name, persona, voice_samples_json, traits_json, backstory, "
        " initial_relationship_to_you, kickoff_prose) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (p["id"], p["name"], p["persona"],
         json.dumps(p.get("voice_samples", [])),
         json.dumps(p.get("traits", [])),
         p.get("backstory", ""),
         p.get("initial_relationship_to_you", ""),
         p.get("kickoff_prose", "")),
    )


@on("you_authored")
def _apply_you_authored(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "INSERT OR REPLACE INTO you_entity (id, name, pronouns, persona) VALUES (1, ?, ?, ?)",
        (p["name"], p.get("pronouns", ""), p.get("persona", "")),
    )


def get_bot(conn: Connection, bot_id: str) -> dict | None:
    row = conn.execute("SELECT * FROM bots WHERE id = ?", (bot_id,)).fetchone()
    if not row:
        return None
    cols = [c[1] for c in conn.execute("PRAGMA table_info(bots)").fetchall()]
    d = dict(zip(cols, row))
    d["voice_samples"] = json.loads(d.pop("voice_samples_json"))
    d["traits"] = json.loads(d.pop("traits_json"))
    return d


def list_bots(conn: Connection) -> list[dict]:
    cur = conn.execute("SELECT id, name FROM bots ORDER BY name")
    return [{"id": r[0], "name": r[1]} for r in cur]


def get_you(conn: Connection) -> dict | None:
    row = conn.execute("SELECT name, pronouns, persona FROM you_entity WHERE id = 1").fetchone()
    if not row:
        return None
    return {"name": row[0], "pronouns": row[1], "persona": row[2]}
```

**Step 4: Run, verify pass.**

**Step 5: Commit.**

```bash
git add chat/db/migrations/0004_entities.sql chat/state/ tests/test_entities.py
git commit -m "feat: bot and you entity schemas with projector handlers"
```

---

### Task 7: Edges schema + per-turn deltas

Per requirements §3.4. Edges table holds per-pair directed state. `edge_update` event applies deltas (affinity, trust, knowledge_facts, last_interaction). Summary rewrites are a separate event kind written at scene close (T27).

**Files:**

- Create: `chat/db/migrations/0005_edges.sql`
- Create: `chat/state/edges.py`
- Create: `tests/test_edges.py`

**Test sketch:**

```python
def test_edge_update_applies_affinity_delta(tmp_path):
    # bot_authored, you_authored, then edge_update with affinity_delta=+5
    # assert edges row exists with affinity=initial+5
    ...
```

**Implementation sketch:**

```sql
CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    chat_id TEXT,  -- null for default initial seed
    source_id TEXT NOT NULL,
    target_id TEXT NOT NULL,
    affinity INTEGER NOT NULL DEFAULT 50,
    trust INTEGER NOT NULL DEFAULT 50,
    summary TEXT NOT NULL DEFAULT '',
    knowledge_json TEXT NOT NULL DEFAULT '[]',
    last_interaction_chat_id TEXT,
    last_interaction_at TEXT,
    UNIQUE (source_id, target_id)
);
```

```python
@on("edge_update")
def _apply_edge_update(conn, e):
    p = e.payload
    # upsert + apply deltas; clamp affinity/trust to 0..100
    # append knowledge_facts if any
    # bump last_interaction fields
    ...
```

**Commit:** `feat: directed edges with per-turn delta projector`

---

### Task 8: Memory schema + witness flag

Memories are bot-owned. Witnessed-by mask stored per memory.

**Files:**

- Create: `chat/db/migrations/0006_memories.sql`
- Create: `chat/state/memory.py`
- Create: `tests/test_memory.py`

**Schema:**

```sql
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    owner_id TEXT NOT NULL,  -- bot id whose POV this is
    chat_id TEXT NOT NULL,
    scene_id INTEGER,
    pov_summary TEXT NOT NULL,
    witness_you INTEGER NOT NULL,
    witness_host INTEGER NOT NULL,
    witness_guest INTEGER NOT NULL,
    chat_clock_at TEXT,
    source TEXT,  -- e.g. "direct" | "told_by:bot_id"
    reliability REAL NOT NULL DEFAULT 1.0,
    significance INTEGER NOT NULL DEFAULT 1,
    pinned INTEGER NOT NULL DEFAULT 0,
    auto_pinned INTEGER NOT NULL DEFAULT 0,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX idx_memories_owner ON memories(owner_id);

-- FTS5 index on pov_summary, scoped by owner_id
CREATE VIRTUAL TABLE memories_fts USING fts5(
    pov_summary,
    content='memories',
    content_rowid='id'
);

CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, pov_summary) VALUES (new.id, new.pov_summary);
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, pov_summary) VALUES('delete', old.id, old.pov_summary);
    INSERT INTO memories_fts(rowid, pov_summary) VALUES (new.id, new.pov_summary);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, pov_summary) VALUES('delete', old.id, old.pov_summary);
END;
```

**`memory_written` event handler** + helper functions:

```python
@on("memory_written")
def _apply_memory_written(conn, e): ...


def get_pinned(conn, owner_id) -> list[dict]: ...


def search_memories(conn, owner_id: str, witness_role: str, query: str, k: int = 4) -> list[dict]:
    """FTS5 search filtered by witness bit. witness_role in {'you','host','guest'}."""
```

**Tests:** write a memory event with witness `[1,1,0]`, assert search returns it for owner; assert search filtered by `witness_guest=1` excludes it.

**Commit:** `feat: memory schema with witness flags and FTS5 index`

---

### Task 9: Activity, container, scene, chat schemas

Adds the per-chat structural tables: `chats`, `chat_state`, `containers`, `scenes`, `activity`. Plus event handlers for `chat_created`, `container_created`, `activity_change`, `scene_opened`, `scene_closed`.

**Files:**

- Create: `chat/db/migrations/0007_world.sql`
- Create: `chat/state/world.py`
- Create: `tests/test_world.py`

**Schema (key columns):**

```sql
CREATE TABLE chats (
    id TEXT PRIMARY KEY,  -- e.g. "chat_botA"
    host_bot_id TEXT NOT NULL,
    guest_bot_id TEXT,  -- null when no guest
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE chat_state (
    chat_id TEXT PRIMARY KEY,
    time TEXT NOT NULL,  -- ISO 8601 UTC
    weather TEXT NOT NULL DEFAULT '',
    active_scene_id INTEGER,
    narrative_anchor TEXT  -- the in-fiction "Day 1 = ..." reference
);

CREATE TABLE containers (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    properties_json TEXT NOT NULL DEFAULT '{}',
    parent_id INTEGER REFERENCES containers(id)
);

CREATE TABLE scenes (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL,
    container_id INTEGER REFERENCES containers(id),
    started_at TEXT NOT NULL,
    ended_at TEXT,
    significance INTEGER NOT NULL DEFAULT 0,
    participants_json TEXT NOT NULL DEFAULT '[]'
);

CREATE TABLE activity (
    entity_id TEXT PRIMARY KEY,  -- "you" or bot_id
    container_id INTEGER REFERENCES containers(id),
    slot TEXT,
    posture TEXT NOT NULL DEFAULT '',
    action_json TEXT NOT NULL DEFAULT '{}',
    attention TEXT NOT NULL DEFAULT '',
    holding_json TEXT NOT NULL DEFAULT '[]',
    status_json TEXT NOT NULL DEFAULT '{}',
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```

**Handlers:** `chat_created`, `container_created`, `activity_change`, `scene_opened`, `scene_closed`.

**Tests:** create chat → chat_state initialized; create container; activity_change updates `activity` row.

**Commit:** `feat: chats, chat_state, containers, scenes, activity tables`

---

## Phase 1C: Authoring

### Task 10: Kickoff prose parser

Classifier call that converts authored kickoff prose into structured `{container, activity_per_entity, edge_seed}` for confirmation.

**Files:**

- Create: `chat/services/kickoff.py`
- Create: `tests/test_kickoff.py`

**Schema returned by classifier:**

```python
from pydantic import BaseModel


# Defined first: KickoffParse refers to it in its field annotations.
class ActivityShape(BaseModel):
    posture: str
    action_verb: str
    action_interruptible: bool
    action_required_attention: str  # low|medium|high
    action_expected_duration: str
    attention: str = ""
    holding: list[str] = []


class KickoffParse(BaseModel):
    container_name: str
    container_type: str
    container_properties: dict  # moving, public, audible_range
    you_activity: ActivityShape
    bot_activity: ActivityShape
    initial_time_iso: str
    edge_seed_summary: str
    edge_seed_knowledge_facts: list[str]
```

**Implementation:** call `classify(...)` with a prompt that includes the bot's persona + relationship-to-you + kickoff prose. Return the parsed model.

**Test:** mock client returns canned JSON; assert structured fields populate.

**Commit:** `feat: kickoff prose parser via classifier`

---

### Task 11: Bot authoring page

Form-based authoring UI; on submit, validates and writes `bot_authored` event. After save, redirects to kickoff parse-and-confirm (T13).

**Files:**

- Create: `chat/templates/base.html`
- Create: `chat/templates/bot_form.html`
- Create: `chat/web/__init__.py`
- Create: `chat/web/bots.py`
- Modify: `chat/app.py` (mount router, jinja env, static files)
- Create: `chat/static/app.css`
- Create: `tests/test_bot_authoring.py`

**Test:** POST to `/bots/new` with form fields; assert `bot_authored` event appended and bot row exists; response redirects to `/bots//kickoff`.

**Implementation note:** form fields map to identity per §5.1 (name, persona, voice_samples textarea split on `---`, traits comma-separated, backstory, initial relationship to you, kickoff prose).

**Commit:** `feat: bot authoring form with bot_authored event`

---

### Task 12: You-entity authoring (Settings page)

Single-row form for the "you" entity. Lives at `/settings`. POST writes `you_authored` event.

**Files:**
- Create: `chat/templates/settings.html`
- Create: `chat/web/settings.py`
- Modify: `chat/app.py`
- Create: `tests/test_settings.py`

**Commit:** `feat: settings page with you-entity authoring`

---

### Task 13: Kickoff parse-and-confirm flow

After bot authoring, the user lands on `/bots//kickoff`, which shows the parsed kickoff in editable form. On confirm: append `chat_created`, `container_created`, `activity_change` (per entity), `scene_opened`, and an initial `edge_update` (the seed).

**Files:**
- Create: `chat/templates/kickoff_confirm.html`
- Create: `chat/web/kickoff.py`
- Create: `tests/test_kickoff_confirm.py`

**Test:** Submit a confirmed kickoff payload; assert chat exists, chat_state has time, container exists, activity rows present for you + bot, scene is open, edge has seed summary.

**Commit:** `feat: kickoff parse-and-confirm flow with chat creation`

---

## Phase 1D: Chat — single bot

### Task 14: Top-level nav + Chat list

Persistent left rail with three sections (§16.1). Chat list pulls from `chats` joined with `chat_state` and the latest assistant_turn for snippet.

**Files:**
- Create: `chat/templates/layout.html` (extends base, adds rail)
- Create: `chat/templates/chat_list.html`
- Create: `chat/templates/bot_list.html`
- Create: `chat/web/nav.py`
- Modify: `chat/app.py`
- Create: `tests/test_chat_list.py`

**Commit:** `feat: top-level nav and chat list view`

---

### Task 15: Chat shell page

`/chats/`: renders the empty timeline + input box + drawer toggle. No turn handling yet.

**Files:**
- Create: `chat/templates/chat.html`
- Create: `chat/web/chat.py`
- Create: `tests/test_chat_shell.py`

**Commit:** `feat: chat shell page rendering`

---

### Task 16: Per-chat SSE channel + multi-tab sync

In-process pub/sub: one `asyncio.Queue` per chat_id, broadcasting events to all subscribers. Endpoint `/chats//events` SSE-streams a JSON event stream.
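
A minimal sketch of the pub/sub fan-out and SSE framing follows. Because a single `asyncio.Queue` hands each item to exactly one consumer, this sketch assumes one queue per subscribed tab, grouped by chat_id; `ChatPubSub` and `format_sse` are hypothetical names, not part of the plan.

```python
import asyncio
import json
from collections import defaultdict


def format_sse(event_name: str, data: dict) -> str:
    """Frame one SSE message: an event line, a data line, then a blank line."""
    return f"event: {event_name}\ndata: {json.dumps(data)}\n\n"


class ChatPubSub:
    """In-process fan-out: one asyncio.Queue per subscriber, grouped by chat_id."""

    def __init__(self) -> None:
        self._subscribers: dict[str, set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, chat_id: str) -> asyncio.Queue:
        """Register a new subscriber (one browser tab) and return its queue."""
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[chat_id].add(queue)
        return queue

    def unsubscribe(self, chat_id: str, queue: asyncio.Queue) -> None:
        """Drop a subscriber, e.g. when its SSE connection closes."""
        self._subscribers[chat_id].discard(queue)

    def publish(self, chat_id: str, event_name: str, data: dict) -> None:
        """Copy the framed message into every subscriber queue for this chat."""
        frame = format_sse(event_name, data)
        for queue in self._subscribers[chat_id]:
            queue.put_nowait(frame)
```

An SSE endpoint would then call `subscribe`, yield frames from its queue in a loop, and call `unsubscribe` in a `finally` block when the client disconnects.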
On connect, the server pushes a `snapshot` event with current state; subsequent state changes push `event` items.

**Files:**
- Create: `chat/web/sse.py`
- Create: `chat/web/pubsub.py`
- Modify: `chat/web/chat.py`
- Create: `tests/test_sse.py`

**Test:** TestClient streams 1 event; assert framing is correct (`event: snapshot\ndata: {...}\n\n`).

**Commit:** `feat: per-chat SSE channel and pub/sub`

---

### Task 17: Turn input parser

Classifier call that splits a user turn into `[dialogue|action|ooc]` segments. OOC segments are stripped from the prompt and flagged for transcript display only.

**Files:**
- Create: `chat/services/turn_parse.py`
- Create: `tests/test_turn_parse.py`

**Schema:**

```python
from typing import Literal

from pydantic import BaseModel


class TurnSegment(BaseModel):
    # Literal rejects any kind outside the three documented values.
    kind: Literal["dialogue", "action", "ooc"]
    text: str


class ParsedTurn(BaseModel):
    segments: list[TurnSegment]
```

**Test:** input `*walks over* "Hey." ((player note))` → 3 segments tagged correctly. Mock classifier returns canned JSON.

**Commit:** `feat: turn input parser via classifier`

---

### Task 18: Prompt assembly with trim tiers

Implements the must/should/nice trimming tiers (§3.2) for the narrative prompt. Counts tokens via `tiktoken`. Inputs: speaker_id, current chat state, witnessed memories (top-K), recent dialogue, edges, activity for all present, active scene.

**Files:**
- Create: `chat/services/prompt.py`
- Create: `tests/test_prompt.py`

**Test:** stuff a huge dialogue history, assert older turns get summarized first (NICE), then memories drop to K=2, etc. Must-include is never trimmed.

**Commit:** `feat: prompt assembly with must/should/nice trim tiers`

---

### Task 19: Narrative call + streaming over SSE

POST `/chats//turns` accepts a user prose turn. Server:

1. Appends `user_turn` event (raw + parsed segments).
2. Appends a placeholder `assistant_turn_started` event.
3. Streams narrative tokens over the chat's SSE channel as they arrive.
4. On stream complete: appends `assistant_turn` event with full text + `truncated=False`.
5. On stream interrupt: appends `assistant_turn` with `truncated=True`.

**Files:**
- Create: `chat/web/turns.py`
- Modify: `chat/web/sse.py` (add token broadcast)
- Modify: `chat/eventlog/log.py` (add helpers if needed)
- Create: `tests/test_turn_flow.py`

**Test (uses MockLLMClient):** POST a turn → assert SSE channel emits token chunks then a final `assistant_turn` event; DB has both events.

**Commit:** `feat: narrative streaming via SSE with assistant_turn event`

---

## Phase 1E: State updates per turn

### Task 20: Post-turn state-update pass

After narrative completes, classifier extracts `affinity_delta`, `trust_delta`, `knowledge_facts` per (source, target) directed pair, for **every present entity** (silent witnesses too). Emits `edge_update` events.

**Files:**
- Create: `chat/services/state_update.py`
- Create: `tests/test_state_update.py`

**Test:** mock returns deltas; assert `edge_update` events appended; projection updates affinity.

**Commit:** `feat: post-turn state-update pass per present entity`

---

### Task 21: Memory write per turn

After narrative completes, write one memory row per witness, with that witness as the row's owner and the appropriate witness flags set. Phase 1 simplification: the memory's `pov_summary` is the assistant's narrative text snippet (significance default 1; classifier rewrites at scene close into per-POV summary form). Emits `memory_written` events.

**Files:**
- Create: `chat/services/memory_write.py`
- Create: `tests/test_memory_write.py`

**Commit:** `feat: per-turn memory writes with witness flags`

---

### Task 22: Significance pass (queued, async)

Background task: after narrative completes, runs the significance classifier (0–3 per §11.1) on the turn. Updates the just-written memory's `significance`. Auto-pins on score 3 (with the soft-cap eviction rule from §8.5).
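
The queued significance pass could look roughly like the sketch below: an asyncio worker drains jobs, clamps the classifier score to the 0–3 scale, and pins on a score of 3. The `significance_worker` name, the `(memory_id, text)` job shape, and the in-memory `memories` dict are assumptions for illustration; in the real engine the update would be emitted as an event and projected, and the §8.5 eviction rule is deliberately not modeled here.

```python
import asyncio
from typing import Awaitable, Callable

AUTO_PIN_SCORE = 3  # top of the 0-3 significance scale


async def significance_worker(
    jobs: asyncio.Queue,
    classify: Callable[[str], Awaitable[int]],
    memories: dict[int, dict],
) -> None:
    """Drain (memory_id, text) jobs until a None sentinel arrives."""
    while True:
        job = await jobs.get()
        if job is None:
            jobs.task_done()
            break
        memory_id, text = job
        # Clamp defensively in case the classifier returns an out-of-range score.
        score = max(0, min(AUTO_PIN_SCORE, await classify(text)))
        memories[memory_id]["significance"] = score
        if score == AUTO_PIN_SCORE:
            # Soft-cap eviction (§8.5) would run here; it is not modeled.
            memories[memory_id]["pinned"] = True
        jobs.task_done()
```

Using a sentinel to stop the worker keeps shutdown deterministic, which matters for the lifespan hook that starts and stops it.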
**Files:**
- Create: `chat/services/significance.py`
- Create: `chat/services/background.py` (asyncio queue worker)
- Modify: `chat/app.py` (lifespan starts/stops worker)
- Create: `tests/test_significance.py`

**Test:** queue a significance job for a freshly-written memory; assert significance updates and auto-pin behavior on score 3.

**Commit:** `feat: async significance pass with auto-pin on score 3`

---

### Task 23: Memory retrieval (FTS5, witness-filtered, top-K)

Implements `search_memories(owner_id, witness_role, query, k)` via FTS5 with `WHERE` filter on the witness column. Recency + significance boost in ranking.

**Files:**
- Modify: `chat/state/memory.py`
- Create: `tests/test_memory_search.py`

**Test:** seed memories with mixed witness flags; assert filter excludes non-witnessed; assert recency boost orders newer above older.

**Commit:** `feat: FTS5 memory retrieval with witness filter and ranking boosts`

---

## Phase 1F: Drawer & state ops

### Task 24: Drawer read-only skeleton

Right-side drawer rendered as a partial; HTMX-loaded into the chat page. Shows current scene, container, activity per entity, edges (host ↔ you), recent witnessed memories with significance markers, pinned memories with `n/8` counter.

**Files:**
- Create: `chat/templates/drawer.html`
- Create: `chat/web/drawer.py`
- Modify: `chat/templates/chat.html` (drawer toggle + container)
- Modify: `chat/static/app.css`
- Create: `tests/test_drawer_render.py`

**Commit:** `feat: read-only drawer with scene, activity, edges, memories`

---

### Task 25: Drawer edits (activity / edges / memory)

Inline edit affordances on activity, edge fields, memory pov_summary/significance/pin. Each edit emits a `manual_edit` event with prior value snapshotted (per §6.4 final paragraph). Pin toggle emits `memory_pin_changed` event.
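
To make the "prior value snapshotted" requirement concrete, here is a small sketch of building and applying a `manual_edit` event. The payload keys (`target`, `field`, `prior`, `new`) and the nested-dict state are assumptions for illustration; the actual payload shape is defined by §6.4 of the requirements doc.

```python
import copy
from typing import Any


def build_manual_edit(
    state: dict[str, dict[str, Any]], target: str, field: str, new_value: Any
) -> dict[str, Any]:
    """Return a manual_edit payload that captures the prior value before any change."""
    return {
        "kind": "manual_edit",
        "target": target,
        "field": field,
        # deepcopy so later mutation of state cannot alter the recorded prior value
        "prior": copy.deepcopy(state[target][field]),
        "new": new_value,
    }


def apply_manual_edit(state: dict[str, dict[str, Any]], event: dict[str, Any]) -> None:
    """Projector step: write the new value onto the projected state."""
    state[event["target"]][event["field"]] = event["new"]
```

Because the event carries both values, replaying the log reproduces the edit, and a rewind or undo can restore the prior value without consulting any other source.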
**Files:**
- Modify: `chat/web/drawer.py`
- Modify: `chat/templates/drawer.html`
- Create: `chat/state/manual_edit.py` (handler for `manual_edit` event)
- Create: `tests/test_drawer_edits.py`

**Test:** edit affinity slider via POST; assert `manual_edit` event written with prior + new value; projected affinity updated.

**Commit:** `feat: drawer edits with manual_edit event capture`

---

### Task 26: Scene close (hard signals + manual button)

Hard-signal detection runs as a small classifier call after each turn (queued/cheap): does the prose indicate container change, explicit "we're done here" pattern, or other hard signal? Manual close button in drawer always available. On close, emit `scene_closed` event; reopen via `scene_opened` for the new scene.

**Files:**
- Create: `chat/services/scene_close.py`
- Modify: `chat/web/turns.py`
- Modify: `chat/web/drawer.py` (manual close button)
- Create: `tests/test_scene_close.py`

**Test:** simulate prose "we drove to the park"; assert classifier returns `container_change=true`; assert `scene_closed` then `scene_opened` events written.

**Commit:** `feat: scene close on hard signals with manual override`

---

### Task 27: Per-POV summary on close

On `scene_closed`, classifier writes a per-POV summary for each present witness (Phase 1: just the host bot since we're single-bot). Updates the existing memory rows for that scene, replacing terse pov_summary with a proper scene-level summary. Updates edge `summary` from the per-POV summary + prior summary. Promotion rules apply (§11.3).

**Files:**
- Create: `chat/services/scene_summarize.py`
- Modify: `chat/eventlog/projector.py` if needed for scene_closed handler
- Create: `tests/test_per_pov_summary.py`

**Commit:** `feat: per-POV summary and edge summary update on scene close`

---

## Phase 1G: Rollback

### Task 28: Rewind UI + impact preview + pre-rewind snapshot

"Rewind to here" button on each turn in the chat.
Computes impact preview (count messages, scene transitions, edge updates, memories, fired events affected). Pre-rewind snapshot written to `data/snapshots/rewind/`. On confirm: truncate event_log past selected event, drop projected tables, replay events up to selected. 30-second undo toast.

**Files:**
- Create: `chat/services/rewind.py`
- Create: `chat/services/snapshot.py`
- Create: `chat/templates/rewind_modal.html`
- Modify: `chat/web/turns.py`
- Create: `tests/test_rewind.py`

**Test:** play 5 turns; rewind to turn 2; assert events 3-5 removed, projected state matches state-at-turn-2, snapshot file exists.

**Commit:** `feat: rewind with impact preview, pre-rewind snapshot, undo toast`

---

### Task 29: Regenerate (inline edit-then-regenerate)

Button on the latest assistant_turn. Click puts your prior user_turn into inline edit mode; submit either appends `user_turn_edit` (if edited) then a new `assistant_turn`, or just a new `assistant_turn` (if not edited). The previous `assistant_turn` is marked `superseded_by` the new one. Display hides superseded turns.

**Files:**
- Create: `chat/services/regenerate.py`
- Modify: `chat/web/turns.py`
- Modify: `chat/templates/chat.html` (regenerate button + edit-state HTMX swaps)
- Create: `tests/test_regenerate.py`

**Test:** regenerate without edit → new `assistant_turn`, prior superseded, projected state reflects new only. With edit → also a `user_turn_edit` event.

**Commit:** `feat: regenerate with edit-then-regenerate inline UX`

---

### Task 30: Reset bot (hard confirm)

`/bots//reset` → modal requiring you to type the bot's name. On confirm: emit `bot_reset` event. Handler purges the bot's chat_state, scenes, containers, activities, memories, edges-involving-this-bot. Identity, initial-relationship, kickoff prose preserved. Chat sits ready (no auto kickoff replay; next user message triggers it).
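
The purge could be sketched as below, using the table names from the Task 9 schema. This is an illustration under assumptions: `confirm_matches` and `purge_bot_state` are hypothetical names, exact-match confirmation is an assumed policy, and memories and edges are omitted because their table names are not shown in this section.

```python
import sqlite3


def confirm_matches(typed: str, bot_name: str) -> bool:
    """Hard confirm: typed text must equal the bot's name exactly (after trimming)."""
    return typed.strip() == bot_name


def purge_bot_state(db: sqlite3.Connection, bot_id: str) -> None:
    """Delete transient state for chats hosted by bot_id.

    The bots and chats rows are left untouched, so identity survives and the
    chat sits ready for the next user message.
    """
    # Materialize the ids first so the cursor is not open during the deletes.
    chat_ids = [
        row[0]
        for row in db.execute("SELECT id FROM chats WHERE host_bot_id = ?", (bot_id,))
    ]
    for chat_id in chat_ids:
        # Scenes reference containers, so scenes are removed first.
        db.execute("DELETE FROM scenes WHERE chat_id = ?", (chat_id,))
        db.execute("DELETE FROM containers WHERE chat_id = ?", (chat_id,))
        db.execute("DELETE FROM chat_state WHERE chat_id = ?", (chat_id,))
    db.execute("DELETE FROM activity WHERE entity_id = ?", (bot_id,))
    db.commit()
```

In the event-sourced flow this function would be the body of the `bot_reset` handler, so replaying the log reproduces the purge.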
**Files:**
- Create: `chat/services/reset.py`
- Modify: `chat/web/bots.py`
- Modify: `chat/templates/bot_list.html` (reset button)
- Create: `tests/test_reset.py`

**Test:** play, reset, assert all transient state for that bot is gone, identity remains.

**Commit:** `feat: bot reset with hard confirm and event-driven purge`

---

## Phase 1H: Ops & polish

### Task 31: Periodic snapshots

Every 100 events OR every 30 minutes since last snapshot, write a full-state JSON to `data/snapshots/periodic/`. Retain last 5. On cold load (app start), if a periodic snapshot exists, apply it then replay events past it.

**Files:**
- Modify: `chat/services/snapshot.py`
- Modify: `chat/services/background.py` (periodic timer)
- Create: `tests/test_snapshot.py`

**Commit:** `feat: periodic snapshots with retention and cold-load fast-path`

---

### Task 32: Nightly backups

Simple in-process scheduler: at 03:00 local time daily, copy `chat.db` to `data/backups/chat-.db`. Retain last 14. Suitable for v1; launchd plist can replace later.

**Files:**
- Create: `chat/services/backup.py`
- Modify: `chat/services/background.py`
- Create: `tests/test_backup.py`

**Commit:** `feat: nightly DB backups with 14-day retention`

---

### Task 33: Display formatting

Renderer for transcript turns. Lightweight markdown (paragraphs, italic, bold, blockquotes; no headings/code). `*action*` rendered as italic in narrative output. OOC `((parens))` rendered dimmed/italic/smaller, never sent to bot. Speaker labels bold.

**Files:**
- Create: `chat/web/render.py`
- Modify: `chat/templates/chat.html` (use render filters)
- Modify: `chat/static/app.css`
- Create: `tests/test_render.py`

**Test:** input prose with all marker types → expected HTML output.
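
A minimal sketch of the inline part of such a renderer is shown here, covering only `*action*`, `**bold**`, and OOC `((parens))`; paragraphs and blockquotes are left out. The `render_inline` name and the `ooc` CSS class are assumptions, not names fixed by this plan.

```python
import html
import re

OOC_RE = re.compile(r"\(\((.+?)\)\)")
BOLD_RE = re.compile(r"\*\*(.+?)\*\*")
ITALIC_RE = re.compile(r"\*(.+?)\*")


def render_inline(text: str) -> str:
    """Escape HTML first, then apply OOC, bold, and italic markers in that order."""
    # Escaping before substitution means user-typed tags can never reach the page.
    out = html.escape(text, quote=False)
    out = OOC_RE.sub(r'<span class="ooc">((\1))</span>', out)
    # Bold runs before italic so that ** is not consumed as two italic markers.
    out = BOLD_RE.sub(r"<strong>\1</strong>", out)
    out = ITALIC_RE.sub(r"<em>\1</em>", out)
    return out
```

Registering the function as a Jinja2 filter would require marking its output as safe, which is acceptable only because escaping happens inside the function before any tags are added.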
**Commit:** `feat: transcript display formatting with markdown and OOC styling`

---

### Task 34: Streaming UX (typing indicator, Stop, mid-stream disconnect)

Stop button on streaming bot row aborts the in-flight Featherless request and commits partial as `assistant_turn` with `truncated=true`. SSE client handles disconnect: server detects channel close, commits whatever was streamed, surfaces "connection lost — partial response saved" banner with Regenerate button. Send button disabled while streaming.

**Files:**
- Modify: `chat/web/turns.py`
- Modify: `chat/templates/chat.html`
- Modify: `chat/static/app.css`
- Create: `tests/test_streaming_ux.py`

**Commit:** `feat: streaming UX with Stop, disconnect handling, send-lock`

---

### Task 35: Error UX banners + first-run flow

Error banners (per §16.5): Featherless 401/429/5xx surface inline with Retry. DB write failures show modal-blocking error. Schema migration failure on startup logs to stderr and exits non-zero.

First-run flow: if `you_entity` missing, redirect to `/settings` after first navigation. If `bots` empty, after settings save, redirect to `/bots/new`. After bot creation + kickoff confirm, land in chat.

**Files:**
- Create: `chat/web/middleware.py` (first-run redirect)
- Create: `chat/templates/errors.html`
- Modify: `chat/web/turns.py` (catch Featherless errors)
- Modify: `chat/app.py` (mount middleware, error handlers)
- Create: `tests/test_first_run.py`
- Create: `tests/test_error_ux.py`

**Commit:** `feat: error banners and first-run navigation flow`

---

## Wrap-up

After T35, run the full test suite and a manual smoke test:

```bash
pytest -v
uvicorn chat.app:app --reload
# In a browser: walk through first-run, author a bot with kickoff,
# play 10 turns, open the drawer, edit an edge, close a scene, rewind, regenerate.
# Open a second tab on the same chat, verify multi-tab sync.
```

Update CLAUDE.md to reflect the v1 surface that actually shipped (any tasks deferred to Phase 1.5, any choices that shifted during implementation).

Merge `phase-1` into `main` with a single squash commit referencing this plan.

---

## Notes for the executor

- **Verify before claiming done** (`superpowers-extended-cc:verification-before-completion`): every task ends with running its test command and reading the output. "Tests should pass" is not enough; show the green output.
- **DRY ruthlessly** but don't pre-extract: if two tasks need similar code, inline both first, then refactor in a third commit. Premature abstraction breaks the TDD rhythm.
- **YAGNI**: don't pre-build for Phase 2 (multi-bot, guests, group node) until those tasks exist.
- **Frequent commits**: one per task minimum, more if a task naturally splits.
- **Don't bypass the event log.** Any state change goes through an event. If a test wants to seed state directly, it's still appending events and projecting, not `INSERT INTO bots` directly. (Exception: schema migrations themselves.)
- **API key safety**: never log the Featherless API key, never write it to event payloads, never include it in error messages.
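
One way to back the API key rule with a safety net is a logging filter that scrubs the key from every record before it is written. The sketch below is an assumption about how that could be done with the standard `logging` module; `RedactSecretFilter` is a hypothetical name, and it complements rather than replaces the habit of never passing the key to a logger.

```python
import logging


class RedactSecretFilter(logging.Filter):
    """Replace a known secret with a placeholder in every log record."""

    def __init__(self, secret: str) -> None:
        super().__init__()
        self._secret = secret

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() returns msg with its args already interpolated.
        message = record.getMessage()
        if self._secret and self._secret in message:
            record.msg = message.replace(self._secret, "[REDACTED]")
            record.args = ()  # args are already baked into msg
        return True  # never drop the record, only scrub it
```

Attaching the filter to each handler (`handler.addFilter(RedactSecretFilter(api_key))`) rather than to a single logger means records propagated from child loggers are scrubbed as well.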