Roleplay Engine — Phase 1 Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Goal: Build the v1 (Phase 1) roleplay engine end-to-end — a local-first FastAPI + HTMX app with single-bot chats, persistent bot-owned memory, per-chat clocks, an event-sourced SQLite backend, multi-tab SSE streaming, drawer state surface, rewind / regenerate / reset, and Featherless inference (narrative + classifier models).

Architecture: Python 3.11+ FastAPI server, SQLite (single file, WAL mode) projected from an append-only event log. Featherless OpenAI-compatible client behind an LLMClient interface. Per-chat in-process pub/sub queue broadcasts state changes over SSE to all subscribed browser tabs. State changes always go through events; the projector applies them. TDD: every task starts with a failing test.
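The per-chat pub/sub fan-out described above could look roughly like the sketch below. `ChatBroker`, its methods, and `format_sse` are hypothetical names that do not appear elsewhere in this plan; the sketch assumes one `asyncio.Queue` per subscribed browser tab and leaves the actual SSE endpoint out.

```python
import asyncio
from collections import defaultdict


class ChatBroker:
    """Hypothetical in-process fan-out: one asyncio.Queue per subscribed tab."""

    def __init__(self) -> None:
        self._subscribers: dict[str, set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, chat_id: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[chat_id].add(queue)
        return queue

    def unsubscribe(self, chat_id: str, queue: asyncio.Queue) -> None:
        self._subscribers[chat_id].discard(queue)

    def publish(self, chat_id: str, message: dict) -> int:
        """Deliver message to every tab subscribed to chat_id; return the count."""
        # Copy first so a tab that unsubscribes mid-publish cannot break iteration.
        queues = list(self._subscribers.get(chat_id, ()))
        for queue in queues:
            queue.put_nowait(message)
        return len(queues)


def format_sse(event: str, data: str) -> str:
    """Encode one Server-Sent Events frame (single-line data only)."""
    return f"event: {event}\ndata: {data}\n\n"
```

An SSE endpoint would presumably call `subscribe` when a tab connects, await `queue.get()` in a loop, and call `unsubscribe` when the client disconnects.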

Tech Stack:

  • Python 3.11+, FastAPI, Uvicorn, HTMX (CDN), Jinja2 templates, vanilla CSS.
  • SQLite (stdlib sqlite3), aiosqlite for async paths where useful.
  • pydantic for state schemas, pydantic-settings for config.
  • instructor (or Featherless-native JSON-mode) for classifier-constrained output via openai SDK pointed at https://api.featherless.ai/v1.
  • tiktoken for token accounting.
  • pytest, pytest-asyncio, httpx (for FastAPI TestClient), freezegun for time tests.

Source-of-truth references:

When a task says "see §X", that's the requirements doc unless stated otherwise.


Pre-flight

Worktree: This is a greenfield repo on main. Branch off into phase-1 before starting:

git checkout -b phase-1

Python env: Use a project-local venv (<repo>/.venv/). Add .venv/ and __pycache__/ to .gitignore in T0.

Featherless API key: Stored in data/config.toml (gitignored). The plan creates an example file in T1; you copy it and paste in your real key locally.

TDD discipline: Every task starts with a failing test. Don't skip step 2 ("run to verify it fails"). If the test passes before implementation, the test is wrong — fix the test first.

Commit cadence: One commit per task. Commit messages use feat:, chore:, test:, docs: prefixes.

Verification before claiming done: Use superpowers-extended-cc:verification-before-completion — run the test command and read its actual output. Do not claim a task complete on hope.


Phase 1A: Foundation

Task 0: Project skeleton

Files:

  • Create: pyproject.toml
  • Create: .python-version
  • Create: chat/__init__.py
  • Create: chat/app.py
  • Create: tests/__init__.py
  • Create: tests/test_health.py
  • Modify: .gitignore (add .venv/, __pycache__/, *.pyc, .pytest_cache/)

Step 1: Write the failing test

# tests/test_health.py
from fastapi.testclient import TestClient
from chat.app import app

def test_health_endpoint_returns_ok():
    client = TestClient(app)
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}

Step 2: Run test to verify it fails

python -m venv .venv && source .venv/bin/activate
pip install fastapi "uvicorn[standard]" httpx pytest pytest-asyncio
pytest tests/test_health.py -v

Expected: ImportError on chat.app (module doesn't exist).

Step 3: Write minimal implementation

# chat/app.py
from fastapi import FastAPI

app = FastAPI(title="chat")

@app.get("/health")
def health():
    return {"status": "ok"}

pyproject.toml minimum:

[project]
name = "chat"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.110",
    "uvicorn[standard]>=0.30",
    "httpx>=0.27",
    "pydantic>=2.6",
    "pydantic-settings>=2.2",
    "openai>=1.30",
    "instructor>=1.3",
    "tiktoken>=0.7",
    "jinja2>=3.1",
    "aiosqlite>=0.20",
]

[project.optional-dependencies]
dev = ["pytest>=8", "pytest-asyncio>=0.23", "freezegun>=1.4"]

[tool.pytest.ini_options]
pythonpath = ["."]
asyncio_mode = "auto"

Step 4: Run test to verify it passes

pip install -e ".[dev]"
pytest tests/test_health.py -v

Expected: 1 passed.

Step 5: Commit

git add pyproject.toml .python-version chat/ tests/ .gitignore
git commit -m "feat: project skeleton with health endpoint"

Task 1: Config loading

Loads data/config.toml (or the path in the CHAT_CONFIG_PATH env var), honors the CHAT_DB_PATH env var override, and exposes a Settings pydantic model. See requirements §3 / §12.

Files:

  • Create: chat/config.py
  • Create: data/config.example.toml
  • Create: tests/test_config.py

Step 1: Write the failing test

# tests/test_config.py
import os
from pathlib import Path
import pytest
from chat.config import load_settings

def test_load_settings_reads_toml(tmp_path, monkeypatch):
    cfg = tmp_path / "config.toml"
    cfg.write_text("""
        featherless_api_key = "sk-test"
        narrative_model = "dphn/Dolphin-Mistral-24B-Venice-Edition"
        classifier_model = "NousResearch/Hermes-3-Llama-3.1-8B"
        ooc_marker = "(("
        retrieval_k = 4
    """)
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    s = load_settings()
    assert s.featherless_api_key == "sk-test"
    assert s.narrative_model.startswith("dphn/")
    assert s.retrieval_k == 4

def test_chat_db_path_env_overrides_default(tmp_path, monkeypatch):
    monkeypatch.setenv("CHAT_DB_PATH", str(tmp_path / "alt.db"))
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(tmp_path / "config.toml"))
    (tmp_path / "config.toml").write_text('featherless_api_key = "x"\n')
    s = load_settings()
    assert s.db_path == tmp_path / "alt.db"

Step 2: Run test to verify it fails

pytest tests/test_config.py -v

Expected: ImportError or AttributeError.

Step 3: Write minimal implementation

# chat/config.py
from __future__ import annotations
import os
import tomllib
from pathlib import Path
from pydantic import BaseModel, Field

REPO_ROOT = Path(__file__).resolve().parent.parent
DEFAULT_CONFIG = REPO_ROOT / "data" / "config.toml"
DEFAULT_DB = REPO_ROOT / "data" / "chat.db"

class Settings(BaseModel):
    featherless_api_key: str
    featherless_base_url: str = "https://api.featherless.ai/v1"
    narrative_model: str = "dphn/Dolphin-Mistral-24B-Venice-Edition"
    classifier_model: str = "NousResearch/Hermes-3-Llama-3.1-8B"
    classifier_fallbacks: list[str] = Field(
        default_factory=lambda: [
            "cognitivecomputations/dolphin-2.9.4-llama3-8b",
            "mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated",
        ]
    )
    ooc_marker: str = "(("
    retrieval_k: int = 4
    narrative_budget_hard: int = 8000
    narrative_budget_soft: int = 6000
    classifier_budget_hard: int = 4000
    classifier_timeout_s: float = 10.0
    db_path: Path = DEFAULT_DB
    data_dir: Path = REPO_ROOT / "data"
    bind_host: str = "127.0.0.1"
    bind_port: int = 8000

def load_settings() -> Settings:
    config_path = Path(os.environ.get("CHAT_CONFIG_PATH", DEFAULT_CONFIG))
    raw: dict = {}
    if config_path.exists():
        raw = tomllib.loads(config_path.read_text())
    if "CHAT_DB_PATH" in os.environ:
        raw["db_path"] = Path(os.environ["CHAT_DB_PATH"])
    return Settings(**raw)

data/config.example.toml:

# Copy this file to data/config.toml and fill in your API key.
featherless_api_key = "REPLACE_ME"
narrative_model = "dphn/Dolphin-Mistral-24B-Venice-Edition"
classifier_model = "NousResearch/Hermes-3-Llama-3.1-8B"
ooc_marker = "(("
retrieval_k = 4

Step 4: Run test to verify it passes

pytest tests/test_config.py -v

Expected: 2 passed.

Step 5: Commit

git add chat/config.py data/config.example.toml tests/test_config.py
git commit -m "feat: config loader with toml + env override"

Task 2: SQLite migrations framework

Establishes a forward-only migration runner reading SQL files from chat/db/migrations/, tracked in a meta table (key/value).

Files:

  • Create: chat/db/__init__.py
  • Create: chat/db/connection.py
  • Create: chat/db/migrate.py
  • Create: chat/db/migrations/0001_init_meta.sql
  • Create: tests/test_migrate.py

Step 1: Write the failing test

# tests/test_migrate.py
from chat.db.connection import open_db
from chat.db.migrate import apply_migrations

def test_apply_migrations_creates_meta_table(tmp_path):
    db = tmp_path / "test.db"
    apply_migrations(db)
    with open_db(db) as conn:
        row = conn.execute(
            "SELECT value FROM meta WHERE key = 'schema_version'"
        ).fetchone()
        assert row is not None
        assert int(row[0]) >= 1

def test_apply_migrations_idempotent(tmp_path):
    db = tmp_path / "test.db"
    apply_migrations(db)
    apply_migrations(db)  # second call must be a no-op
    with open_db(db) as conn:
        count = conn.execute("SELECT COUNT(*) FROM meta").fetchone()[0]
        assert count == 1

Step 2: Run test to verify it fails

pytest tests/test_migrate.py -v

Expected: ImportError.

Step 3: Write minimal implementation

# chat/db/connection.py
from __future__ import annotations
import sqlite3
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def open_db(path: Path):
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    try:
        yield conn
        conn.commit()
    finally:
        conn.close()

# chat/db/migrate.py
from __future__ import annotations
from pathlib import Path
from chat.db.connection import open_db

MIGRATIONS_DIR = Path(__file__).parent / "migrations"

def apply_migrations(db_path: Path) -> None:
    with open_db(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)"
        )
        cur = conn.execute("SELECT value FROM meta WHERE key = 'schema_version'")
        row = cur.fetchone()
        current = int(row[0]) if row else 0
        for path in sorted(MIGRATIONS_DIR.glob("*.sql")):
            version = int(path.stem.split("_", 1)[0])
            if version <= current:
                continue
            sql = path.read_text()
            conn.executescript(sql)
            conn.execute(
                "INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', ?)",
                (str(version),),
            )

-- chat/db/migrations/0001_init_meta.sql
-- meta table is created by the migrate runner; this migration is a marker.
SELECT 1;
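The runner above orders migrations by filename, which matches numeric order only while every prefix is zero-padded to the same width. As a hedged sketch, the same forward-only selection can be done on the parsed version number instead. `migration_version`, `pending_migrations`, and the file `0010_later.sql` are illustrative names, not part of the plan.

```python
from pathlib import Path


def migration_version(path: Path) -> int:
    """Parse the numeric prefix, e.g. 0003_event_log.sql -> 3."""
    return int(path.stem.split("_", 1)[0])


def pending_migrations(paths: list[Path], current: int) -> list[Path]:
    """Return migrations newer than `current`, ordered by numeric version."""
    return sorted(
        (p for p in paths if migration_version(p) > current),
        key=migration_version,
    )


# Hypothetical directory listing, deliberately out of order.
files = [
    Path("0010_later.sql"),
    Path("0002_classifier_failures.sql"),
    Path("0001_init_meta.sql"),
]
todo = pending_migrations(files, current=1)
```

With `current=1`, the already-applied `0001_init_meta.sql` is skipped and the remaining two files come back in version order.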

Step 4: Run test to verify it passes

pytest tests/test_migrate.py -v

Expected: 2 passed.

Step 5: Commit

git add chat/db/ tests/test_migrate.py
git commit -m "feat: sqlite migration runner with meta version table"

Task 3: Featherless client with mock

Defines an LLMClient protocol with generate(messages, *, model, **params) and stream(messages, *, model, **params). Structured output is handled by the classify wrapper in Task 4. Implementations: FeatherlessClient (real), MockLLMClient (test).

Files:

  • Create: chat/llm/__init__.py
  • Create: chat/llm/client.py
  • Create: chat/llm/featherless.py
  • Create: chat/llm/mock.py
  • Create: tests/test_llm_mock.py

Step 1: Write the failing test

# tests/test_llm_mock.py
import pytest
from chat.llm.mock import MockLLMClient
from chat.llm.client import Message

@pytest.mark.asyncio
async def test_mock_returns_canned_response():
    client = MockLLMClient(canned=["Hello, world."])
    msgs = [Message(role="user", content="hi")]
    out = await client.generate(msgs, model="any")
    assert out == "Hello, world."

@pytest.mark.asyncio
async def test_mock_streams_tokens():
    client = MockLLMClient(canned=["abcd"])
    msgs = [Message(role="user", content="hi")]
    chunks = []
    async for chunk in client.stream(msgs, model="any"):
        chunks.append(chunk)
    assert "".join(chunks) == "abcd"

Step 2: Run test to verify it fails

pytest tests/test_llm_mock.py -v

Expected: ImportError.

Step 3: Write minimal implementation

# chat/llm/client.py
from __future__ import annotations
from dataclasses import dataclass
from typing import Protocol, AsyncIterator, Sequence

@dataclass
class Message:
    role: str  # "system" | "user" | "assistant"
    content: str

class LLMClient(Protocol):
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str: ...
    def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]: ...

# chat/llm/mock.py
from __future__ import annotations
from typing import AsyncIterator, Sequence
from .client import Message

class MockLLMClient:
    def __init__(self, canned: list[str]):
        self._canned = list(canned)
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
        return self._canned.pop(0)
    async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
        text = self._canned.pop(0)
        for ch in text:
            yield ch

# chat/llm/featherless.py
from __future__ import annotations
from typing import AsyncIterator, Sequence
from openai import AsyncOpenAI
from .client import Message

class FeatherlessClient:
    def __init__(self, api_key: str, base_url: str = "https://api.featherless.ai/v1"):
        self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)
    async def generate(self, messages: Sequence[Message], *, model: str, **params) -> str:
        resp = await self._client.chat.completions.create(
            model=model,
            messages=[{"role": m.role, "content": m.content} for m in messages],
            **params,
        )
        return resp.choices[0].message.content or ""
    async def stream(self, messages: Sequence[Message], *, model: str, **params) -> AsyncIterator[str]:
        stream = await self._client.chat.completions.create(
            model=model,
            messages=[{"role": m.role, "content": m.content} for m in messages],
            stream=True,
            **params,
        )
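        async for chunk in stream:
            # Some chunks (for example a trailing usage chunk) may carry no choices.
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta.content or ""
            if delta:
                yield delta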

Step 4: Run test to verify it passes

pytest tests/test_llm_mock.py -v

Expected: 2 passed.

Step 5: Commit

git add chat/llm/ tests/test_llm_mock.py
git commit -m "feat: LLMClient protocol with Featherless and mock implementations"

Task 4: Classifier service wrapper

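Wraps the classifier model with retry, timeout, and Pydantic-constrained output (per requirements §3.3). Falls back to schema-default on persistent failure. Failures are meant to be logged to the classifier_failures table; the minimal implementation below creates the table but does not yet write to it.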

Files:

  • Create: chat/db/migrations/0002_classifier_failures.sql
  • Create: chat/llm/classify.py
  • Create: tests/test_classify.py

Step 1: Write the failing test

# tests/test_classify.py
import pytest
from pydantic import BaseModel
from chat.llm.mock import MockLLMClient
from chat.llm.classify import classify

class Verdict(BaseModel):
    score: int
    reason: str

@pytest.mark.asyncio
async def test_classify_parses_valid_json():
    mock = MockLLMClient(canned=['{"score": 2, "reason": "notable"}'])
    result = await classify(mock, model="m", system="x", user="y", schema=Verdict)
    assert result.score == 2

@pytest.mark.asyncio
async def test_classify_falls_back_on_unparseable_after_retry():
    mock = MockLLMClient(canned=["nope", "still nope"])
    default = Verdict(score=1, reason="fallback")
    result = await classify(mock, model="m", system="x", user="y", schema=Verdict, default=default)
    assert result.reason == "fallback"

Step 2: Run test to verify it fails

pytest tests/test_classify.py -v

Expected: ImportError.

Step 3: Write minimal implementation

chat/db/migrations/0002_classifier_failures.sql:

CREATE TABLE classifier_failures (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    model TEXT NOT NULL,
    raw_text TEXT,
    attempt_count INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

chat/llm/classify.py:

from __future__ import annotations
import json
import asyncio
from typing import TypeVar
from pydantic import BaseModel, ValidationError
from .client import LLMClient, Message

T = TypeVar("T", bound=BaseModel)

REFUSAL_PATTERNS = ("i can't", "i cannot", "i'm sorry, but", "as an ai")

async def classify(
    client: LLMClient,
    *,
    model: str,
    system: str,
    user: str,
    schema: type[T],
    default: T | None = None,
    timeout_s: float = 10.0,
) -> T:
    msgs = [
        Message(role="system", content=system + "\n\nRespond with JSON only matching the schema."),
        Message(role="user", content=user),
    ]
    for attempt in range(2):
        try:
            text = await asyncio.wait_for(
                client.generate(msgs, model=model, response_format={"type": "json_object"}),
                timeout=timeout_s,
            )
            if any(p in text.lower()[:80] for p in REFUSAL_PATTERNS) and not text.strip().startswith("{"):
                raise ValueError("refusal-shaped response")
            return schema.model_validate_json(text)
        except (ValidationError, ValueError, json.JSONDecodeError, asyncio.TimeoutError):
            msgs[0] = Message(role="system", content=system + "\n\nRespond with valid JSON ONLY. No prose.")
            continue
    if default is None:
        raise RuntimeError(f"classify failed for schema {schema.__name__} with no default")
    return default
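The task description calls for logging failures to classifier_failures, which the minimal classify() above does not do yet. One way to add it is a small helper like the sketch below, which classify() could call before returning the default. `log_classifier_failure` is a hypothetical name, and storing the schema name in the `kind` column is an assumption; the plan does not define what `kind` holds.

```python
import sqlite3
from typing import Optional


def log_classifier_failure(
    conn: sqlite3.Connection,
    *,
    kind: str,
    model: str,
    raw_text: Optional[str],
    attempt_count: int,
) -> int:
    """Insert one row into classifier_failures and return its id."""
    cur = conn.execute(
        "INSERT INTO classifier_failures (kind, model, raw_text, attempt_count) "
        "VALUES (?, ?, ?, ?)",
        (kind, model, raw_text, attempt_count),
    )
    return int(cur.lastrowid)


# Demo against an in-memory database using the 0002 migration's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE classifier_failures (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,
        model TEXT NOT NULL,
        raw_text TEXT,
        attempt_count INTEGER NOT NULL,
        created_at TEXT NOT NULL DEFAULT (datetime('now'))
    )
    """
)
# Assumption: `kind` holds the schema name that failed to parse.
failure_id = log_classifier_failure(
    conn, kind="Verdict", model="m", raw_text="nope", attempt_count=2
)
```

Wiring this in would mean passing a connection (or a logging callback) into classify(), which changes its signature; that design choice is left open here.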

Step 4: Run test to verify it passes

pytest tests/test_classify.py -v

Expected: 2 passed.

Step 5: Commit

git add chat/llm/classify.py chat/db/migrations/0002_classifier_failures.sql tests/test_classify.py
git commit -m "feat: classifier wrapper with retry, timeout, schema-default fallback"

Phase 1B: Event log & state machine

Task 5: Event log + projector skeleton

Append-only event log with one row per event (id, branch_id, ts, kind, payload_json). Projector framework that dispatches per-kind handlers; initial registry is empty. State changes ALWAYS go through append_event.

Files:

  • Create: chat/db/migrations/0003_event_log.sql
  • Create: chat/eventlog/__init__.py
  • Create: chat/eventlog/log.py
  • Create: chat/eventlog/projector.py
  • Create: tests/test_eventlog.py

Step 1: Write the failing test

# tests/test_eventlog.py
from chat.db.migrate import apply_migrations
from chat.db.connection import open_db
from chat.eventlog.log import append_event, read_events

def test_append_and_read(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        eid = append_event(conn, kind="test_kind", payload={"a": 1})
        assert eid > 0
        rows = list(read_events(conn))
        assert len(rows) == 1
        assert rows[0].kind == "test_kind"
        assert rows[0].payload["a"] == 1

Step 2: Run test to verify it fails

pytest tests/test_eventlog.py -v

Expected: missing migration / module.

Step 3: Write minimal implementation

chat/db/migrations/0003_event_log.sql:

CREATE TABLE event_log (
    id INTEGER PRIMARY KEY,
    branch_id INTEGER NOT NULL DEFAULT 1,
    ts TEXT NOT NULL DEFAULT (datetime('now')),
    kind TEXT NOT NULL,
    payload_json TEXT NOT NULL,
    superseded_by INTEGER REFERENCES event_log(id),
    hidden INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX idx_event_log_branch_kind ON event_log(branch_id, kind);

chat/eventlog/log.py:

from __future__ import annotations
import json
from dataclasses import dataclass
from typing import Any, Iterator
from sqlite3 import Connection

@dataclass
class Event:
    id: int
    branch_id: int
    ts: str
    kind: str
    payload: dict[str, Any]
    superseded_by: int | None
    hidden: bool

def append_event(conn: Connection, *, kind: str, payload: dict[str, Any], branch_id: int = 1) -> int:
    cur = conn.execute(
        "INSERT INTO event_log (branch_id, kind, payload_json) VALUES (?, ?, ?)",
        (branch_id, kind, json.dumps(payload)),
    )
    return cur.lastrowid

def read_events(conn: Connection, branch_id: int = 1, after_id: int = 0) -> Iterator[Event]:
    cur = conn.execute(
        "SELECT id, branch_id, ts, kind, payload_json, superseded_by, hidden "
        "FROM event_log WHERE branch_id = ? AND id > ? AND hidden = 0 "
        "AND superseded_by IS NULL ORDER BY id",
        (branch_id, after_id),
    )
    for row in cur:
        yield Event(
            id=row[0], branch_id=row[1], ts=row[2], kind=row[3],
            payload=json.loads(row[4]), superseded_by=row[5], hidden=bool(row[6]),
        )

chat/eventlog/projector.py:

from __future__ import annotations
from collections.abc import Callable
from sqlite3 import Connection
from .log import Event, read_events

Handler = Callable[[Connection, Event], None]
_REGISTRY: dict[str, Handler] = {}

def on(kind: str):
    def deco(fn: Handler) -> Handler:
        _REGISTRY[kind] = fn
        return fn
    return deco

def project(conn: Connection, branch_id: int = 1) -> None:
    for event in read_events(conn, branch_id=branch_id):
        h = _REGISTRY.get(event.kind)
        if h:
            h(conn, event)

def apply_event(conn: Connection, event: Event) -> None:
    h = _REGISTRY.get(event.kind)
    if h:
        h(conn, event)

Step 4: Run test to verify it passes

pytest tests/test_eventlog.py -v

Expected: 1 passed.

Step 5: Commit

git add chat/eventlog/ chat/db/migrations/0003_event_log.sql tests/test_eventlog.py
git commit -m "feat: append-only event log with projector skeleton"

Task 6: Bot + You entity schemas and events

Adds bots and you_entity projected tables, bot_authored and you_authored event kinds. Identity is immutable per session — re-authoring writes a new event.

Files:

  • Create: chat/db/migrations/0004_entities.sql
  • Create: chat/state/__init__.py
  • Create: chat/state/entities.py
  • Modify: chat/eventlog/projector.py (import handlers; place the import after on is defined, since chat/state/entities.py imports on from this module)
  • Create: tests/test_entities.py

Step 1: Write the failing test

# tests/test_entities.py
from chat.db.migrate import apply_migrations
from chat.db.connection import open_db
from chat.eventlog.log import append_event
from chat.eventlog.projector import project
from chat.state.entities import get_bot, list_bots, get_you
import chat.state.entities  # registers handlers

def test_bot_authored_creates_bot_row(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        append_event(conn, kind="bot_authored", payload={
            "id": "bot_a", "name": "BotA",
            "persona": "...", "voice_samples": ["sample"], "traits": ["shy"],
            "backstory": "...",
            "initial_relationship_to_you": "coworker",
            "kickoff_prose": "you stay late",
        })
        project(conn)
        bot = get_bot(conn, "bot_a")
        assert bot is not None
        assert bot["name"] == "BotA"
        assert bot["traits"] == ["shy"]
        assert "bot_a" in [b["id"] for b in list_bots(conn)]

def test_you_authored_creates_you_singleton(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        append_event(conn, kind="you_authored", payload={
            "name": "Me", "pronouns": "they/them", "persona": "engineer",
        })
        project(conn)
        you = get_you(conn)
        assert you is not None
        assert you["name"] == "Me"

Step 2: Run, verify fail.

Step 3: Implementation.

chat/db/migrations/0004_entities.sql:

CREATE TABLE bots (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    persona TEXT NOT NULL,
    voice_samples_json TEXT NOT NULL DEFAULT '[]',
    traits_json TEXT NOT NULL DEFAULT '[]',
    backstory TEXT NOT NULL DEFAULT '',
    initial_relationship_to_you TEXT NOT NULL DEFAULT '',
    kickoff_prose TEXT NOT NULL DEFAULT '',
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE TABLE you_entity (
    id INTEGER PRIMARY KEY CHECK (id = 1),
    name TEXT NOT NULL,
    pronouns TEXT NOT NULL DEFAULT '',
    persona TEXT NOT NULL DEFAULT ''
);

chat/state/entities.py:

from __future__ import annotations
import json
from sqlite3 import Connection
from chat.eventlog.projector import on
from chat.eventlog.log import Event

@on("bot_authored")
def _apply_bot_authored(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "INSERT OR REPLACE INTO bots "
        "(id, name, persona, voice_samples_json, traits_json, backstory, "
        " initial_relationship_to_you, kickoff_prose) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (p["id"], p["name"], p["persona"],
         json.dumps(p.get("voice_samples", [])),
         json.dumps(p.get("traits", [])),
         p.get("backstory", ""),
         p.get("initial_relationship_to_you", ""),
         p.get("kickoff_prose", "")),
    )

@on("you_authored")
def _apply_you_authored(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "INSERT OR REPLACE INTO you_entity (id, name, pronouns, persona) VALUES (1, ?, ?, ?)",
        (p["name"], p.get("pronouns", ""), p.get("persona", "")),
    )

def get_bot(conn: Connection, bot_id: str) -> dict | None:
    row = conn.execute("SELECT * FROM bots WHERE id = ?", (bot_id,)).fetchone()
    if not row:
        return None
    cols = [c[1] for c in conn.execute("PRAGMA table_info(bots)").fetchall()]
    d = dict(zip(cols, row))
    d["voice_samples"] = json.loads(d.pop("voice_samples_json"))
    d["traits"] = json.loads(d.pop("traits_json"))
    return d

def list_bots(conn: Connection) -> list[dict]:
    cur = conn.execute("SELECT id, name FROM bots ORDER BY name")
    return [{"id": r[0], "name": r[1]} for r in cur]

def get_you(conn: Connection) -> dict | None:
    row = conn.execute("SELECT name, pronouns, persona FROM you_entity WHERE id = 1").fetchone()
    if not row:
        return None
    return {"name": row[0], "pronouns": row[1], "persona": row[2]}

Step 4: Run, verify pass.

Step 5: Commit.

git add chat/db/migrations/0004_entities.sql chat/state/ tests/test_entities.py
git commit -m "feat: bot and you entity schemas with projector handlers"

Task 7: Edges schema + per-turn deltas

Per requirements §3.4. Edges table holds per-pair directed state. edge_update event applies deltas (affinity, trust, knowledge_facts, last_interaction). Summary rewrites are a separate event kind written at scene close (T27).

Files:

  • Create: chat/db/migrations/0005_edges.sql
  • Create: chat/state/edges.py
  • Create: tests/test_edges.py

Test sketch:

def test_edge_update_applies_affinity_delta(tmp_path):
    # bot_authored, you_authored, then edge_update with affinity_delta=+5
    # assert edges row exists with affinity=initial+5
    ...  # placeholder so the sketch parses; replace with the real test body

Implementation sketch:

CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    chat_id TEXT,                       -- null for default initial seed
    source_id TEXT NOT NULL,
    target_id TEXT NOT NULL,
    affinity INTEGER NOT NULL DEFAULT 50,
    trust INTEGER NOT NULL DEFAULT 50,
    summary TEXT NOT NULL DEFAULT '',
    knowledge_json TEXT NOT NULL DEFAULT '[]',
    last_interaction_chat_id TEXT,
    last_interaction_at TEXT,
    UNIQUE (source_id, target_id)
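);

from chat.eventlog.projector import on

@on("edge_update")
def _apply_edge_update(conn, e):
    p = e.payload
    # upsert + apply deltas; clamp affinity/trust to 0..100
    # append knowledge_facts if any
    # bump last_interaction fields
    ...  # placeholder so the sketch parses; replace with the real handler body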
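The three commented steps in the handler sketch (upsert, clamped deltas, knowledge append, last-interaction bump) might be filled in roughly as follows. This is a sketch under assumptions: the payload keys `source_id`, `target_id`, `trust_delta`, `knowledge_facts`, `chat_id`, and `at` are guesses, since the plan only names `affinity_delta`, and `apply_edge_update` takes the payload directly rather than an Event so the example stays self-contained.

```python
import json
import sqlite3


def clamp(value: int, lo: int = 0, hi: int = 100) -> int:
    return max(lo, min(hi, value))


def apply_edge_update(conn: sqlite3.Connection, payload: dict) -> None:
    """Upsert the (source_id, target_id) edge, then apply the deltas."""
    src, dst = payload["source_id"], payload["target_id"]
    # Create the row with the schema defaults (50/50) if it does not exist yet.
    conn.execute(
        "INSERT OR IGNORE INTO edges (source_id, target_id) VALUES (?, ?)",
        (src, dst),
    )
    affinity, trust, knowledge_json = conn.execute(
        "SELECT affinity, trust, knowledge_json FROM edges "
        "WHERE source_id = ? AND target_id = ?",
        (src, dst),
    ).fetchone()
    knowledge = json.loads(knowledge_json) + payload.get("knowledge_facts", [])
    conn.execute(
        "UPDATE edges SET affinity = ?, trust = ?, knowledge_json = ?, "
        "last_interaction_chat_id = ?, last_interaction_at = ? "
        "WHERE source_id = ? AND target_id = ?",
        (
            clamp(affinity + payload.get("affinity_delta", 0)),
            clamp(trust + payload.get("trust_delta", 0)),
            json.dumps(knowledge),
            payload.get("chat_id"),
            payload.get("at"),
            src,
            dst,
        ),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE edges (
        id INTEGER PRIMARY KEY,
        chat_id TEXT,
        source_id TEXT NOT NULL,
        target_id TEXT NOT NULL,
        affinity INTEGER NOT NULL DEFAULT 50,
        trust INTEGER NOT NULL DEFAULT 50,
        summary TEXT NOT NULL DEFAULT '',
        knowledge_json TEXT NOT NULL DEFAULT '[]',
        last_interaction_chat_id TEXT,
        last_interaction_at TEXT,
        UNIQUE (source_id, target_id)
    )
    """
)
apply_edge_update(conn, {"source_id": "bot_a", "target_id": "you", "affinity_delta": 5})
apply_edge_update(
    conn,
    {"source_id": "bot_a", "target_id": "you", "affinity_delta": 80,
     "knowledge_facts": ["works late"]},
)
row = conn.execute("SELECT affinity, trust, knowledge_json FROM edges").fetchone()
```

Because the handler adds deltas rather than setting absolute values, replaying the same event twice changes the result. A projector that re-reads the whole log on each call would need to account for that.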

Commit: feat: directed edges with per-turn delta projector


Task 8: Memory schema + witness flag

Memories are bot-owned. Witnessed-by mask stored per memory.

Files:

  • Create: chat/db/migrations/0006_memories.sql
  • Create: chat/state/memory.py
  • Create: tests/test_memory.py

Schema:

CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    owner_id TEXT NOT NULL,            -- bot id whose POV this is
    chat_id TEXT NOT NULL,
    scene_id INTEGER,
    pov_summary TEXT NOT NULL,
    witness_you INTEGER NOT NULL,
    witness_host INTEGER NOT NULL,
    witness_guest INTEGER NOT NULL,
    chat_clock_at TEXT,
    source TEXT,                        -- e.g. "direct" | "told_by:bot_id"
    reliability REAL NOT NULL DEFAULT 1.0,
    significance INTEGER NOT NULL DEFAULT 1,
    pinned INTEGER NOT NULL DEFAULT 0,
    auto_pinned INTEGER NOT NULL DEFAULT 0,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX idx_memories_owner ON memories(owner_id);

-- FTS5 index on pov_summary; owner scoping happens by joining back to memories on rowid
CREATE VIRTUAL TABLE memories_fts USING fts5(
    pov_summary, content='memories', content_rowid='id'
);

CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, pov_summary) VALUES (new.id, new.pov_summary);
END;
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, pov_summary)
        VALUES('delete', old.id, old.pov_summary);
    INSERT INTO memories_fts(rowid, pov_summary) VALUES (new.id, new.pov_summary);
END;
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, pov_summary)
        VALUES('delete', old.id, old.pov_summary);
END;

memory_written event handler + helper functions:

@on("memory_written")
def _apply_memory_written(conn, e): ...

def get_pinned(conn, owner_id) -> list[dict]: ...
def search_memories(conn, owner_id: str, witness_role: str, query: str, k: int = 4) -> list[dict]:
    """FTS5 search filtered by witness bit. witness_role in {'you','host','guest'}."""

Tests: write a memory event with witness [1,1,0], assert search returns it for owner; assert search filtered by witness_guest=1 excludes it.
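A witness-filtered FTS5 lookup along the lines of search_memories could be sketched as below. The demo uses a trimmed copy of the memories schema (only the columns the query touches) and only the insert trigger. The mapping from `witness_role` to a column name is an assumption about how the role string is meant to be used.

```python
import sqlite3

# Whitelist: the column name is interpolated into SQL, so never take it from input.
WITNESS_COLUMNS = {
    "you": "witness_you",
    "host": "witness_host",
    "guest": "witness_guest",
}


def search_memories(conn, owner_id: str, witness_role: str, query: str, k: int = 4):
    """FTS5 search over one owner's memories, filtered by a witness bit."""
    column = WITNESS_COLUMNS[witness_role]  # KeyError on an unknown role
    rows = conn.execute(
        "SELECT m.id, m.pov_summary FROM memories_fts "
        "JOIN memories AS m ON m.id = memories_fts.rowid "
        f"WHERE memories_fts MATCH ? AND m.owner_id = ? AND m.{column} = 1 "
        "ORDER BY memories_fts.rank LIMIT ?",
        (query, owner_id, k),
    ).fetchall()
    return [{"id": r[0], "pov_summary": r[1]} for r in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        owner_id TEXT NOT NULL,
        pov_summary TEXT NOT NULL,
        witness_you INTEGER NOT NULL,
        witness_host INTEGER NOT NULL,
        witness_guest INTEGER NOT NULL
    );
    CREATE VIRTUAL TABLE memories_fts USING fts5(
        pov_summary, content='memories', content_rowid='id'
    );
    CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
        INSERT INTO memories_fts(rowid, pov_summary) VALUES (new.id, new.pov_summary);
    END;
    """
)
# Witness mask [you=1, host=1, guest=0], matching the test described above.
conn.execute(
    "INSERT INTO memories "
    "(owner_id, pov_summary, witness_you, witness_host, witness_guest) "
    "VALUES ('bot_a', 'We stayed late at the office', 1, 1, 0)"
)
hits_host = search_memories(conn, "bot_a", "host", "office")
hits_guest = search_memories(conn, "bot_a", "guest", "office")
```

The `query` argument is passed to MATCH as FTS5 query syntax, so free-form user text containing quotes or operators may need escaping before it reaches this function.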

Commit: feat: memory schema with witness flags and FTS5 index


Task 9: Activity, container, scene, chat schemas

Adds the per-chat structural tables: chats, chat_state, containers, scenes, activity. Plus event handlers for chat_created, container_created, activity_change, scene_opened, scene_closed.

Files:

  • Create: chat/db/migrations/0007_world.sql
  • Create: chat/state/world.py
  • Create: tests/test_world.py

Schema (key columns):

CREATE TABLE chats (
    id TEXT PRIMARY KEY,                -- e.g. "chat_botA"
    host_bot_id TEXT NOT NULL,
    guest_bot_id TEXT,                  -- null when no guest
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE chat_state (
    chat_id TEXT PRIMARY KEY,
    time TEXT NOT NULL,                 -- ISO 8601 UTC
    weather TEXT NOT NULL DEFAULT '',
    active_scene_id INTEGER,
    narrative_anchor TEXT               -- the in-fiction "Day 1 = ..." reference
);
CREATE TABLE containers (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    properties_json TEXT NOT NULL DEFAULT '{}',
    parent_id INTEGER REFERENCES containers(id)
);
CREATE TABLE scenes (
    id INTEGER PRIMARY KEY,
    chat_id TEXT NOT NULL,
    container_id INTEGER REFERENCES containers(id),
    started_at TEXT NOT NULL,
    ended_at TEXT,
    significance INTEGER NOT NULL DEFAULT 0,
    participants_json TEXT NOT NULL DEFAULT '[]'
);
CREATE TABLE activity (
    entity_id TEXT PRIMARY KEY,         -- "you" or bot_id
    container_id INTEGER REFERENCES containers(id),
    slot TEXT,
    posture TEXT NOT NULL DEFAULT '',
    action_json TEXT NOT NULL DEFAULT '{}',
    attention TEXT NOT NULL DEFAULT '',
    holding_json TEXT NOT NULL DEFAULT '[]',
    status_json TEXT NOT NULL DEFAULT '{}',
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);

Handlers: chat_created, container_created, activity_change, scene_opened, scene_closed.

Tests: create chat → chat_state initialized; create container; activity_change updates activity row.
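As an illustration of the first test case (create chat, then chat_state is initialized), a chat_created projection might look like the sketch below. The payload keys `id`, `host_bot_id`, `guest_bot_id`, `initial_time`, `weather`, and `narrative_anchor` are assumptions; the plan does not spell out the event payload.

```python
import sqlite3


def apply_chat_created(conn: sqlite3.Connection, payload: dict) -> None:
    """Project a chat_created event into chats and chat_state."""
    conn.execute(
        "INSERT OR REPLACE INTO chats (id, host_bot_id, guest_bot_id) VALUES (?, ?, ?)",
        (payload["id"], payload["host_bot_id"], payload.get("guest_bot_id")),
    )
    # Seed the per-chat clock row so later handlers can assume it exists.
    conn.execute(
        "INSERT OR REPLACE INTO chat_state (chat_id, time, weather, narrative_anchor) "
        "VALUES (?, ?, ?, ?)",
        (
            payload["id"],
            payload["initial_time"],
            payload.get("weather", ""),
            payload.get("narrative_anchor"),
        ),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chats (
        id TEXT PRIMARY KEY,
        host_bot_id TEXT NOT NULL,
        guest_bot_id TEXT,
        created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );
    CREATE TABLE chat_state (
        chat_id TEXT PRIMARY KEY,
        time TEXT NOT NULL,
        weather TEXT NOT NULL DEFAULT '',
        active_scene_id INTEGER,
        narrative_anchor TEXT
    );
    """
)
apply_chat_created(
    conn,
    {"id": "chat_botA", "host_bot_id": "bot_a", "initial_time": "2026-04-26T18:00:00Z"},
)
state = conn.execute(
    "SELECT chat_id, time, weather, active_scene_id FROM chat_state"
).fetchone()
```

In the real module this function would be registered with `@on("chat_created")` and read its fields from `e.payload`.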

Commit: feat: chats, chat_state, containers, scenes, activity tables


Phase 1C: Authoring

Task 10: Kickoff prose parser

Classifier call that converts authored kickoff prose into structured {container, activity_per_entity, edge_seed} for confirmation.

Files:

  • Create: chat/services/kickoff.py
  • Create: tests/test_kickoff.py

Schema returned by classifier:

from pydantic import BaseModel, Field

# ActivityShape is defined first because KickoffParse references it at class-creation time.
class ActivityShape(BaseModel):
    posture: str
    action_verb: str
    action_interruptible: bool
    action_required_attention: str       # low|medium|high
    action_expected_duration: str
    attention: str = ""
    holding: list[str] = Field(default_factory=list)

class KickoffParse(BaseModel):
    container_name: str
    container_type: str
    container_properties: dict           # moving, public, audible_range
    you_activity: ActivityShape
    bot_activity: ActivityShape
    initial_time_iso: str
    edge_seed_summary: str
    edge_seed_knowledge_facts: list[str]

Implementation: call classify(...) with a prompt that includes the bot's persona + relationship-to-you + kickoff prose. Return the parsed model.

Test: mock client returns canned JSON; assert structured fields populate.

Commit: feat: kickoff prose parser via classifier


Task 11: Bot authoring page

Form-based authoring UI; on submit, validates and writes bot_authored event. After save, redirects to kickoff parse-and-confirm (T13).

Files:

  • Create: chat/templates/base.html
  • Create: chat/templates/bot_form.html
  • Create: chat/web/__init__.py
  • Create: chat/web/bots.py
  • Modify: chat/app.py (mount router, jinja env, static files)
  • Create: chat/static/app.css
  • Create: tests/test_bot_authoring.py

Test: POST to /bots/new with form fields; assert bot_authored event appended and bot row exists; response redirects to /bots/<id>/kickoff.

Implementation note: form fields map to identity per §5.1 (name, persona, voice_samples textarea split on ---, traits comma-separated, backstory, initial relationship to you, kickoff prose).
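
A small sketch of the two list-valued form fields. It assumes the voice_samples separator is a line containing only `---` (the note above says only "split on ---"), and the helper names `split_voice_samples` and `split_traits` are hypothetical.

```python
def split_voice_samples(raw: str) -> list[str]:
    """Split the textarea into samples on lines that contain only '---'."""
    samples, current = [], []
    for line in raw.splitlines():
        if line.strip() == "---":
            samples.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    samples.append("\n".join(current).strip())
    # Drop empty chunks left by leading, trailing, or doubled separators.
    return [s for s in samples if s]


def split_traits(raw: str) -> list[str]:
    """Split comma-separated traits, trimming whitespace and empty entries."""
    return [t.strip() for t in raw.split(",") if t.strip()]
```

Splitting on whole lines rather than the raw substring keeps a sample that happens to contain `---` mid-sentence intact.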

Commit: feat: bot authoring form with bot_authored event


Task 12: You-entity authoring (Settings page)

Single-row form for the "you" entity. Lives at /settings. POST writes you_authored event.

Files:

  • Create: chat/templates/settings.html
  • Create: chat/web/settings.py
  • Modify: chat/app.py
  • Create: tests/test_settings.py

Commit: feat: settings page with you-entity authoring


Task 13: Kickoff parse-and-confirm flow

After bot authoring, the user lands on /bots/<id>/kickoff which shows the parsed kickoff in editable form. On confirm: append chat_created, container_created, activity_change (per entity), scene_opened, and an initial edge_update (the seed).

Files:

  • Create: chat/templates/kickoff_confirm.html
  • Create: chat/web/kickoff.py
  • Create: tests/test_kickoff_confirm.py

Test: Submit a confirmed kickoff payload; assert chat exists, chat_state has time, container exists, activity rows present for you + bot, scene is open, edge has seed summary.

Commit: feat: kickoff parse-and-confirm flow with chat creation


Phase 1D: Chat — single bot

Task 14: Top-level nav + Chat list

Persistent left rail with three sections (§16.1). Chat list pulls from chats joined with chat_state and the latest assistant_turn for snippet.

Files:

  • Create: chat/templates/layout.html (extends base, adds rail)
  • Create: chat/templates/chat_list.html
  • Create: chat/templates/bot_list.html
  • Create: chat/web/nav.py
  • Modify: chat/app.py
  • Create: tests/test_chat_list.py

Commit: feat: top-level nav and chat list view


Task 15: Chat shell page

/chats/<id> — renders the empty timeline + input box + drawer toggle. No turn handling yet.

Files:

  • Create: chat/templates/chat.html
  • Create: chat/web/chat.py
  • Create: tests/test_chat_shell.py

Commit: feat: chat shell page rendering


Task 16: Per-chat SSE channel + multi-tab sync

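In-process pub/sub keyed by chat_id: each subscriber (browser tab) gets its own asyncio.Queue, and publishing fans each event out to every queue registered for that chat. A single shared queue would hand each item to only one consumer, so it cannot broadcast. Endpoint /chats/<id>/events streams JSON events over SSE. On connect, the server pushes a snapshot event with current state; subsequent state changes push event items.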

Files:

  • Create: chat/web/sse.py
  • Create: chat/web/pubsub.py
  • Modify: chat/web/chat.py
  • Create: tests/test_sse.py

Test: TestClient streams 1 event; assert framing is correct (event: snapshot\ndata: {...}\n\n).
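
A sketch of the fan-out and the SSE framing the test checks. It gives every subscriber its own queue, because one asyncio.Queue delivers each item to a single consumer. The class name `ChatPubSub` and the helper `sse_frame` are assumptions, not part of this plan.

```python
import asyncio
import json
from collections import defaultdict


class ChatPubSub:
    """Per-chat fan-out: one queue per subscriber, grouped by chat_id."""

    def __init__(self):
        self._subscribers = defaultdict(set)

    def subscribe(self, chat_id):
        queue = asyncio.Queue()
        self._subscribers[chat_id].add(queue)
        return queue

    def unsubscribe(self, chat_id, queue):
        self._subscribers[chat_id].discard(queue)

    def publish(self, chat_id, event, data):
        # Copy the item reference into every subscriber's queue for this chat.
        for queue in self._subscribers.get(chat_id, ()):
            queue.put_nowait((event, data))


def sse_frame(event, data):
    """Format one SSE message: event line, data line, blank-line terminator."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"
```

The SSE endpoint would call `subscribe` on connect, yield `sse_frame(...)` for each item pulled from its queue, and call `unsubscribe` when the client disconnects.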

Commit: feat: per-chat SSE channel and pub/sub


Task 17: Turn input parser

Classifier call that splits a user turn into [dialogue|action|ooc] segments. OOC segments stripped from prompt; flagged for transcript display only.

Files:

  • Create: chat/services/turn_parse.py
  • Create: tests/test_turn_parse.py

Schema:

class TurnSegment(BaseModel):
    kind: str                # dialogue|action|ooc
    text: str

class ParsedTurn(BaseModel):
    segments: list[TurnSegment]

Test: input *walks over* "Hey." ((player note)) → 3 segments tagged correctly. Mock classifier returns canned JSON.
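
For building the canned mock response, a regex stand-in can produce the expected segments for simple inputs. This is not the classifier and will not handle nested or unmarked prose; the function names `canned_segments` and `prompt_segments` are hypothetical, and segments are plain dicts rather than the pydantic models.

```python
import re

# One alternative per marker style: ((ooc)), *action*, "dialogue".
_SEGMENT = re.compile(
    r'\(\((?P<ooc>.+?)\)\)|\*(?P<action>.+?)\*|"(?P<dialogue>.+?)"',
    re.DOTALL,
)


def canned_segments(text):
    """Return segments in input order; kind is the name of the group that matched."""
    return [
        {"kind": m.lastgroup, "text": m.group(m.lastgroup).strip()}
        for m in _SEGMENT.finditer(text)
    ]


def prompt_segments(segments):
    """OOC segments are kept for the transcript but never reach the prompt."""
    return [s for s in segments if s["kind"] != "ooc"]
```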

Commit: feat: turn input parser via classifier


Task 18: Prompt assembly with trim tiers

Implements the must/should/nice trimming tiers (§3.2) for the narrative prompt. Token-counts via tiktoken. Inputs: speaker_id, current chat state, witnessed memories (top-K), recent dialogue, edges, activity for all present, active scene.

Files:

  • Create: chat/services/prompt.py
  • Create: tests/test_prompt.py

Test: stuff a huge dialogue history, assert older turns get summarized first (NICE), then memories drop to K=2, etc. Must-include never trimmed.
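
A simplified sketch of the tier ordering. It drops blocks instead of summarizing them, and it counts whitespace-separated words as a stand-in for tiktoken so it runs without dependencies. `Block`, `trim_to_budget`, and the tier constants are assumed names; the real tier contents come from §3.2.

```python
from dataclasses import dataclass

MUST, SHOULD, NICE = 0, 1, 2


@dataclass
class Block:
    name: str
    tier: int
    text: str


def count_tokens(text):
    # Stand-in counter; swap in a tiktoken-based count in the real service.
    return len(text.split())


def trim_to_budget(blocks, budget, count=count_tokens):
    """Drop NICE blocks first, then SHOULD, in list order; never drop MUST."""
    total = sum(count(b.text) for b in blocks)
    dropped = set()
    for tier in (NICE, SHOULD):
        for i, block in enumerate(blocks):
            if total <= budget:
                break
            if block.tier == tier:
                dropped.add(i)
                total -= count(block.text)
    return [b for i, b in enumerate(blocks) if i not in dropped]
```

Callers list blocks oldest first, so older dialogue goes before newer dialogue within a tier. If the MUST blocks alone exceed the budget, they are still returned untrimmed.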

Commit: feat: prompt assembly with must/should/nice trim tiers


Task 19: Narrative call + streaming over SSE

POST /chats/<id>/turns accepts a user prose turn. Server:

  1. Appends user_turn event (raw + parsed segments).
  2. Appends a placeholder assistant_turn_started event.
  3. Streams narrative tokens over the chat's SSE channel as they arrive.
  4. On stream complete: appends assistant_turn event with full text + truncated=False.
  5. On stream interrupt: appends assistant_turn with truncated=True.

Files:

  • Create: chat/web/turns.py
  • Modify: chat/web/sse.py (add token broadcast)
  • Modify: chat/eventlog/log.py (add helpers if needed)
  • Create: tests/test_turn_flow.py

Test (uses MockLLMClient): POST a turn → assert SSE channel emits token chunks then a final assistant_turn event; DB has both events.
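
The five steps can be sketched as one coroutine with the event log and SSE channel injected as callables. `run_turn`, `append_event`, and `publish` are assumed names, the payload holds only the raw text (no parsed segments), and ConnectionError stands in for whatever the real client raises on an interrupted stream.

```python
import asyncio


async def run_turn(raw, tokens, append_event, publish):
    """Append events around a token stream; commit partial text on interrupt."""
    append_event("user_turn", {"raw": raw})
    append_event("assistant_turn_started", {})
    parts = []
    truncated = False
    try:
        async for token in tokens:
            parts.append(token)
            publish("token", {"text": token})
    except ConnectionError:
        # Stand-in for a dropped upstream stream: keep what arrived so far.
        truncated = True
    payload = {"text": "".join(parts), "truncated": truncated}
    append_event("assistant_turn", payload)
    publish("assistant_turn", payload)
    return payload
```

Because the final append runs on both paths, the log always gets an assistant_turn to pair with each assistant_turn_started.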

Commit: feat: narrative streaming via SSE with assistant_turn event


Phase 1E: State updates per turn

Task 20: Post-turn state-update pass

After narrative completes, classifier extracts affinity_delta, trust_delta, knowledge_facts per (source, target) directed pair, for every present entity (silent witnesses too). Emits edge_update events.

Files:

  • Create: chat/services/state_update.py
  • Create: tests/test_state_update.py

Test: mock returns deltas; assert edge_update events appended; projection updates affinity.

Commit: feat: post-turn state-update pass per present entity


Task 21: Memory write per turn

After narrative completes, write one memory row per witness, with that witness as owner and the appropriate witness flags set. Phase 1 simplification: the memory's pov_summary is a snippet of the assistant's narrative text (significance defaults to 1; the classifier rewrites it into per-POV summary form at scene close). Emits memory_written events.

Files:

  • Create: chat/services/memory_write.py
  • Create: tests/test_memory_write.py

Commit: feat: per-turn memory writes with witness flags


Task 22: Significance pass (queued, async)

Background task: after narrative completes, runs the significance classifier (scores 0–3 per §11.1) on the turn. Updates the just-written memory's significance. Auto-pins on score 3 (with the soft-cap eviction rule from §8.5).

Files:

  • Create: chat/services/significance.py
  • Create: chat/services/background.py (asyncio queue worker)
  • Modify: chat/app.py (lifespan starts/stops worker)
  • Create: tests/test_significance.py

Test: queue a significance job for a freshly-written memory; assert significance updates and auto-pin behavior on score 3.

Commit: feat: async significance pass with auto-pin on score 3


Task 23: Memory retrieval (FTS5, witness-filtered, top-K)

Implements search_memories(owner_id, witness_role, query, k) via FTS5 with WHERE filter on the witness column. Recency + significance boost in ranking.

Files:

  • Modify: chat/state/memory.py
  • Create: tests/test_memory_search.py

Test: seed memories with mixed witness flags; assert filter excludes non-witnessed; assert recency boost orders newer above older.
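
A sketch of the query shape, assuming SQLite is built with FTS5. The table layout is hypothetical: a single `witness_role` text column stands in for whatever witness columns the memory schema defines, and the 0.01 and 0.5 weights are placeholders, not tuned values. The function also takes an explicit `conn` and `now` so tests stay deterministic.

```python
import sqlite3

# Hypothetical layout: a plain table for filters, an FTS5 table for text.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    owner_id TEXT NOT NULL,
    witness_role TEXT NOT NULL,
    pov_summary TEXT NOT NULL,
    significance INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL
);
CREATE VIRTUAL TABLE memories_fts USING fts5(pov_summary);
"""


def add_memory(conn, owner_id, witness_role, pov_summary, significance, created_at):
    cur = conn.execute(
        "INSERT INTO memories (owner_id, witness_role, pov_summary, significance, created_at)"
        " VALUES (?, ?, ?, ?, ?)",
        (owner_id, witness_role, pov_summary, significance, created_at),
    )
    # Keep the FTS rowid aligned with memories.id so the two tables join.
    conn.execute(
        "INSERT INTO memories_fts (rowid, pov_summary) VALUES (?, ?)",
        (cur.lastrowid, pov_summary),
    )
    return cur.lastrowid


def search_memories(conn, owner_id, witness_role, query, k, now):
    # bm25() is lower for better matches, so age adds and significance subtracts.
    rows = conn.execute(
        """
        SELECT m.id
        FROM memories_fts
        JOIN memories AS m ON m.id = memories_fts.rowid
        WHERE memories_fts MATCH ?
          AND m.owner_id = ?
          AND m.witness_role = ?
        ORDER BY bm25(memories_fts)
                 + 0.01 * (julianday(?) - julianday(m.created_at))
                 - 0.5 * m.significance
        LIMIT ?
        """,
        (query, owner_id, witness_role, now, k),
    ).fetchall()
    return [row[0] for row in rows]
```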

Commit: feat: FTS5 memory retrieval with witness filter and ranking boosts


Phase 1F: Drawer & state ops

Task 24: Drawer read-only skeleton

Right-side drawer rendered as a partial; HTMX-loaded into the chat page. Shows current scene, container, activity per entity, edges (host ↔ you), recent witnessed memories with significance markers, pinned memories with n/8 counter.

Files:

  • Create: chat/templates/drawer.html
  • Create: chat/web/drawer.py
  • Modify: chat/templates/chat.html (drawer toggle + container)
  • Modify: chat/static/app.css
  • Create: tests/test_drawer_render.py

Commit: feat: read-only drawer with scene, activity, edges, memories


Task 25: Drawer edits (activity / edges / memory)

Inline edit affordances on activity, edge fields, memory pov_summary/significance/pin. Each edit emits a manual_edit event with prior value snapshotted (per §6.4 final paragraph). Pin toggle emits memory_pin_changed event.

Files:

  • Modify: chat/web/drawer.py
  • Modify: chat/templates/drawer.html
  • Create: chat/state/manual_edit.py (handler for manual_edit event)
  • Create: tests/test_drawer_edits.py

Test: edit affinity slider via POST; assert manual_edit event written with prior + new value; projected affinity updated.

Commit: feat: drawer edits with manual_edit event capture


Task 26: Scene close (hard signals + manual button)

Hard-signal detection runs as a small classifier call after each turn (queued/cheap): does the prose indicate container change, explicit "we're done here" pattern, or other hard signal? Manual close button in drawer always available. On close, emit scene_closed event; reopen via scene_opened for the new scene.

Files:

  • Create: chat/services/scene_close.py
  • Modify: chat/web/turns.py
  • Modify: chat/web/drawer.py (manual close button)
  • Create: tests/test_scene_close.py

Test: simulate prose "we drove to the park"; assert classifier returns container_change=true; assert scene_closed then scene_opened events written.

Commit: feat: scene close on hard signals with manual override


Task 27: Per-POV summary on close

On scene_closed, classifier writes a per-POV summary for each present witness (Phase 1: just the host bot since we're single-bot). Updates the existing memory rows for that scene, replacing terse pov_summary with a proper scene-level summary. Updates edge summary from the per-POV summary + prior summary. Promotion rules apply (§11.3).

Files:

  • Create: chat/services/scene_summarize.py
  • Modify: chat/eventlog/projector.py if needed for scene_closed handler
  • Create: tests/test_per_pov_summary.py

Commit: feat: per-POV summary and edge summary update on scene close


Phase 1G: Rollback

Task 28: Rewind UI + impact preview + pre-rewind snapshot

"Rewind to here" button on each turn in the chat. Computes impact preview (count messages, scene transitions, edge updates, memories, fired events affected). Pre-rewind snapshot written to data/snapshots/rewind/. On confirm: truncate event_log past selected event, drop projected tables, replay events up to selected. 30-second undo toast.

Files:

  • Create: chat/services/rewind.py
  • Create: chat/services/snapshot.py
  • Create: chat/templates/rewind_modal.html
  • Modify: chat/web/turns.py
  • Create: tests/test_rewind.py

Test: play 5 turns; rewind to turn 2; assert events 3-5 removed, projected state matches state-at-turn-2, snapshot file exists.
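
A reduced sketch of the truncate-and-replay core, without the impact preview, snapshot, or undo toast. The two-table schema and the `project`, `append_event`, and `rewind_to` names are assumptions, and the sketch clears projected rows rather than dropping and recreating the tables.

```python
import json
import sqlite3

# Hypothetical minimal schema: the log plus one projected table.
SCHEMA = """
CREATE TABLE event_log (id INTEGER PRIMARY KEY, type TEXT NOT NULL, payload_json TEXT NOT NULL);
CREATE TABLE turns (event_id INTEGER PRIMARY KEY, text TEXT NOT NULL);
"""


def project(conn, event_id, type_, payload):
    if type_ in ("user_turn", "assistant_turn"):
        conn.execute(
            "INSERT INTO turns (event_id, text) VALUES (?, ?)",
            (event_id, payload["text"]),
        )


def append_event(conn, type_, payload):
    cur = conn.execute(
        "INSERT INTO event_log (type, payload_json) VALUES (?, ?)",
        (type_, json.dumps(payload)),
    )
    project(conn, cur.lastrowid, type_, payload)
    return cur.lastrowid


def rewind_to(conn, event_id):
    """Truncate the log past event_id, then rebuild projections by replay."""
    with conn:  # one transaction: the whole rewind commits or rolls back
        conn.execute("DELETE FROM event_log WHERE id > ?", (event_id,))
        conn.execute("DELETE FROM turns")
        rows = conn.execute(
            "SELECT id, type, payload_json FROM event_log ORDER BY id"
        ).fetchall()
        for id_, type_, payload_json in rows:
            project(conn, id_, type_, json.loads(payload_json))
```

Replaying through the same `project` function that live appends use is what makes the state-at-turn-2 assertion hold.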

Commit: feat: rewind with impact preview, pre-rewind snapshot, undo toast


Task 29: Regenerate (inline edit-then-regenerate)

Button on the latest assistant_turn. Click puts your prior user_turn into inline edit mode; submit either appends user_turn_edit (if edited) then a new assistant_turn, or just a new assistant_turn (if not edited). The previous assistant_turn is marked superseded_by the new one. Display hides superseded turns.

Files:

  • Create: chat/services/regenerate.py
  • Modify: chat/web/turns.py
  • Modify: chat/templates/chat.html (regenerate button + edit-state HTMX swaps)
  • Create: tests/test_regenerate.py

Test: regenerate without edit → new assistant_turn, prior superseded, projected state reflects new only. With edit → also a user_turn_edit event.

Commit: feat: regenerate with edit-then-regenerate inline UX


Task 30: Reset bot (hard confirm)

/bots/<id>/reset → modal requiring you to type the bot's name. On confirm: emit bot_reset event. Handler purges the bot's chat_state, scenes, containers, activities, memories, edges-involving-this-bot. Identity, initial-relationship, kickoff prose preserved. Chat sits ready (no auto kickoff replay; next user message triggers it).

Files:

  • Create: chat/services/reset.py
  • Modify: chat/web/bots.py
  • Modify: chat/templates/bot_list.html (reset button)
  • Create: tests/test_reset.py

Test: play, reset, assert all transient state for that bot is gone, identity remains.

Commit: feat: bot reset with hard confirm and event-driven purge


Phase 1H: Ops & polish

Task 31: Periodic snapshots

Every 100 events OR every 30 minutes since last snapshot, write a full-state JSON to data/snapshots/periodic/. Retain last 5. On cold load (app start), if a periodic snapshot exists, apply it then replay events past it.

Files:

  • Modify: chat/services/snapshot.py
  • Modify: chat/services/background.py (periodic timer)
  • Create: tests/test_snapshot.py
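
The cadence and retention rules for this task can be sketched as two small functions. The names are hypothetical, and the pruning assumes snapshot filenames sort chronologically (for example a zero-padded event id), which this plan does not specify.

```python
from datetime import datetime, timedelta
from pathlib import Path

EVENT_INTERVAL = 100
TIME_INTERVAL = timedelta(minutes=30)
RETAIN = 5


def snapshot_due(events_since_last: int, last_snapshot_at: datetime, now: datetime) -> bool:
    """True once either threshold is reached: 100 events or 30 minutes."""
    return events_since_last >= EVENT_INTERVAL or now - last_snapshot_at >= TIME_INTERVAL


def prune_snapshots(directory: Path, retain: int = RETAIN) -> list[Path]:
    """Delete all but the newest `retain` snapshots; assumes retain >= 1."""
    files = sorted(directory.glob("*.json"))
    for stale in files[:-retain]:
        stale.unlink()
    return files[-retain:]
```

Passing `now` in rather than reading the clock inside the function keeps the check easy to drive from a frozen-time test.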

Commit: feat: periodic snapshots with retention and cold-load fast-path


Task 32: Nightly backups

Simple in-process scheduler: at 03:00 local time daily, copy chat.db to data/backups/chat-<timestamp>.db. Retain last 14. Suitable for v1; launchd plist can replace later.

Files:

  • Create: chat/services/backup.py
  • Modify: chat/services/background.py
  • Create: tests/test_backup.py
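
One way to take the copy is the sqlite3 online backup API rather than a plain file copy. That is a substitution on this sketch's part: in WAL mode, recent commits can still sit in the -wal file, so copying only chat.db may miss them. `backup_db` is an assumed name, and the timestamp format is assumed to sort chronologically.

```python
import sqlite3
from pathlib import Path


def backup_db(src_path: Path, backup_dir: Path, timestamp: str, retain: int = 14) -> Path:
    """Write chat-<timestamp>.db via the backup API, then prune old copies."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest_path = backup_dir / f"chat-{timestamp}.db"
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies a consistent view of the database, including WAL contents.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
    # Assumes retain >= 1 and sortable timestamps in the filenames.
    backups = sorted(backup_dir.glob("chat-*.db"))
    for stale in backups[:-retain]:
        stale.unlink()
    return dest_path
```

The 03:00 scheduler would call this once a day with the current local timestamp.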

Commit: feat: nightly DB backups with 14-day retention


Task 33: Display formatting

Renderer for transcript turns. Lightweight markdown (paragraphs, italic, bold, blockquotes — no headings/code). *action* rendered as italic in narrative output. OOC ((parens)) rendered dimmed/italic/smaller, never sent to bot. Speaker labels bold.

Files:

  • Create: chat/web/render.py
  • Modify: chat/templates/chat.html (use render filters)
  • Modify: chat/static/app.css
  • Create: tests/test_render.py

Test: input prose with all marker types → expected HTML output.
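
A minimal sketch of the renderer covering paragraphs, bold, italic, and OOC. Blockquotes and speaker labels are left out, and the `render_turn` name and `ooc` CSS class are assumptions. Input is HTML-escaped before any markup is added.

```python
import html
import re


def render_turn(text: str) -> str:
    """Escape first, then apply the lightweight markers in a fixed order."""
    out = html.escape(text, quote=False)
    # OOC before emphasis so ((...)) content is wrapped as a unit.
    out = re.sub(r"\(\((.+?)\)\)", r'<span class="ooc">((\1))</span>', out)
    # Bold before italic so ** is not consumed as two single asterisks.
    out = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", out)
    out = re.sub(r"\*(.+?)\*", r"<em>\1</em>", out)
    # Blank lines separate paragraphs.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", out) if p.strip()]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)
```

Registered as a Jinja2 filter, the output would need to be marked safe, which is why the escaping happens inside the function.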

Commit: feat: transcript display formatting with markdown and OOC styling


Task 34: Streaming UX (typing indicator, Stop, mid-stream disconnect)

Stop button on streaming bot row aborts the in-flight Featherless request and commits partial as assistant_turn with truncated=true. SSE client handles disconnect: server detects channel close, commits whatever was streamed, surfaces "connection lost — partial response saved" banner with Regenerate button. Send button disabled while streaming.

Files:

  • Modify: chat/web/turns.py
  • Modify: chat/templates/chat.html
  • Modify: chat/static/app.css
  • Create: tests/test_streaming_ux.py

Commit: feat: streaming UX with Stop, disconnect handling, send-lock


Task 35: Error UX banners + first-run flow

Error banners (per §16.5): Featherless 401/429/5xx surface inline with Retry. DB write failures show modal-blocking error. Schema migration failure on startup logs to stderr and exits non-zero.

First-run flow: if you_entity is missing, redirect to /settings on first navigation. If bots is empty after the settings save, redirect to /bots/new. After bot creation + kickoff confirm, land in chat.

Files:

  • Create: chat/web/middleware.py (first-run redirect)
  • Create: chat/templates/errors.html
  • Modify: chat/web/turns.py (catch Featherless errors)
  • Modify: chat/app.py (mount middleware, error handlers)
  • Create: tests/test_first_run.py
  • Create: tests/test_error_ux.py

Commit: feat: error banners and first-run navigation flow


Wrap-up

After T35, run the full test suite and a manual smoke test:

pytest -v
uvicorn chat.app:app --reload
# In a browser: walk through first-run, author a bot with kickoff,
# play 10 turns, open the drawer, edit an edge, close a scene, rewind, regenerate.
# Open a second tab on the same chat, verify multi-tab sync.

Update CLAUDE.md to reflect the v1 surface that actually shipped (any tasks deferred to Phase 1.5, any choices that shifted during implementation).

Merge phase-1 into main with a single squash commit referencing this plan.


Notes for the executor

  • Verify before claiming done (superpowers-extended-cc:verification-before-completion): every task ends with running its test command and reading the output. "Tests should pass" is not enough; show the green output.
  • DRY ruthlessly but don't pre-extract: if two tasks need similar code, inline both first, then refactor in a third commit. Premature abstraction breaks the TDD rhythm.
  • YAGNI: don't pre-build for Phase 2 (multi-bot, guests, group node) until those tasks exist.
  • Frequent commits: one per task minimum, more if a task naturally splits.
  • Don't bypass the event log. Any state change goes through an event. If a test wants to seed state directly, it's still appending events and projecting — not INSERT INTO bots directly. (Exception: schema migrations themselves.)
  • API key safety: never log the Featherless API key, never write it to event payloads, never include it in error messages.