Phase 2: multi-entity scene support (you + host + guest) #2
@@ -50,6 +50,10 @@ The 3-entity cap is load-bearing: it makes the relationship graph fully enumerab
- **Snapshots**: periodic every 100 events / 30 min; pre-rewind always. 5 periodic retained; pre-rewind retained 14 days.
- **Streaming**: Stop button on streaming row; mid-stream disconnect commits partial with `truncated: true`; Send disabled mid-stream; multi-tab streaming via per-chat SSE channel.
- **Display**: lightweight markdown; `*action*` italic; OOC `((parens))` shown dimmed/italic, never sent to bot.
- **Multi-entity defaults (Phase 2)**: when `chat.guest_bot_id is None`, behavior matches Phase 1 single-bot 1:1. With a guest, all 3 entities are present in the prompt, witness writes, and state-update fan-out (6 directed pairs).
- **Addressee detection**: simple substring match (whole-word, case-insensitive) over the user turn's body. If both bot names match or neither does, the host gets the floor.
- **Interjection**: classifier-driven, conservative bias (default false on classifier failure / refusal / parse error). When the classifier returns true, the addressee speaks first, then the non-addressee may interject in a follow-up turn.
- **Per-POV summaries (multi-entity)**: each present witness with a memory store gets their own per-POV summary on scene close. The summary differs per bot based on persona + their edge to "you". The group node summary is updated alongside.
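
A minimal sketch of the addressee rule described above, assuming bot names are plain words. `detect_addressee` and its parameters are hypothetical names used for illustration, not identifiers from this PR:

```python
import re


def detect_addressee(body: str, host_name: str, guest_name: str) -> str:
    """Return "guest" only when the guest alone is named; otherwise "host"."""

    def mentioned(name: str) -> bool:
        # Whole-word, case-insensitive match; re.escape guards against
        # regex metacharacters inside a bot name.
        pattern = rf"\b{re.escape(name)}\b"
        return re.search(pattern, body, flags=re.IGNORECASE) is not None

    host_hit = mentioned(host_name)
    guest_hit = mentioned(guest_name)
    # Both names or neither name: the host gets the floor.
    if guest_hit and not host_hit:
        return "guest"
    return "host"
```

Note that `\b` only acts as a word boundary when the name starts and ends with a word character, so names ending in punctuation would need different handling.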

## Core concepts (vocabulary)
@@ -177,3 +181,28 @@ Small follow-ups identified during Phase 1 reviews. Pick up at any time; none ar
- **`bot_reset` purges orphaned "you" activity rows** (see limitation above). Either delete `activity` rows by chat-membership or accept the noise indefinitely; the projection-layer fix is one extra `DELETE FROM activity WHERE entity_id='you' AND container_id IN (SELECT id FROM containers WHERE chat_id IN (...))` clause inside `_apply_bot_reset`.
- **Drawer edits for the deferred v1 fields**: edge_trust slider, edge_summary textarea, memory pov_summary textarea, knowledge_facts add/remove. The `manual_edit` projector already supports `edge_trust` / `edge_summary` / `memory_pov_summary` target_kinds — only the routes are missing. Knowledge_facts needs a new dispatch branch.
- **NICE trim order in prompt assembly** drops previous-scene first instead of last (T18 review). Greedy-cuts heuristic vs spec listing order; revisit if v1 play surfaces a real regression.
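
To make the greedy-cut idea in the last bullet concrete, here is a simplified sketch. It assumes each optional block has a fixed token cost and is dropped whole; the real assembler re-tokenizes after each cut and distinguishes soft and hard budgets. `trim_blocks`, the block names, and the costs are all hypothetical:

```python
def trim_blocks(costs: dict[str, int], drop_order: list[str], budget: int) -> list[str]:
    """Drop optional blocks in ``drop_order`` until the total fits ``budget``.

    Returns the names of the blocks that survive, in their original order.
    """
    kept = dict(costs)
    for name in drop_order:
        if sum(kept.values()) <= budget:
            break  # already within budget, stop cutting
        kept.pop(name, None)
    return list(kept)


costs = {"persona": 300, "dialogue": 500, "previous_scene": 400, "memories": 350}
survivors = trim_blocks(costs, ["previous_scene", "memories"], budget=1200)
```

With this drop order the previous-scene block goes first, which is the behavior the bullet flags. The result can still exceed the budget once every droppable block is gone, so a caller has to handle that case separately.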

## Phase 2 status

Phase 2 shipped end-to-end across **13 tasks** (T36–T48 wave). The multi-entity surface is functional: chats can host a guest bot, the prompt assembly is guest-aware, post-turn fans out across all directed pairs, and scene close writes a per-POV summary per present witness plus a group_node summary.

- **Multi-entity scene support**: chats can now have a guest bot (you + host + guest). The 3-entity cap holds. New event kinds: `guest_added`, `guest_removed`, `group_node_initialized`, `group_node_updated`. New table: `group_node` (members, summary, dynamic, threads).
- **Drawer guest UX**: add/remove guest from the drawer side panel. The "have they met?" prose seed is parsed by the `relationship_seed` classifier into inter-bot directed edges (host↔guest).
- **Multi-entity turn flow**: `post_turn` assembles narrative with the guest-aware prompt; writes memories for **all** present bot witnesses; runs state updates for **all** directed pairs (6 with 3 entities); detects interjections via classifier (default false; the addressee gets the floor first).
- **Per-POV scene close summaries**: each present witness with a memory store gets their own per-POV summary on close; `group_node` summary updated alongside.
- **Bot reset cascade**: resetting a bot now also clears `chats.guest_bot_id` references in other chats (root-cause fix for stale-guest references after T47).
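
The reset cascade in the last bullet can be pictured with a small sketch. It assumes a `chats` table with `id`, `host_bot_id`, and `guest_bot_id` columns; only `guest_bot_id` and `host_bot_id` are named in this PR, so the table layout and the helper `clear_guest_references` are illustrative assumptions rather than the shipped projector code:

```python
import sqlite3


def clear_guest_references(conn: sqlite3.Connection, bot_id: str) -> int:
    """Null out guest references to ``bot_id``; return the number of chats touched."""
    cur = conn.execute(
        "UPDATE chats SET guest_bot_id = NULL WHERE guest_bot_id = ?",
        (bot_id,),
    )
    return cur.rowcount


# Hypothetical minimal schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chats ("
    "id TEXT PRIMARY KEY, host_bot_id TEXT NOT NULL, guest_bot_id TEXT)"
)
conn.executemany(
    "INSERT INTO chats VALUES (?, ?, ?)",
    [("c1", "ash", "mira"), ("c2", "mira", None), ("c3", "ash", None)],
)
cleared = clear_guest_references(conn, "mira")
```

Chats where the reset bot is the host are left alone here; only guest references are cleared.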

### Phase 2.5 / 3 backlog

Carry-overs from Phase 2 reviews and implementer notes. None are blocking; pick up at any time.

- **Interjection regenerate**: regenerate currently only acts on the addressee turn. Phase 2.5 should extend regenerate to cover the interjection turn too.
- **Classifier-based addressee detection**: substring match is brittle (e.g., names that are common English words, or names appearing inside a quoted aside). A small classifier call could disambiguate.
- **LLM-merged group meta-summary**: current `group_node.summary` is a naive concat of host + guest per-POV summaries. Phase 2.5 should polish with an LLM-merged group view.
- **First-meeting gate**: the drawer's "have they met?" textarea fires every time. Phase 2.5 should check whether the host→guest edge already exists and offer a "they already know each other" toggle to skip re-seeding.
- **Witness flag editing**: drawer doesn't allow editing memory witness flags (read-only). Phase 2.5+ may expose this.
- **Significance for interjection memories**: the interjection's `memory_written` event doesn't enqueue a `SignificanceJob` (per the T44 implementer note). Phase 2.5 should wire this in so interjection memories are scored alongside primary turns.
- **Stale guest reference defensive degrade in `post_turn`**: T44 added a degrade-to-1:1 when `chat.guest_bot_id` points at a deleted bot. T47 fixes the root cause (resets clear the reference); the degrade can probably be removed but is harmless.
- **Scene close on cancel**: scene close runs even when the primary turn is cancelled. Behavior may be intentional but could be argued either way; revisit if it surfaces a real UX regression.
- **Dual `ACTIVITIES:` block**: T43's prompt assembly adds a second `ACTIVITIES:` block for guest activity. Cleaner would be a single block with three bullets and per-bullet trim.
- **Witness role hardcoded in prompt assembly**: `chat/services/prompt.py:436` hardcodes `witness_role="host"` regardless of which bot is speaking. Phase 2.5 should derive the role from chat membership (e.g. `"host" if speaker_bot_id == chat.host_bot_id else "guest"`) so guest-as-speaker prompts retrieve the right memory slice. Test contract pinned in `tests/test_witness_filter_multi.py`.
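
For the witness-role item, one possible shape of the derivation is sketched below. `witness_role_for` is a hypothetical helper, and it is slightly stricter than the one-line expression quoted in the bullet: it rejects a speaker who is neither host nor guest instead of silently treating them as the guest.

```python
from __future__ import annotations


def witness_role_for(
    speaker_bot_id: str,
    host_bot_id: str,
    guest_bot_id: str | None = None,
) -> str:
    """Map the speaking bot to its witness role within one chat."""
    if speaker_bot_id == host_bot_id:
        return "host"
    if guest_bot_id is not None and speaker_bot_id == guest_bot_id:
        return "guest"
    # Assumption: an unknown speaker is a caller bug, so fail loudly.
    raise ValueError(f"{speaker_bot_id!r} is not a member of this chat")
```

Whether an unknown speaker should raise or degrade to `"host"` is a design choice the backlog item leaves open.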
@@ -0,0 +1,8 @@
CREATE TABLE group_node (
    chat_id TEXT PRIMARY KEY,
    members_json TEXT NOT NULL,
    summary TEXT NOT NULL DEFAULT '',
    dynamic TEXT NOT NULL DEFAULT '',
    threads_json TEXT NOT NULL DEFAULT '[]',
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
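
A short sketch of how a row in this table might be written and read back, assuming SQLite via Python's standard `sqlite3` module. `load_group_node` is a hypothetical stand-in for the PR's `get_group_node`, whose real implementation is not shown in this diff:

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE group_node (
    chat_id TEXT PRIMARY KEY,
    members_json TEXT NOT NULL,
    summary TEXT NOT NULL DEFAULT '',
    dynamic TEXT NOT NULL DEFAULT '',
    threads_json TEXT NOT NULL DEFAULT '[]',
    updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
"""


def load_group_node(conn: sqlite3.Connection, chat_id: str):
    """Return the decoded row as a dict, or None when the chat has no group node."""
    row = conn.execute(
        "SELECT members_json, summary, dynamic, threads_json "
        "FROM group_node WHERE chat_id = ?",
        (chat_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "members": json.loads(row[0]),
        "summary": row[1],
        "dynamic": row[2],
        "threads": json.loads(row[3]),
    }


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO group_node (chat_id, members_json) VALUES (?, ?)",
    ("c1", json.dumps(["you", "ash", "mira"])),
)
node = load_group_node(conn, "c1")
```

Only `chat_id` and `members_json` need to be supplied on insert; the other columns fall back to the defaults declared in the schema.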
@@ -0,0 +1,100 @@
"""Interjection classifier service (T39).

Per Requirements §6.2, when a guest is present and the addressee bot has
just spoken, the *non-addressee* bot may follow on with a brief
interjection beat. This service decides whether that interjection
fires. Conservative bias: most turns return ``should_interject=False``
— the addressee has the floor and an interjection is the exception.
Trigger ``True`` only when the silent witness's character, given their
persona and edges, would plausibly speak up: jealousy, surprise, strong
agreement worth voicing, correcting a factual falsehood, urgency.

T44 (turn flow) calls this and, on ``True``, generates the brief
follow-on response as the silent witness. Classifier failure falls back
to ``should_interject=False`` with ``reason="fallback"`` so the chat
keeps moving (§3.3 graceful-degradation rule); callers that care can
distinguish a real "no" from a degraded "no" by the reason string.
"""

from __future__ import annotations

from pydantic import BaseModel

from chat.llm.classify import classify
from chat.llm.client import LLMClient


class InterjectionDecision(BaseModel):
    """Whether the silent witness interjects, plus a short reason.

    Defaults are a deliberate no-op: ``should_interject=False`` with an
    empty reason. The classifier-failure fallback uses
    ``reason="fallback"`` so it's distinguishable from a real "no".
    """

    should_interject: bool = False
    reason: str = ""


_SYSTEM = (
    "You decide whether a silent witness character interjects after the "
    "addressee character finishes speaking. STRONGLY default to false — "
    "the addressee has the floor and most turns should NOT have an "
    "interjection. Only return true when the silent witness's character, "
    "given their persona and edges, would plausibly speak up: jealousy, "
    "surprise, strong agreement worth voicing, correcting a factual "
    "falsehood, urgency. Output strict JSON matching the schema."
)


async def detect_interjection(
    client: LLMClient,
    *,
    classifier_model: str,
    addressee_name: str,
    addressee_just_said: str,
    silent_witness_name: str,
    silent_witness_persona: str,
    silent_witness_edge_to_addressee: dict,  # {affinity, trust, summary}
    silent_witness_edge_to_you: dict,
    you_just_said: str,
    timeout_s: float = 30.0,
) -> InterjectionDecision:
    """Decide whether the silent witness bot interjects after the addressee
    finishes speaking.

    The two ``silent_witness_edge_*`` dicts carry the silent witness's
    directed edges toward the addressee and toward the user ("you"),
    each shaped ``{affinity: int, trust: int, summary: str}``. Missing
    keys fall back to a 50/50 baseline with an empty summary so this
    function tolerates partially-populated edge state without raising.
    """
    user = (
        f"You said: {you_just_said}\n\n"
        f"{addressee_name} just said: {addressee_just_said}\n\n"
        f"Silent witness: {silent_witness_name}\n"
        f"Persona: {silent_witness_persona}\n"
        f"Edge {silent_witness_name} -> {addressee_name}: "
        f"affinity={silent_witness_edge_to_addressee.get('affinity', 50)}, "
        f"trust={silent_witness_edge_to_addressee.get('trust', 50)}, "
        f"summary={silent_witness_edge_to_addressee.get('summary', '')}\n"
        f"Edge {silent_witness_name} -> you: "
        f"affinity={silent_witness_edge_to_you.get('affinity', 50)}, "
        f"trust={silent_witness_edge_to_you.get('trust', 50)}, "
        f"summary={silent_witness_edge_to_you.get('summary', '')}\n\n"
        f"Should {silent_witness_name} interject?"
    )
    return await classify(
        client,
        model=classifier_model,
        system=_SYSTEM,
        user=user,
        schema=InterjectionDecision,
        default=InterjectionDecision(
            should_interject=False, reason="fallback"
        ),
        timeout_s=timeout_s,
    )


__all__ = ["InterjectionDecision", "detect_interjection"]
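
The module docstring notes that callers can tell a real "no" from a degraded "no" by the reason string. A caller-side sketch of that check follows; `Decision` is a dataclass stand-in with the same two fields as `InterjectionDecision`, used here only so the example runs without the project's dependencies, and `describe` is a hypothetical helper:

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Stand-in for InterjectionDecision (same two fields, same defaults)."""

    should_interject: bool = False
    reason: str = ""


def describe(decision: Decision) -> str:
    """Classify a decision as an interjection, a real no, or a degraded no."""
    if decision.should_interject:
        return "interject"
    if decision.reason == "fallback":
        # The classifier failed, timed out, or returned unparseable output.
        return "degraded-no"
    return "no"
```

A turn flow could, for instance, count `"degraded-no"` results for monitoring while treating both kinds of "no" identically for the chat itself.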
@@ -76,3 +76,103 @@ def record_turn_memory(
    ).fetchone()
    memory_id = row[0] if row else None
    return event_id, memory_id


def _write_one_memory(
    conn: Connection,
    *,
    owner_id: str,
    chat_id: str,
    narrative_text: str,
    witness_you: int,
    witness_host: int,
    witness_guest: int,
    scene_id: int | None,
    chat_clock_at: str | None,
    source: str,
    significance: int,
) -> tuple[int, int | None]:
    """Append a single ``memory_written`` event for ``owner_id`` and return
    ``(event_id, memory_id)`` for the projected row."""
    payload: dict = {
        "owner_id": owner_id,
        "chat_id": chat_id,
        "pov_summary": narrative_text,
        "witness_you": witness_you,
        "witness_host": witness_host,
        "witness_guest": witness_guest,
        "source": source,
        "reliability": 1.0,
        "significance": significance,
        "pinned": 0,
        "auto_pinned": 0,
    }
    if scene_id is not None:
        payload["scene_id"] = scene_id
    if chat_clock_at is not None:
        payload["chat_clock_at"] = chat_clock_at

    event_id = append_and_apply(conn, kind="memory_written", payload=payload)
    row = conn.execute(
        "SELECT id FROM memories "
        "WHERE owner_id = ? AND chat_id = ? "
        "ORDER BY id DESC LIMIT 1",
        (owner_id, chat_id),
    ).fetchone()
    memory_id = row[0] if row else None
    return event_id, memory_id


def record_turn_memory_for_present(
    conn: Connection,
    *,
    chat_id: str,
    host_bot_id: str,
    guest_bot_id: str | None,
    narrative_text: str,
    scene_id: int | None = None,
    chat_clock_at: str | None = None,
    source: str = "direct",
    significance: int = 1,
) -> dict[str, tuple[int, int | None]]:
    """Write a ``memory_written`` event for each present bot witness.

    Host is always written. Guest is written iff ``guest_bot_id is not
    None``. Witness flags are ``[you=1, host=1, guest=1]`` when a guest
    is present, ``[you=1, host=1, guest=0]`` otherwise.

    Returns a mapping ``{bot_id: (event_id, memory_id)}`` so callers can
    look up the freshly-projected memory id per owner without re-querying
    the database.
    """
    witness_guest = 1 if guest_bot_id is not None else 0

    result: dict[str, tuple[int, int | None]] = {}
    result[host_bot_id] = _write_one_memory(
        conn,
        owner_id=host_bot_id,
        chat_id=chat_id,
        narrative_text=narrative_text,
        witness_you=1,
        witness_host=1,
        witness_guest=witness_guest,
        scene_id=scene_id,
        chat_clock_at=chat_clock_at,
        source=source,
        significance=significance,
    )
    if guest_bot_id is not None:
        result[guest_bot_id] = _write_one_memory(
            conn,
            owner_id=guest_bot_id,
            chat_id=chat_id,
            narrative_text=narrative_text,
            witness_you=1,
            witness_host=1,
            witness_guest=1,
            scene_id=scene_id,
            chat_clock_at=chat_clock_at,
            source=source,
            significance=significance,
        )
    return result
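
The fan-out rule in `record_turn_memory_for_present` reduces to two small decisions: which flags every memory carries, and which bots own a memory. The sketch below isolates them without touching the database; `witness_flags` and `memory_owners` are hypothetical helper names, not functions from this PR:

```python
from __future__ import annotations


def witness_flags(guest_bot_id: str | None) -> dict[str, int]:
    """Flags shared by every memory written for one turn."""
    return {
        "witness_you": 1,
        "witness_host": 1,
        "witness_guest": 1 if guest_bot_id is not None else 0,
    }


def memory_owners(host_bot_id: str, guest_bot_id: str | None) -> list[str]:
    """The host always gets a memory; the guest only when present."""
    owners = [host_bot_id]
    if guest_bot_id is not None:
        owners.append(guest_bot_id)
    return owners
```

In a 1:1 chat this yields one owner with `witness_guest=0`, which matches the Phase 1 behavior the docstring describes.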
@@ -0,0 +1,62 @@
"""Multi-entity state-update coordinator (T40).

Wraps single-pair compute_state_update to run state updates for ALL
directed pairs of present entities. With 3 present entities (you, host,
guest) that's 6 directed pairs. With 2 present (you, host) it's 2 pairs.

Calls run sequentially to respect Featherless's 2-connection cap (the
client-level semaphore would serialize them anyway, but doing it here
keeps the failure surface clean — a hung pair doesn't queue behind
itself).
"""

from __future__ import annotations

from chat.llm.client import LLMClient
from chat.services.state_update import StateUpdate, compute_state_update


async def compute_state_updates_for_present(
    client: LLMClient,
    *,
    classifier_model: str,
    present_ids: list[str],
    present_names: dict[str, str],
    personas: dict[str, str],
    prior_edges: dict[tuple[str, str], dict],
    recent_dialogue: list[dict],
    timeout_s: float = 30.0,
) -> list[tuple[str, str, StateUpdate]]:
    """Run compute_state_update for every directed pair (src != tgt) over
    ``present_ids``. Returns list of ``(source_id, target_id, update)``
    tuples in the natural iteration order over ``present_ids x present_ids``.

    A single failing pair falls back to the schema-default StateUpdate
    (zero deltas, empty facts) inside ``compute_state_update``; the batch
    keeps going.
    """
    out: list[tuple[str, str, StateUpdate]] = []
    for src in present_ids:
        for tgt in present_ids:
            if src == tgt:
                continue
            edge = prior_edges.get((src, tgt), {})
            update = await compute_state_update(
                client,
                model=classifier_model,
                source_id=src,
                target_id=tgt,
                source_name=present_names.get(src, src),
                source_persona=personas.get(src, "") or "",
                target_name=present_names.get(tgt, tgt),
                prior_affinity=int(edge.get("affinity", 50)),
                prior_trust=int(edge.get("trust", 50)),
                prior_summary=edge.get("summary", "") or "",
                recent_dialogue=recent_dialogue,
                timeout_s=timeout_s,
            )
            out.append((src, tgt, update))
    return out


__all__ = ["compute_state_updates_for_present"]
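
The pair enumeration above is the set of ordered 2-permutations of the present entities, so n entities give n·(n−1) pairs. A small sketch using the standard library shows the same pairs in the same order as the nested loop; `directed_pairs` is a hypothetical helper, not part of this module:

```python
from itertools import permutations


def directed_pairs(present_ids: list[str]) -> list[tuple[str, str]]:
    """All ordered (source, target) pairs drawn from distinct positions."""
    return list(permutations(present_ids, 2))


pairs = directed_pairs(["you", "host", "guest"])
```

One difference from the loop in the module: `permutations` works by position, so a duplicated id in `present_ids` would produce an `(x, x)` pair, whereas the `src == tgt` check skips it. The sketch therefore assumes the ids are unique.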
|
||||
+125
-35
@@ -37,6 +37,7 @@ import tiktoken
from chat.llm.client import Message
from chat.state.edges import get_edge, list_edges_for
from chat.state.entities import get_bot, get_you
from chat.state.group_node import get_group_node
from chat.state.memory import search_memories
from chat.state.world import (
    active_scene,
@@ -206,6 +207,26 @@ def _build_previous_scene_block(pov_summary: str | None) -> str | None:
    return "PREVIOUS SCENE SUMMARY:\n" + pov_summary


def _build_group_node_block(group_node: dict | None) -> str | None:
    """Render the group-node summary + dynamic as a SHOULD-tier block.

    Used only in 3-entity scenes (you + host + guest). Returns None when
    the row is missing or both summary and dynamic are empty.
    """
    if not group_node:
        return None
    summary = (group_node.get("summary") or "").strip()
    dynamic = (group_node.get("dynamic") or "").strip()
    if not summary and not dynamic:
        return None
    lines = ["Group dynamic:"]
    if summary:
        lines.append(f"- Summary: {summary}")
    if dynamic:
        lines.append(f"- Dynamic: {dynamic}")
    return "\n".join(lines)


def _closing_instruction(speaker_name: str, addressee_name: str) -> str:
    return (
        f"Continue the scene as {speaker_name}, in their voice, responding "
@@ -287,6 +308,7 @@ def assemble_narrative_prompt(
    budget_soft: int = 6000,
    budget_hard: int = 8000,
    encoding_name: str = "cl100k_base",
    guest_id: str | None = None,
) -> list[Message]:
    """Assemble the narrative prompt for ``speaker_bot_id`` to respond.

@@ -313,6 +335,15 @@ def assemble_narrative_prompt(
    if chat is None:
        raise ValueError(f"chat_id {chat_id!r} not found")

    # Auto-detect guest from chat state when caller didn't pass one.
    # Phase 1 chats have ``guest_bot_id is None``; the auto-detect is a
    # no-op there and the function behaves exactly as before.
    if guest_id is None:
        guest_id = chat.get("guest_bot_id")
    # When the guest is the speaker, they are not an additional third
    # party, so clear ``guest_id`` for this prompt.
    if guest_id is not None and guest_id == speaker_bot_id:
        guest_id = None

    you = get_you(conn)
    addressee_id, addressee_name = _resolve_addressee(conn, addressee, you)

@@ -325,9 +356,10 @@ def assemble_narrative_prompt(
|
||||
addressee_name,
|
||||
)
|
||||
|
||||
# Activity for present entities. Phase 1: you + speaker bot. (When a
|
||||
# guest is added in Phase 1+, callers that know about it can pass
|
||||
# extra activities via a future hook; for now we keep it strict.)
|
||||
# Activity for present entities. Core (MUST): you + speaker bot.
|
||||
# Phase 2 (SHOULD-tier): when a third party (guest) is present in
|
||||
# the chat, append their activity in a separate block so it can be
|
||||
# trimmed independently under tight budget.
|
||||
activities: list[dict] = []
|
||||
you_act = get_activity(conn, "you")
|
||||
if you_act is not None:
|
||||
@@ -341,6 +373,34 @@ def assemble_narrative_prompt(
        activities.append(bot_act)
    activity_block = _build_activity_block(activities)

    # SHOULD-tier guest activity extension (Phase 2 / Task 43).
    guest_activity_block: str | None = None
    if guest_id is not None:
        guest_act = get_activity(conn, guest_id)
        if guest_act is not None:
            guest_act = dict(guest_act)
            guest_bot = get_bot(conn, guest_id)
            guest_act["_display_name"] = (
                guest_bot["name"] if guest_bot else guest_id
            )
            guest_activity_block = _build_activity_block([guest_act])

    # SHOULD-tier group-node block (Phase 2 / Task 43): rendered only
    # when the group_node row is present AND it covers all three of
    # you + host + guest (per the Task 43 spec).
    group_node_block: str | None = None
    if guest_id is not None:
        gn = get_group_node(conn, chat_id)
        if gn is not None:
            members = set(gn.get("members") or [])
            host_id = chat.get("host_bot_id")
            required = {"you"}
            if host_id is not None:
                required.add(host_id)
            required.add(guest_id)
            if required.issubset(members):
                group_node_block = _build_group_node_block(gn)

    container = None
    if chat.get("active_scene_id"):
        scene = get_scene(conn, chat["active_scene_id"])
@@ -421,6 +481,8 @@ def assemble_narrative_prompt(
|
||||
include_previous_scene: bool,
|
||||
include_memories_top_k: int,
|
||||
dialogue_keep: int,
|
||||
include_guest_activity: bool = True,
|
||||
include_group_node: bool = True,
|
||||
) -> tuple[str, int, list[dict]]:
|
||||
# dialogue: keep the last `dialogue_keep` turns verbatim; older
|
||||
# turns become an "earlier:" placeholder line.
|
||||
@@ -447,6 +509,8 @@ def assemble_narrative_prompt(
|
||||
other_edges_block if include_other_edges else None,
|
||||
scene_block,
|
||||
activity_block,
|
||||
guest_activity_block if include_guest_activity else None,
|
||||
group_node_block if include_group_node else None,
|
||||
prev_block,
|
||||
memories_block,
|
||||
dialogue_block,
|
||||
@@ -463,12 +527,25 @@ def assemble_narrative_prompt(
|
||||
nice_memories_k = min(4, len(memory_summaries))
|
||||
include_prev = previous_scene_summary is not None
|
||||
include_other = other_edges_block is not None
|
||||
include_guest_activity = guest_activity_block is not None
|
||||
include_group_node = group_node_block is not None
|
||||
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=include_other,
|
||||
include_previous_scene=include_prev,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
def _build(*, prev: bool, mem_k: int, dlg: int, other: bool,
|
||||
guest_act: bool, group: bool) -> tuple[str, int]:
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=other,
|
||||
include_previous_scene=prev,
|
||||
include_memories_top_k=mem_k,
|
||||
dialogue_keep=dlg,
|
||||
include_guest_activity=guest_act,
|
||||
include_group_node=group,
|
||||
)
|
||||
return body, total
|
||||
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
|
||||
# If under soft, we're done.
|
||||
@@ -478,34 +555,31 @@ def assemble_narrative_prompt(
|
||||
# Drop NICE in order: previous scene → memories beyond top-2 →
|
||||
# older dialogue turns (collapse to 4).
|
||||
if include_prev:
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=include_other,
|
||||
include_previous_scene=False,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
)
|
||||
include_prev = False
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
if total <= budget_soft:
|
||||
return _emit(body, user_turn_prose)
|
||||
|
||||
if nice_memories_k > 2:
|
||||
nice_memories_k = 2
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=include_other,
|
||||
include_previous_scene=False,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
if total <= budget_soft:
|
||||
return _emit(body, user_turn_prose)
|
||||
|
||||
if nice_dialogue_keep > baseline_keep:
|
||||
nice_dialogue_keep = baseline_keep
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=include_other,
|
||||
include_previous_scene=False,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
if total <= budget_soft:
|
||||
return _emit(body, user_turn_prose)
|
||||
@@ -513,21 +587,37 @@ def assemble_narrative_prompt(
|
||||
# Drop more NICE until we're under hard: memories all the way to 0.
|
||||
while nice_memories_k > 0 and total > budget_hard:
|
||||
nice_memories_k = max(0, nice_memories_k - 1)
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=include_other,
|
||||
include_previous_scene=False,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
|
||||
# Drop SHOULD-tier blocks in order: guest activity → group node →
|
||||
# other edges. (Guest activity goes first per Task 43 spec — it's
|
||||
# the most expendable additive context.)
|
||||
if include_guest_activity and total > budget_hard:
|
||||
include_guest_activity = False
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
|
||||
if include_group_node and total > budget_hard:
|
||||
include_group_node = False
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
|
||||
# Drop SHOULD: other edges.
|
||||
if include_other and total > budget_hard:
|
||||
include_other = False
|
||||
body, total, _ = assemble(
|
||||
include_other_edges=False,
|
||||
include_previous_scene=False,
|
||||
include_memories_top_k=nice_memories_k,
|
||||
dialogue_keep=nice_dialogue_keep,
|
||||
body, total = _build(
|
||||
prev=include_prev, mem_k=nice_memories_k, dlg=nice_dialogue_keep,
|
||||
other=include_other, guest_act=include_guest_activity,
|
||||
group=include_group_node,
|
||||
)
|
||||
|
||||
if total > budget_hard:
|
||||
|
||||
+110
-72
@@ -26,6 +26,22 @@ Phase 1 simplifications (per the plan's "bound it" guidance):
  so affinity/trust/knowledge reflect the new output.
- The route does not broadcast a fresh ``turn_html`` SSE event; T34
  polishes UI swaps. The user refreshes the page to see the new turn.

Phase 2 changes (T44):

- Multi-entity prompt assembly: ``guest_id`` is forwarded to the
  prompt assembler so the regenerated narrative sees the same
  guest-aware context the original turn did.
- Multi-witness memory write: ``record_turn_memory_for_present`` fans
  out one ``memory_written`` event per witness when a guest is present.
- Multi-pair state-update: ``compute_state_updates_for_present`` emits
  one ``edge_update`` per directed pair across present entities. With
  three present that's six edges instead of two.
- Interjection regeneration is **deferred to Phase 2.5**. Regenerate
  only re-streams the addressee turn for v2; ``detect_interjection``
  is not invoked here. If the prior turn fired an interjection it
  remains attached to the original assistant_turn (which is superseded
  alongside the regenerated turn) — Phase 2.5 will revisit.
"""

from __future__ import annotations
@@ -35,9 +51,9 @@ from sqlite3 import Connection
|
||||
|
||||
from chat.config import Settings
|
||||
from chat.eventlog.log import append_and_apply, append_event
|
||||
from chat.services.memory_write import record_turn_memory
|
||||
from chat.services.memory_write import record_turn_memory_for_present
|
||||
from chat.services.multi_state_update import compute_state_updates_for_present
|
||||
from chat.services.prompt import assemble_narrative_prompt
|
||||
from chat.services.state_update import compute_state_update
|
||||
from chat.state.edges import get_edge
|
||||
from chat.state.entities import get_bot, get_you
|
||||
from chat.state.world import active_scene, get_chat
|
||||
@@ -72,6 +88,16 @@ async def regenerate_assistant_turn(
        "persona": "",
    }

    # Phase 2: surface the guest (if any) so the prompt assembler and
    # downstream multi-entity passes see the same shape post_turn does.
    guest_bot_id = chat.get("guest_bot_id")
    guest_bot: dict | None = None
    if guest_bot_id is not None:
        guest_bot = get_bot(conn, guest_bot_id)
        if guest_bot is None:
            # Stale guest reference — degrade to single-bot regenerate.
            guest_bot_id = None

    # 1. Locate the original assistant_turn event.
    row = conn.execute(
        "SELECT payload_json FROM event_log "
@@ -82,6 +108,17 @@ async def regenerate_assistant_turn(
|
||||
raise ValueError("assistant_turn event not found")
|
||||
original_assistant_payload = json.loads(row[0])
|
||||
original_user_turn_id = original_assistant_payload.get("user_turn_id")
|
||||
# Phase 2 v2 regenerates only the addressee turn — preserve whichever
|
||||
# bot the original turn was attributed to, falling back to the host
|
||||
# for legacy rows that pre-date multi-entity support.
|
||||
speaker_bot_id = original_assistant_payload.get("speaker_id") or host_bot_id
|
||||
if speaker_bot_id == host_bot_id:
|
||||
speaker_bot = host_bot
|
||||
elif guest_bot is not None and speaker_bot_id == guest_bot.get("id"):
|
||||
speaker_bot = guest_bot
|
||||
else:
|
||||
speaker_bot = get_bot(conn, speaker_bot_id) or host_bot
|
||||
speaker_bot_id = speaker_bot.get("id", host_bot_id)
|
||||
|
||||
# 2. Determine the prose for the new prompt and (when edited) capture
|
||||
# the user_turn_edit event up front so the new event ids exist before
|
||||
@@ -137,20 +174,26 @@ async def regenerate_assistant_turn(
         if kind in ("user_turn", "user_turn_edit"):
             recent.append({"speaker": you_name, "text": p.get("prose", "")})
         else:
-            recent.append(
-                {"speaker": host_bot.get("name", "bot"), "text": p.get("text", "")}
-            )
+            spk = p.get("speaker_id", "bot")
+            spk_name = host_bot.get("name", "bot")
+            if spk == host_bot_id:
+                spk_name = host_bot.get("name", "bot")
+            elif guest_bot is not None and spk == guest_bot.get("id"):
+                spk_name = guest_bot.get("name", "bot")
+            recent.append({"speaker": spk_name, "text": p.get("text", "")})

     # 4. Assemble the narrative prompt. ``recent`` already excludes the
     # current user prose, which we pass through ``user_turn_prose``.
+    # Phase 2: forward ``guest_id`` so the prompt sees the third party.
     messages = assemble_narrative_prompt(
         conn,
         chat_id=chat_id,
-        speaker_bot_id=host_bot_id,
+        speaker_bot_id=speaker_bot_id,
         user_turn_prose=prose_for_prompt or None,
         recent_dialogue=recent,
         budget_soft=settings.narrative_budget_soft,
         budget_hard=settings.narrative_budget_hard,
+        guest_id=guest_bot_id,
     )

     # 5. Stream the new narrative.
|
||||
@@ -164,7 +207,7 @@ async def regenerate_assistant_turn(
|
||||
accumulated.append(chunk)
|
||||
await publish(
|
||||
chat_id,
|
||||
- {"event": "token", "text": chunk, "speaker_id": host_bot_id},
+ {"event": "token", "text": chunk, "speaker_id": speaker_bot_id},
|
||||
)
|
||||
new_text = "".join(accumulated)
|
||||
|
||||
@@ -177,7 +220,7 @@ async def regenerate_assistant_turn(
|
||||
kind="assistant_turn",
|
||||
payload={
|
||||
"chat_id": chat_id,
- "speaker_id": host_bot_id,
+ "speaker_id": speaker_bot_id,
|
||||
"text": new_text,
|
||||
"truncated": False,
|
||||
"user_turn_id": (
|
||||
@@ -196,88 +239,83 @@ async def regenerate_assistant_turn(
     )

     # 8. Re-run downstream classifier passes (memory write + state update
-    # for both directed edges). Significance is intentionally skipped on
-    # regenerate (the prior score remains attached to the prior memory).
+    # for every directed pair across present entities). Significance is
+    # intentionally skipped on regenerate (the prior score remains
+    # attached to the prior memory). Phase 2.5 will add interjection
+    # regeneration; v2 leaves any prior interjection beat in place.
     scene = active_scene(conn, chat_id)
-    record_turn_memory(
+    record_turn_memory_for_present(
         conn,
         chat_id=chat_id,
         host_bot_id=host_bot_id,
+        guest_bot_id=guest_bot_id,
         narrative_text=new_text,
         scene_id=scene["id"] if scene else None,
         chat_clock_at=chat.get("time"),
     )

     last_at = chat.get("time")
+    speaker_name = (
+        speaker_bot.get("name", "bot") if speaker_bot is not None else "bot"
+    )
     recent_for_update = recent + [
-        {"speaker": host_bot.get("name", "bot"), "text": new_text}
+        {"speaker": speaker_name, "text": new_text}
     ]

-    edge_b2y = get_edge(conn, host_bot_id, "you") or {
-        "affinity": 50,
-        "trust": 50,
-        "summary": "",
+    # Build present-entity inputs for the multi-pair state-update pass.
+    # Host first preserves the Phase 1 directed-pair order (host->you,
+    # then you->host) so existing canned-response fixtures still line up.
+    present_ids: list[str] = [host_bot_id, "you"]
+    present_names: dict[str, str] = {
+        host_bot_id: host_bot.get("name", "bot"),
+        "you": you_name,
     }
-    update_b2y = await compute_state_update(
-        client,
-        model=settings.classifier_model,
-        source_id=host_bot_id,
-        target_id="you",
-        source_name=host_bot.get("name", "bot"),
-        source_persona=host_bot.get("persona", "") or "",
-        target_name=you_name,
-        prior_affinity=edge_b2y["affinity"],
-        prior_trust=edge_b2y["trust"],
-        prior_summary=edge_b2y.get("summary", "") or "",
-        recent_dialogue=recent_for_update,
-    )
-    append_and_apply(
-        conn,
-        kind="edge_update",
-        payload={
-            "source_id": host_bot_id,
-            "target_id": "you",
-            "chat_id": chat_id,
-            "affinity_delta": update_b2y.affinity_delta,
-            "trust_delta": update_b2y.trust_delta,
-            "knowledge_facts": update_b2y.knowledge_facts,
-            "last_interaction_at": last_at,
-            "last_interaction_chat_id": chat_id,
-        },
-    )
+    personas: dict[str, str] = {
+        host_bot_id: host_bot.get("persona") or "",
+        "you": you_entity.get("persona") or "",
+    }
+    if guest_bot is not None and guest_bot_id is not None:
+        present_ids.append(guest_bot_id)
+        present_names[guest_bot_id] = guest_bot.get("name", "bot")
+        personas[guest_bot_id] = guest_bot.get("persona") or ""

-    edge_y2b = get_edge(conn, "you", host_bot_id) or {
-        "affinity": 50,
-        "trust": 50,
-        "summary": "",
-    }
-    update_y2b = await compute_state_update(
+    prior_edges: dict[tuple[str, str], dict] = {}
+    for src in present_ids:
+        for tgt in present_ids:
+            if src == tgt:
+                continue
+            edge = get_edge(conn, src, tgt) or {
+                "affinity": 50,
+                "trust": 50,
+                "summary": "",
+            }
+            prior_edges[(src, tgt)] = edge
+
+    state_updates = await compute_state_updates_for_present(
         client,
-        model=settings.classifier_model,
-        source_id="you",
-        target_id=host_bot_id,
-        source_name=you_name,
-        source_persona=you_entity.get("persona", "") or "",
-        target_name=host_bot.get("name", "bot"),
-        prior_affinity=edge_y2b["affinity"],
-        prior_trust=edge_y2b["trust"],
-        prior_summary=edge_y2b.get("summary", "") or "",
+        classifier_model=settings.classifier_model,
+        present_ids=present_ids,
+        present_names=present_names,
+        personas=personas,
+        prior_edges=prior_edges,
         recent_dialogue=recent_for_update,
+        timeout_s=settings.classifier_timeout_s,
     )
-    append_and_apply(
-        conn,
-        kind="edge_update",
-        payload={
-            "source_id": "you",
-            "target_id": host_bot_id,
-            "chat_id": chat_id,
-            "affinity_delta": update_y2b.affinity_delta,
-            "trust_delta": update_y2b.trust_delta,
-            "knowledge_facts": update_y2b.knowledge_facts,
-            "last_interaction_at": last_at,
-            "last_interaction_chat_id": chat_id,
-        },
-    )
+    for src_id, tgt_id, update in state_updates:
+        append_and_apply(
+            conn,
+            kind="edge_update",
+            payload={
+                "source_id": src_id,
+                "target_id": tgt_id,
+                "chat_id": chat_id,
+                "affinity_delta": update.affinity_delta,
+                "trust_delta": update.trust_delta,
+                "knowledge_facts": update.knowledge_facts,
+                "last_interaction_at": last_at,
+                "last_interaction_chat_id": chat_id,
+            },
+        )

     return new_text
|
||||
|
||||
|
||||
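Review note: the pair enumeration that feeds `compute_state_updates_for_present` can be sketched in isolation. This is a minimal sketch under stated assumptions, not the PR's code: `directed_pairs`, `prior_edges_for`, and the in-memory `stored` dict are hypothetical stand-ins (the real code reads each edge with `get_edge(conn, src, tgt)`). Only the 50/50 default and the host-first ordering are taken from the hunk above.

```python
from itertools import permutations

# Default edge values mirror the fallback dict used in the hunk above.
DEFAULT_EDGE = {"affinity": 50, "trust": 50, "summary": ""}


def directed_pairs(present_ids: list[str]) -> list[tuple[str, str]]:
    """Return every ordered (source, target) pair with source != target.

    Assumes the ids are unique. permutations() keeps input order, so
    listing the host first yields host->you before you->host.
    """
    return list(permutations(present_ids, 2))


def prior_edges_for(
    present_ids: list[str],
    stored: dict[tuple[str, str], dict],
) -> dict[tuple[str, str], dict]:
    """Look up each directed edge, falling back to a fresh default copy."""
    return {
        pair: stored.get(pair, dict(DEFAULT_EDGE))
        for pair in directed_pairs(present_ids)
    }


two = directed_pairs(["host", "you"])
three = directed_pairs(["host", "you", "guest"])
edges = prior_edges_for(
    ["host", "you", "guest"],
    {("host", "you"): {"affinity": 62, "trust": 55, "summary": "warm"}},
)
```

With two entities the result is the Phase 1 pair order; with a guest present it grows to six directed pairs, which is the fan-out the comment in the hunk describes.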
@@ -0,0 +1,107 @@
"""Parse user-supplied "have they met?" prose into per-direction seed
content for two bots' edges (T38).

Per Requirements §5.2, when two bots first co-appear in a chat, the user
is offered a small drawer asking "Have they met before? If yes, write a
short prose seed describing how." That prose lands here and is parsed
into a :class:`RelationshipSeed` whose two halves populate the
``botA -> botB`` and ``botB -> botA`` edges respectively (summary,
initial knowledge facts, and small affinity/trust deltas around the
default 50/50 baseline).

The two directions can differ: A may know more about B than B knows
about A, or A may trust B less than the reverse. The schema therefore
carries both halves independently.

Empty/whitespace-only prose short-circuits to a default
``RelationshipSeed`` (all zeroes, empty strings); the caller treats
that as "they haven't met" and writes no edge content. The wrapper uses
:func:`chat.llm.classify.classify` with ``default=RelationshipSeed()``
so a flapping classifier degrades to the same no-op rather than
blocking the chat-creation flow (§3.3 graceful-degradation rule).

T42 (the inter-bot relationship drawer) calls this from the route layer.
"""

from __future__ import annotations

from pydantic import BaseModel, Field

from chat.llm.classify import classify
from chat.llm.client import LLMClient


class RelationshipSeed(BaseModel):
    """Structured per-direction seed for two bots' edges.

    Defaults are a deliberate no-op: empty summaries, empty knowledge
    lists, zero deltas. Both the empty-prose short-circuit and the
    classifier-failure fallback return this default so the caller can
    treat them identically.
    """

    a_to_b_summary: str = ""
    a_to_b_knowledge_facts: list[str] = Field(default_factory=list)
    a_to_b_affinity_delta: int = 0  # signed, -10..+10 typical
    a_to_b_trust_delta: int = 0
    b_to_a_summary: str = ""
    b_to_a_knowledge_facts: list[str] = Field(default_factory=list)
    b_to_a_affinity_delta: int = 0
    b_to_a_trust_delta: int = 0
|
||||
|
||||
|
||||
_SYSTEM = (
|
||||
"You parse a short prose seed describing how two characters know each "
|
||||
"other into structured per-direction edge content. For each direction "
|
||||
"(A -> B, B -> A) extract: summary (one sentence from that POV), "
|
||||
"knowledge_facts (list of factual claims that direction can carry "
|
||||
"into future scenes), affinity_delta (-10..+10 — small adjustments to "
|
||||
"the default 50/50 baseline), trust_delta (-10..+10). Default deltas "
|
||||
"to 0 when prose is neutral. The two directions can differ — A may "
|
||||
"trust B more than B trusts A. Output strict JSON matching the schema."
|
||||
)
|
||||
|
||||
|
||||
async def seed_inter_bot_edges(
    client: LLMClient,
    *,
    classifier_model: str,
    bot_a_id: str,
    bot_a_name: str,
    bot_b_id: str,
    bot_b_name: str,
    relationship_prose: str,
    timeout_s: float = 30.0,
) -> RelationshipSeed:
    """Parse user-supplied prose into structured edge content for both
    directed pairs.

    Empty/whitespace prose short-circuits to an empty
    :class:`RelationshipSeed` (the caller treats this as "they haven't
    met" and writes no edge content). Classifier failure also returns
    the default; see the module docstring for the rationale.

    The ``bot_a_id`` / ``bot_b_id`` arguments are accepted for symmetry
    with the caller (T42's drawer route uses them when emitting
    ``edge_update`` events); they're embedded in the prompt alongside
    the names so the classifier can disambiguate when names collide.
    """
    if not relationship_prose or not relationship_prose.strip():
        return RelationshipSeed()
    user = (
        f"Bot A: {bot_a_name} (id={bot_a_id})\n"
        f"Bot B: {bot_b_name} (id={bot_b_id})\n\n"
        f"Prose seed:\n{relationship_prose.strip()}"
    )
    return await classify(
        client,
        model=classifier_model,
        system=_SYSTEM,
        user=user,
        schema=RelationshipSeed,
        default=RelationshipSeed(),
        timeout_s=timeout_s,
    )


__all__ = ["RelationshipSeed", "seed_inter_bot_edges"]
|
||||
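Review note: the two no-op paths described in the module docstring (empty prose, classifier failure) can be exercised without an LLM. The sketch below is illustrative only. `SeedSketch` is a hypothetical dataclass stand-in for `RelationshipSeed`, `seed_or_default` and both stub classifiers are invented for the example, and the `try/except` is an assumption about how the real `classify(..., default=...)` degrades.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SeedSketch:
    """Hypothetical stand-in for RelationshipSeed (same field names)."""

    a_to_b_summary: str = ""
    a_to_b_knowledge_facts: list[str] = field(default_factory=list)
    a_to_b_affinity_delta: int = 0
    a_to_b_trust_delta: int = 0
    b_to_a_summary: str = ""
    b_to_a_knowledge_facts: list[str] = field(default_factory=list)
    b_to_a_affinity_delta: int = 0
    b_to_a_trust_delta: int = 0


async def seed_or_default(prose: str, classify_fn) -> SeedSketch:
    """Return a parsed seed, or the no-op default on empty input or failure."""
    if not prose or not prose.strip():
        return SeedSketch()  # short-circuit: the classifier is never called
    try:
        return await classify_fn(prose.strip())
    except Exception:
        # Assumption: the real classify() returns its `default` on errors.
        return SeedSketch()


async def failing_classifier(_prose: str) -> SeedSketch:
    raise TimeoutError("classifier unavailable")


async def canned_classifier(_prose: str) -> SeedSketch:
    return SeedSketch(a_to_b_summary="Old college friends.", a_to_b_trust_delta=5)


empty = asyncio.run(seed_or_default("   ", canned_classifier))
failed = asyncio.run(seed_or_default("They met once.", failing_classifier))
parsed = asyncio.run(seed_or_default("They met once.", canned_classifier))
```

Because both fallbacks produce a value equal to a freshly constructed default, a caller could detect the no-op with a single equality check against the default instance instead of testing each field.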
@@ -156,64 +156,50 @@ def _read_recent_dialogue(
|
||||
return out
|
||||
|
||||
|
||||
async def apply_scene_close_summary(
|
||||
async def _summarize_and_apply_for_witness(
|
||||
conn: Connection,
|
||||
client: LLMClient,
|
||||
*,
|
||||
classifier_model: str,
|
||||
chat_id: str,
|
||||
scene_id: int,
|
||||
host_bot_id: str,
|
||||
timeout_s: float = 10.0,
|
||||
bot_id: str,
|
||||
you_name: str,
|
||||
dialogue: list[dict],
|
||||
timeout_s: float,
|
||||
) -> ScenePOVSummary:
|
||||
"""Drive the per-POV summary pipeline after ``scene_closed``.
|
||||
"""Run :func:`summarize_scene` for one bot witness and apply the
|
||||
three projected updates (memory pov_summary rewrite, edge summary
|
||||
overwrite, edge knowledge_facts append).
|
||||
|
||||
Steps (Phase 1, single-bot):
|
||||
1. Gather the closing scene's dialogue from the event_log.
|
||||
2. Run :func:`summarize_scene` for the host bot.
|
||||
3. Rewrite each scene-bound memory's ``pov_summary`` via
|
||||
``manual_edit`` (target_kind ``memory_pov_summary``), capturing
|
||||
the prior value for §6.4 reversibility.
|
||||
4. Update the bot->you edge summary via ``manual_edit`` with the
|
||||
new ``edge_summary`` target_kind. v1 combines prior + new by
|
||||
concatenation — the classifier's ``relationship_summary`` is
|
||||
already phrased as a continuation.
|
||||
5. Append any new knowledge_facts to the same edge via
|
||||
``edge_update``.
|
||||
|
||||
Tolerant of missing pieces: no memories -> skip step 3 silently;
|
||||
no edge row -> skip step 4; empty knowledge_facts -> skip step 5.
|
||||
The classifier's empty default flows through harmlessly.
|
||||
Tolerant of missing pieces in the same way Phase 1 was: no memory
|
||||
row -> skip the rewrite; no edge row -> skip the edge_summary write
|
||||
(the empty-default classifier output simply yields no rewrites).
|
||||
"""
|
||||
# Local imports to keep the module-level surface tight and avoid
|
||||
# any chance of a circular dep through chat.state.*.
|
||||
from chat.state.edges import get_edge
|
||||
from chat.state.entities import get_bot, get_you
|
||||
from chat.state.entities import get_bot
|
||||
|
||||
host_bot = get_bot(conn, host_bot_id) or {"name": host_bot_id, "persona": ""}
|
||||
you_entity = get_you(conn) or {"name": "you", "persona": ""}
|
||||
bot = get_bot(conn, bot_id) or {"name": bot_id, "persona": ""}
|
||||
|
||||
dialogue = _read_recent_dialogue(conn, chat_id)
|
||||
|
||||
edge_b2y = get_edge(conn, host_bot_id, "you")
|
||||
edge_b2y = get_edge(conn, bot_id, "you")
|
||||
prior_summary = (edge_b2y or {}).get("summary", "") or ""
|
||||
|
||||
pov = await summarize_scene(
|
||||
client,
|
||||
model=classifier_model,
|
||||
bot_name=host_bot.get("name", host_bot_id),
|
||||
bot_persona=host_bot.get("persona", "") or "",
|
||||
you_name=you_entity.get("name", "you") or "you",
|
||||
bot_name=bot.get("name", bot_id),
|
||||
bot_persona=bot.get("persona", "") or "",
|
||||
you_name=you_name,
|
||||
prior_edge_summary=prior_summary,
|
||||
dialogue=dialogue,
|
||||
timeout_s=timeout_s,
|
||||
)
|
||||
|
||||
# Update memories belonging to the closed scene for the host bot.
|
||||
# Update memories belonging to the closed scene for this witness.
|
||||
cur = conn.execute(
|
||||
"SELECT id, pov_summary FROM memories "
|
||||
"WHERE scene_id = ? AND owner_id = ?",
|
||||
(scene_id, host_bot_id),
|
||||
(scene_id, bot_id),
|
||||
)
|
||||
for memory_id, prior_pov in cur.fetchall():
|
||||
if not pov.summary:
|
||||
@@ -231,7 +217,7 @@ async def apply_scene_close_summary(
|
||||
},
|
||||
)
|
||||
|
||||
# Update the bot->you edge summary if we have an edge row and a
|
||||
# Update this bot->you edge summary if we have an edge row and a
|
||||
# non-empty relationship_summary to merge.
|
||||
if edge_b2y is not None and pov.relationship_summary:
|
||||
new_summary = (
|
||||
@@ -245,7 +231,7 @@ async def apply_scene_close_summary(
|
||||
payload={
|
||||
"target_kind": "edge_summary",
|
||||
"target_id": {
|
||||
"source_id": host_bot_id,
|
||||
"source_id": bot_id,
|
||||
"target_id": "you",
|
||||
},
|
||||
"prior_value": prior_summary,
|
||||
@@ -253,13 +239,13 @@ async def apply_scene_close_summary(
|
||||
},
|
||||
)
|
||||
|
||||
# Append knowledge_facts to the bot->you edge if present.
|
||||
# Append knowledge_facts to this bot->you edge if present.
|
||||
if pov.knowledge_facts:
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="edge_update",
|
||||
payload={
|
||||
"source_id": host_bot_id,
|
||||
"source_id": bot_id,
|
||||
"target_id": "you",
|
||||
"chat_id": chat_id,
|
||||
"knowledge_facts": list(pov.knowledge_facts),
|
||||
@@ -267,3 +253,107 @@ async def apply_scene_close_summary(
|
||||
)
|
||||
|
||||
return pov
|
||||
|
||||
|
||||
async def apply_scene_close_summary(
|
||||
conn: Connection,
|
||||
client: LLMClient,
|
||||
*,
|
||||
classifier_model: str,
|
||||
chat_id: str,
|
||||
scene_id: int,
|
||||
host_bot_id: str,
|
||||
timeout_s: float = 10.0,
|
||||
) -> ScenePOVSummary:
|
||||
"""Drive the per-POV summary pipeline after ``scene_closed``.
|
||||
|
||||
Phase 1 (single-bot) behavior — the host bot is summarized once and
|
||||
the result drives memory + edge rewrites — is preserved exactly when
|
||||
the chat has no guest. T45 extends this to fan out across each
|
||||
present bot witness when a guest is also in the room:
|
||||
|
||||
1. Gather the closing scene's dialogue from the event_log.
|
||||
2. For each present witness (host + guest if any), run
|
||||
:func:`summarize_scene` once with that witness's persona and
|
||||
their own prior ``bot -> you`` edge summary.
|
||||
3. For each witness independently:
|
||||
a. Rewrite each scene-bound memory's ``pov_summary`` via
|
||||
``manual_edit`` (target_kind ``memory_pov_summary``).
|
||||
b. Update that witness's ``bot -> you`` edge summary via
|
||||
``manual_edit`` (target_kind ``edge_summary``). v2 combines
|
||||
prior + classifier ``relationship_summary`` by simple
|
||||
concatenation.
|
||||
c. Append any ``knowledge_facts`` to the same edge via
|
||||
``edge_update``.
|
||||
4. If a ``group_node`` row exists for this chat, append a
|
||||
``group_node_updated`` event whose ``summary`` is the naive
|
||||
per-POV concat ``f"{name}: {summary}\\n\\n..."``. A true
|
||||
LLM-merged group view is deferred to Phase 2.5; ``dynamic``
|
||||
is left empty here for v2 (Phase 3 polishes it).
|
||||
|
||||
The host's :class:`ScenePOVSummary` is returned to preserve the
|
||||
Phase 1 callers' contract.
|
||||
"""
|
||||
# Local imports to keep the module-level surface tight and avoid
|
||||
# any chance of a circular dep through chat.state.*.
|
||||
from chat.state.entities import get_bot, get_you
|
||||
from chat.state.group_node import get_group_node
|
||||
from chat.state.world import get_chat
|
||||
|
||||
you_entity = get_you(conn) or {"name": "you", "persona": ""}
|
||||
you_name = you_entity.get("name", "you") or "you"
|
||||
|
||||
chat = get_chat(conn, chat_id) or {}
|
||||
guest_bot_id = chat.get("guest_bot_id")
|
||||
|
||||
dialogue = _read_recent_dialogue(conn, chat_id)
|
||||
|
||||
host_pov = await _summarize_and_apply_for_witness(
|
||||
conn,
|
||||
client,
|
||||
classifier_model=classifier_model,
|
||||
chat_id=chat_id,
|
||||
scene_id=scene_id,
|
||||
bot_id=host_bot_id,
|
||||
you_name=you_name,
|
||||
dialogue=dialogue,
|
||||
timeout_s=timeout_s,
|
||||
)
|
||||
|
||||
guest_pov: ScenePOVSummary | None = None
|
||||
if guest_bot_id is not None:
|
||||
guest_pov = await _summarize_and_apply_for_witness(
|
||||
conn,
|
||||
client,
|
||||
classifier_model=classifier_model,
|
||||
chat_id=chat_id,
|
||||
scene_id=scene_id,
|
||||
bot_id=guest_bot_id,
|
||||
you_name=you_name,
|
||||
dialogue=dialogue,
|
||||
timeout_s=timeout_s,
|
||||
)
|
||||
|
||||
# Group node update: naive per-POV concat for v2. Only fires when
|
||||
# both POVs ran (i.e. the guest is present) and a group_node row
|
||||
# exists for this chat.
|
||||
if guest_pov is not None and get_group_node(conn, chat_id) is not None:
|
||||
host_bot = get_bot(conn, host_bot_id) or {"name": host_bot_id}
|
||||
guest_bot = get_bot(conn, guest_bot_id) or {"name": guest_bot_id}
|
||||
host_name = host_bot.get("name", host_bot_id) or host_bot_id
|
||||
guest_name = guest_bot.get("name", guest_bot_id) or guest_bot_id
|
||||
group_summary = (
|
||||
f"{host_name}: {host_pov.summary}\n\n"
|
||||
f"{guest_name}: {guest_pov.summary}"
|
||||
)
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="group_node_updated",
|
||||
payload={
|
||||
"chat_id": chat_id,
|
||||
"summary": group_summary,
|
||||
"dynamic": "",
|
||||
},
|
||||
)
|
||||
|
||||
return host_pov
|
||||
|
||||
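Review note: the naive per-POV concatenation used for the group node summary generalizes to any number of witnesses. A minimal sketch, assuming a hypothetical helper `naive_group_summary` and made-up bot names; the PR itself builds the string inline for exactly two witnesses.

```python
def naive_group_summary(povs: list[tuple[str, str]]) -> str:
    """Join (name, summary) pairs as 'name: summary' blocks.

    Blocks are separated by one blank line, matching the two-witness
    f-string in apply_scene_close_summary.
    """
    return "\n\n".join(f"{name}: {summary}" for name, summary in povs)


summary = naive_group_summary(
    [("Ava", "Ava felt welcomed."), ("Ben", "Ben stayed guarded.")]
)
one_empty = naive_group_summary([("Ava", ""), ("Ben", "Ben stayed guarded.")])
```

One thing a reviewer may want to confirm: when a witness's summary is empty (for example after a classifier fallback), this form still writes a bare `Ava: ` prefix into the group summary.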
@@ -66,6 +66,13 @@ def _apply_bot_reset(conn: Connection, e: Event) -> None:
         "DELETE FROM edges WHERE source_id = ? OR target_id = ?",
         (bot_id, bot_id),
     )
+
+    # Phase 2 cascade: clear guest references in other bots' chats so the host
+    # doesn't see a stale guest_bot_id pointing at this (now-purged) bot.
+    conn.execute(
+        "UPDATE chats SET guest_bot_id = NULL WHERE guest_bot_id = ?",
+        (bot_id,),
+    )
|
||||
# NOTE: bots row itself is preserved (identity, kickoff_prose intact).
|
||||
# NOTE: "you" activity (entity_id="you") may linger from a deleted chat;
|
||||
# acceptable for v1 — Phase 1.5 cleanup if needed.
|
||||
|
||||
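Review note: the guest cascade added to `_apply_bot_reset` can be checked against an in-memory SQLite database. This is a sketch under assumptions: the three-column `chats` schema, the sample rows, and the `clear_guest_references` wrapper are hypothetical; only the `UPDATE` statement is copied from the hunk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real chats table has more columns.
conn.execute(
    "CREATE TABLE chats ("
    "id TEXT PRIMARY KEY, host_bot_id TEXT NOT NULL, guest_bot_id TEXT)"
)
conn.executemany(
    "INSERT INTO chats VALUES (?, ?, ?)",
    [("c1", "ava", "ben"), ("c2", "cy", None), ("c3", "cy", "ava")],
)


def clear_guest_references(conn: sqlite3.Connection, bot_id: str) -> int:
    """Null out guest_bot_id wherever it points at the reset bot."""
    cur = conn.execute(
        "UPDATE chats SET guest_bot_id = NULL WHERE guest_bot_id = ?",
        (bot_id,),
    )
    return cur.rowcount  # number of chats that lost their guest


cleared = clear_guest_references(conn, "ben")
rows = dict(conn.execute("SELECT id, guest_bot_id FROM chats ORDER BY id"))
```

Only the chat where the reset bot was a guest is touched; chats hosting other guests keep their `guest_bot_id`.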
@@ -0,0 +1,50 @@
from __future__ import annotations
import json
from sqlite3 import Connection
from chat.eventlog.projector import on
from chat.eventlog.log import Event


@on("group_node_initialized")
def _apply_group_node_initialized(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "INSERT OR REPLACE INTO group_node "
        "(chat_id, members_json, summary, dynamic, threads_json) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            p["chat_id"],
            json.dumps(p["members"]),
            p.get("summary", ""),
            p.get("dynamic", ""),
            json.dumps(p.get("threads", [])),
        ),
    )


@on("group_node_updated")
def _apply_group_node_updated(conn: Connection, e: Event) -> None:
    p = e.payload
    conn.execute(
        "UPDATE group_node SET summary = ?, dynamic = ?, updated_at = datetime('now') "
        "WHERE chat_id = ?",
        (p.get("summary", ""), p.get("dynamic", ""), p["chat_id"]),
    )


def get_group_node(conn: Connection, chat_id: str) -> dict | None:
    row = conn.execute(
        "SELECT chat_id, members_json, summary, dynamic, threads_json, updated_at "
        "FROM group_node WHERE chat_id = ?",
        (chat_id,),
    ).fetchone()
    if not row:
        return None
    return {
        "chat_id": row[0],
        "members": json.loads(row[1]),
        "summary": row[2],
        "dynamic": row[3],
        "threads": json.loads(row[4]),
        "updated_at": row[5],
    }
|
||||
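Review note: the interaction between `INSERT OR REPLACE` in the init handler and `UPDATE` in the update handler can be observed on an in-memory database. This is a sketch, not the project's schema: the `CREATE TABLE` statement and the helper names `initialize`, `update`, and `read_summary` are assumptions made for the example, and `chat_id` is assumed to be the primary key so that the replace semantics apply.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema inferred from the columns the handlers touch.
conn.execute(
    """
    CREATE TABLE group_node (
        chat_id TEXT PRIMARY KEY,
        members_json TEXT NOT NULL,
        summary TEXT NOT NULL DEFAULT '',
        dynamic TEXT NOT NULL DEFAULT '',
        threads_json TEXT NOT NULL DEFAULT '[]',
        updated_at TEXT
    )
    """
)


def initialize(conn: sqlite3.Connection, chat_id: str, members: list[str]) -> None:
    """Mirror of the init handler with empty summary/dynamic/threads."""
    conn.execute(
        "INSERT OR REPLACE INTO group_node "
        "(chat_id, members_json, summary, dynamic, threads_json) "
        "VALUES (?, ?, '', '', '[]')",
        (chat_id, json.dumps(members)),
    )


def update(conn: sqlite3.Connection, chat_id: str, summary: str) -> None:
    """Mirror of the update handler, restricted to the summary column."""
    conn.execute(
        "UPDATE group_node SET summary = ?, updated_at = datetime('now') "
        "WHERE chat_id = ?",
        (summary, chat_id),
    )


def read_summary(conn: sqlite3.Connection, chat_id: str):
    row = conn.execute(
        "SELECT summary FROM group_node WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return row[0] if row else None


members = ["you", "ava", "ben"]
initialize(conn, "c1", members)
update(conn, "c1", "Ava: calm.\n\nBen: wary.")
after_update = read_summary(conn, "c1")
initialize(conn, "c1", members)  # a second init replaces the whole row
after_reinit = read_summary(conn, "c1")
row_count = conn.execute("SELECT COUNT(*) FROM group_node").fetchone()[0]
```

Under these assumptions a second `group_node_initialized` for the same chat would discard the accumulated summary. That would only matter if the init event were ever emitted for a chat that already has a row.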
@@ -29,6 +29,24 @@ def _apply_chat_created(conn: Connection, e: Event) -> None:
     )


+@on("guest_added")
+def _apply_guest_added(conn: Connection, e: Event) -> None:
+    p = e.payload
+    conn.execute(
+        "UPDATE chats SET guest_bot_id = ? WHERE id = ?",
+        (p["guest_bot_id"], p["chat_id"]),
+    )
+
+
+@on("guest_removed")
+def _apply_guest_removed(conn: Connection, e: Event) -> None:
+    p = e.payload
+    conn.execute(
+        "UPDATE chats SET guest_bot_id = NULL WHERE id = ?",
+        (p["chat_id"],),
+    )
+
+
 @on("container_created")
 def _apply_container_created(conn: Connection, e: Event) -> None:
     p = e.payload
|
||||
|
||||
@@ -43,6 +43,101 @@
|
||||
{% endfor %}
|
||||
</section>
|
||||
|
||||
{% if guest_bot %}
|
||||
<section class="drawer-section">
|
||||
<h3>Guest</h3>
|
||||
<p><strong>{{ guest_bot.name }}</strong></p>
|
||||
{% if guest_activity %}
|
||||
<p>{{ guest_activity.posture or "—" }} / {{ (guest_activity.action or {}).verb or "—" }}</p>
|
||||
{% if guest_activity.attention %}<p class="muted">attention: {{ guest_activity.attention }}</p>{% endif %}
|
||||
{% if guest_activity.holding %}<p class="muted">holding: {{ guest_activity.holding|join(", ") }}</p>{% endif %}
|
||||
{% else %}
|
||||
<p class="muted">No activity recorded.</p>
|
||||
{% endif %}
|
||||
|
||||
{% if edge_h2g %}
|
||||
<div class="edge-row">
|
||||
<strong>{{ host_bot.name }} → {{ guest_bot.name }}</strong>
|
||||
<p>Affinity: {{ edge_h2g.affinity }}/100 · Trust: {{ edge_h2g.trust }}/100</p>
|
||||
{% if edge_h2g.knowledge %}
|
||||
<details><summary>Knowledge ({{ edge_h2g.knowledge|length }})</summary>
|
||||
<ul>{% for fact in edge_h2g.knowledge %}<li>{{ fact }}</li>{% endfor %}</ul>
|
||||
</details>
|
||||
{% endif %}
|
||||
</div>
|
||||
{% endif %}
|
||||
{% if edge_g2h %}
|
||||
<div class="edge-row">
|
||||
<strong>{{ guest_bot.name }} → {{ host_bot.name }}</strong>
|
||||
<p>Affinity: {{ edge_g2h.affinity }}/100 · Trust: {{ edge_g2h.trust }}/100</p>
|
||||
{% if edge_g2h.knowledge %}
|
||||
<details><summary>Knowledge ({{ edge_g2h.knowledge|length }})</summary>
|
||||
<ul>{% for fact in edge_g2h.knowledge %}<li>{{ fact }}</li>{% endfor %}</ul>
|
||||
</details>
|
||||
{% endif %}
|
||||
</div>
|
||||
{% endif %}
|
||||
{% if edge_y2g %}
|
||||
<div class="edge-row">
|
||||
<strong>you → {{ guest_bot.name }}</strong>
|
||||
<p>Affinity: {{ edge_y2g.affinity }}/100 · Trust: {{ edge_y2g.trust }}/100</p>
|
||||
</div>
|
||||
{% endif %}
|
||||
{% if edge_g2y %}
|
||||
<div class="edge-row">
|
||||
<strong>{{ guest_bot.name }} → you</strong>
|
||||
<p>Affinity: {{ edge_g2y.affinity }}/100 · Trust: {{ edge_g2y.trust }}/100</p>
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
<form class="inline-edit"
|
||||
hx-post="/chats/{{ chat.id }}/drawer/guest/remove"
|
||||
hx-target="#drawer" hx-swap="innerHTML">
|
||||
<button type="submit">Remove guest</button>
|
||||
</form>
|
||||
</section>
|
||||
{% else %}
|
||||
<section class="drawer-section">
|
||||
<h3>Add guest</h3>
|
||||
{% if available_guests %}
|
||||
<form class="inline-edit"
|
||||
hx-post="/chats/{{ chat.id }}/drawer/guest/add"
|
||||
hx-target="#drawer" hx-swap="innerHTML">
|
||||
<label>
|
||||
Bot:
|
||||
<select name="guest_bot_id" required>
|
||||
{% for b in available_guests %}
|
||||
<option value="{{ b.id }}">{{ b.name }}</option>
|
||||
{% endfor %}
|
||||
</select>
|
||||
</label>
|
||||
<label>
|
||||
Have they met before? Describe how (leave blank if not):
|
||||
<textarea name="relationship_prose" rows="3"
|
||||
placeholder="e.g. Old college friends who studied physics together."></textarea>
|
||||
</label>
|
||||
<button type="submit">Add guest</button>
|
||||
</form>
|
||||
{% else %}
|
||||
<p class="muted">No other bots authored yet.</p>
|
||||
{% endif %}
|
||||
</section>
|
||||
{% endif %}
|
||||
|
||||
{% if group_node %}
|
||||
<section class="drawer-section">
|
||||
<h3>Group</h3>
|
||||
{% if group_node.summary %}
|
||||
<p>{{ group_node.summary }}</p>
|
||||
{% else %}
|
||||
<p class="muted">No group summary yet.</p>
|
||||
{% endif %}
|
||||
{% if group_node.dynamic %}
|
||||
<p class="muted">Dynamic: {{ group_node.dynamic }}</p>
|
||||
{% endif %}
|
||||
</section>
|
||||
{% endif %}
|
||||
|
||||
<section class="drawer-section">
|
||||
<h3>Edges</h3>
|
||||
{% if edge_b2y %}
|
||||
|
||||
+221
-1
@@ -32,9 +32,11 @@ from fastapi.responses import HTMLResponse
|
||||
from fastapi.templating import Jinja2Templates
|
||||
|
||||
from chat.eventlog.log import append_and_apply
|
||||
from chat.services.relationship_seed import seed_inter_bot_edges
|
||||
from chat.services.scene_summarize import apply_scene_close_summary
|
||||
from chat.state.edges import get_edge
|
||||
from chat.state.entities import get_bot, get_you
|
||||
from chat.state.entities import get_bot, get_you, list_bots
|
||||
from chat.state.group_node import get_group_node
|
||||
from chat.state.memory import get_pinned
|
||||
from chat.state.world import active_scene, get_activity, get_chat, get_container
|
||||
from chat.web.bots import get_conn
|
||||
@@ -78,6 +80,32 @@ async def drawer(chat_id: str, request: Request, conn=Depends(get_conn)):
|
||||
edge_b2y = get_edge(conn, chat["host_bot_id"], "you")
|
||||
edge_y2b = get_edge(conn, "you", chat["host_bot_id"])
|
||||
|
||||
# T42: guest + group context. Empty defaults keep the template happy
|
||||
# when no guest is present (the relevant sections render conditionally).
|
||||
guest_bot = None
|
||||
guest_activity = None
|
||||
edge_h2g = None
|
||||
edge_g2h = None
|
||||
edge_y2g = None
|
||||
edge_g2y = None
|
||||
available_guests: list[dict] = []
|
||||
group_node = None
|
||||
if chat.get("guest_bot_id"):
|
||||
guest_bot_id = chat["guest_bot_id"]
|
||||
guest_bot = get_bot(conn, guest_bot_id)
|
||||
guest_activity = get_activity(conn, guest_bot_id)
|
||||
edge_h2g = get_edge(conn, chat["host_bot_id"], guest_bot_id)
|
||||
edge_g2h = get_edge(conn, guest_bot_id, chat["host_bot_id"])
|
||||
edge_y2g = get_edge(conn, "you", guest_bot_id)
|
||||
edge_g2y = get_edge(conn, guest_bot_id, "you")
|
||||
else:
|
||||
# Candidates for the "Add guest" dropdown — every authored bot
|
||||
# except the host (and "you", which is implicit, never a bot row).
|
||||
available_guests = [
|
||||
b for b in list_bots(conn) if b["id"] != chat["host_bot_id"]
|
||||
]
|
||||
group_node = get_group_node(conn, chat_id)
|
||||
|
||||
# Recent memories from host's POV (witness_host = 1), most recent first.
|
||||
# Raw query keeps this read self-contained — no projector helper exposes
|
||||
# "latest N for an owner" yet and the drawer is the only consumer.
|
||||
@@ -117,6 +145,14 @@ async def drawer(chat_id: str, request: Request, conn=Depends(get_conn)):
|
||||
"bot_activity": bot_activity,
|
||||
"edge_b2y": edge_b2y,
|
||||
"edge_y2b": edge_y2b,
|
||||
"guest_bot": guest_bot,
|
||||
"guest_activity": guest_activity,
|
||||
"edge_h2g": edge_h2g,
|
||||
"edge_g2h": edge_g2h,
|
||||
"edge_y2g": edge_y2g,
|
||||
"edge_g2y": edge_g2y,
|
||||
"available_guests": available_guests,
|
||||
"group_node": group_node,
|
||||
"recent_memories": recent_memories,
|
||||
"pinned": pinned,
|
||||
"pin_cap": PIN_CAP,
|
||||
@@ -304,3 +340,187 @@ async def toggle_memory_pin(
|
||||
},
|
||||
)
|
||||
return await drawer(chat_id, request, conn)
|
||||
|
||||
|
||||
# --- T42 guest add/remove -------------------------------------------------
|
||||
#
|
||||
# Adding a guest fans out into up to four events: a ``guest_added`` to flip
|
||||
# ``chats.guest_bot_id``, two ``edge_update`` events seeded from the
|
||||
# user-supplied prose (skipped when the prose is empty / the seed comes back
|
||||
# default), and a ``group_node_initialized`` if no row exists yet — three
|
||||
# entities now share the chat so the §8.4 group node becomes meaningful.
|
||||
#
|
||||
# Removing a guest first emits ``scene_closed`` for the active scene (so any
|
||||
# host -> you scene closes cleanly with the guest still in scope) before
|
||||
# clearing the guest_bot_id; per spec the next user message implicitly opens
|
||||
# a fresh you+host scene via Phase 1's mid-chat reset behavior.
|
||||
|
||||
|
||||
def _seed_is_default(seed) -> bool:
|
||||
"""Treat a seed as a no-op when both summaries are empty AND both
|
||||
delta pairs are zero AND both fact lists are empty.
|
||||
"""
|
||||
return (
|
||||
not seed.a_to_b_summary
|
||||
and not seed.b_to_a_summary
|
||||
and seed.a_to_b_affinity_delta == 0
|
||||
and seed.a_to_b_trust_delta == 0
|
||||
and seed.b_to_a_affinity_delta == 0
|
||||
and seed.b_to_a_trust_delta == 0
|
||||
and not seed.a_to_b_knowledge_facts
|
||||
and not seed.b_to_a_knowledge_facts
|
||||
)
|
||||
|
||||
|
||||
@router.post(
    "/chats/{chat_id}/drawer/guest/add",
    response_class=HTMLResponse,
)
async def add_guest(
    chat_id: str,
    request: Request,
    guest_bot_id: str = Form(...),
    relationship_prose: str = Form(""),
    conn=Depends(get_conn),
    client=Depends(get_llm_client),
):
    chat = get_chat(conn, chat_id)
    if chat is None:
        raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")

    if chat.get("guest_bot_id") is not None:
        raise HTTPException(
            status_code=400,
            detail="a guest is already present in this chat",
        )

    if guest_bot_id == chat["host_bot_id"]:
        raise HTTPException(
            status_code=400, detail="guest must differ from host"
        )

    guest_bot = get_bot(conn, guest_bot_id)
    if guest_bot is None:
        raise HTTPException(
            status_code=404, detail=f"guest bot not found: {guest_bot_id}"
        )

    host_bot = get_bot(conn, chat["host_bot_id"])
    if host_bot is None:
        raise HTTPException(
            status_code=404,
            detail=f"host bot not found: {chat['host_bot_id']}",
        )

    settings = request.app.state.settings
    seed = await seed_inter_bot_edges(
        client,
        classifier_model=settings.classifier_model,
        bot_a_id=chat["host_bot_id"],
        bot_a_name=host_bot["name"],
        bot_b_id=guest_bot_id,
        bot_b_name=guest_bot["name"],
        relationship_prose=relationship_prose,
        timeout_s=settings.classifier_timeout_s,
    )

    append_and_apply(
        conn,
        kind="guest_added",
        payload={"chat_id": chat_id, "guest_bot_id": guest_bot_id},
    )

    # Emit edge_update only when the seed carries content. Empty prose
    # short-circuits inside ``seed_inter_bot_edges`` to a default seed,
    # so this skips the two extra log entries on the no-prose path.
    # NOTE: ``_apply_edge_update`` does not accept a ``summary`` field —
    # per-direction summary is set via the per-pov scene-close path
    # (T27), not direct edge_update. We therefore drop seed.*_summary
    # here; the deltas + knowledge_facts are what materializes.
    if not _seed_is_default(seed):
        append_and_apply(
            conn,
            kind="edge_update",
            payload={
                "source_id": chat["host_bot_id"],
                "target_id": guest_bot_id,
                "chat_id": chat_id,
                "affinity_delta": seed.a_to_b_affinity_delta,
                "trust_delta": seed.a_to_b_trust_delta,
                "knowledge_facts": seed.a_to_b_knowledge_facts,
                "last_interaction_at": chat.get("time"),
                "last_interaction_chat_id": chat_id,
            },
        )
        append_and_apply(
            conn,
            kind="edge_update",
            payload={
                "source_id": guest_bot_id,
                "target_id": chat["host_bot_id"],
                "chat_id": chat_id,
                "affinity_delta": seed.b_to_a_affinity_delta,
                "trust_delta": seed.b_to_a_trust_delta,
                "knowledge_facts": seed.b_to_a_knowledge_facts,
                "last_interaction_at": chat.get("time"),
                "last_interaction_chat_id": chat_id,
            },
        )

    # Three entities now share the chat (you, host, guest) — initialize
    # the group node row if Wave 1's reader doesn't see one yet.
    if get_group_node(conn, chat_id) is None:
        append_and_apply(
            conn,
            kind="group_node_initialized",
            payload={
                "chat_id": chat_id,
                "members": ["you", chat["host_bot_id"], guest_bot_id],
                "summary": "",
                "dynamic": "",
                "threads": [],
            },
        )

    return await drawer(chat_id, request, conn)


@router.post(
    "/chats/{chat_id}/drawer/guest/remove",
    response_class=HTMLResponse,
)
async def remove_guest(
    chat_id: str,
    request: Request,
    conn=Depends(get_conn),
):
    chat = get_chat(conn, chat_id)
    if chat is None:
        raise HTTPException(status_code=404, detail=f"chat not found: {chat_id}")

    if chat.get("guest_bot_id") is None:
        raise HTTPException(
            status_code=400, detail="no guest present in this chat"
        )

    # Close the active scene (if any) before flipping guest_bot_id so
    # the scene record carries the guest as a participant.
    scene = active_scene(conn, chat_id)
    if scene is not None:
        append_and_apply(
            conn,
            kind="scene_closed",
            payload={
                "scene_id": scene["id"],
                "ended_at": chat.get("time"),
                "significance": 0,
            },
        )

    append_and_apply(
        conn,
        kind="guest_removed",
        payload={"chat_id": chat_id},
    )

    return await drawer(chat_id, request, conn)

+418
-137
@@ -1,32 +1,47 @@
 """POST ``/chats/<id>/turns`` — narrative turn flow with SSE streaming.

 The turn flow strings together the pieces built in T17 (turn parser), T18
-(prompt assembler), and T16 (SSE channel):
+(prompt assembler), and T16 (SSE channel). Phase 2 (T44) extends it to
+multi-entity scenes with optional guest support and a follow-on
+interjection beat.

 1. Parse the user's prose with the classifier into typed segments.
 2. Append a ``user_turn`` event capturing both the original prose and the
    parsed segments.
 3. Append a placeholder ``assistant_turn_started`` marker so observers know
    a response is in flight.
-4. Build the narrative prompt, dropping OOC segments before they reach the
-   bot (per Requirements §6.1 the OOC convention is for the author to talk
-   to the system, not to the in-fiction bot).
-5. Stream tokens from the LLM, broadcasting each chunk over the chat's SSE
+4. Detect the addressee (host vs. guest) from the prose using a simple
+   word-boundary substring match — see :func:`_detect_addressee_id`.
+5. Build the narrative prompt for the addressee, dropping OOC segments
+   before they reach the bot (per Requirements §6.1 the OOC convention is
+   for the author to talk to the system, not to the in-fiction bot).
+6. Stream tokens from the LLM, broadcasting each chunk over the chat's SSE
    channel as a ``token`` event so any subscribed browser tab sees them
    arrive in real time.
-6. On stream complete, append an ``assistant_turn`` event with the full
+7. On stream complete, append an ``assistant_turn`` event with the full
    text and ``truncated=False``. Then run a post-turn state-update pass
    (Requirements §3.4): one classifier call per directed edge between
    present entities, each producing an ``edge_update`` event with
-   affinity/trust/knowledge deltas. Finally publish a ``turn_html``
-   event with a ready-to-swap HTML fragment so HTMX's SSE extension can
-   append it to the timeline without a page reload.
-7. Return ``204 No Content`` — the SSE channel is the real conveyor of
-   state, not the POST response body.
+   affinity/trust/knowledge deltas.
+8. When a guest is present, run the interjection classifier (§6.2). If it
+   fires we stream a second narrative as the silent witness, append a
+   second ``assistant_turn`` event linked to the same ``user_turn_id``,
+   and re-run memory + state-update for the interjector. The same
+   in-flight task covers both halves so cancel collapses both.
+9. Scene-close detection runs after the (primary + optional interjection)
+   beats land so the close summary sees the full closing scene. T45's
+   guest-aware ``apply_scene_close_summary`` writes per-POV summaries for
+   each present witness.
+10. Publish a ``turn_html`` event for each turn so HTMX's SSE extension
+    can append it to the timeline without a page reload.
+11. Return ``204 No Content`` — the SSE channel is the real conveyor of
+    state, not the POST response body.

 Errors during streaming flip the assistant_turn's ``truncated`` flag to
 ``True`` and we still commit what we received. ``asyncio.CancelledError``
 is treated identically and re-raised after recording the partial turn.
+A cancellation mid-interjection skips the interjector's state/memory
+follow-up so we don't run classifiers against a half-formed beat.
 """

 from __future__ import annotations
@@ -34,18 +49,20 @@ from __future__ import annotations
 import asyncio
 import html
 import json
+import re

 from fastapi import APIRouter, Depends, Form, HTTPException, Request
 from fastapi.responses import HTMLResponse, RedirectResponse, Response

 from chat.eventlog.log import append_and_apply, append_event
 from chat.services.background import SignificanceJob
-from chat.services.memory_write import record_turn_memory
+from chat.services.interjection import detect_interjection
+from chat.services.memory_write import record_turn_memory_for_present
+from chat.services.multi_state_update import compute_state_updates_for_present
 from chat.services.prompt import assemble_narrative_prompt
 from chat.services.rewind import compute_rewind_preview, execute_rewind
 from chat.services.scene_close import detect_scene_close
 from chat.services.scene_summarize import apply_scene_close_summary
-from chat.services.state_update import compute_state_update
 from chat.services.turn_parse import ParsedTurn, parse_turn
 from chat.state.edges import get_edge
 from chat.state.entities import get_bot, get_you
@@ -114,6 +131,84 @@ def _read_recent_dialogue(conn, chat_id: str, limit: int = 200) -> list[dict]:
     return out


+def _detect_addressee_id(
+    prose: str, host_bot: dict, guest_bot: dict | None
+) -> str:
+    """Return the bot id of the addressee for ``prose``.
+
+    Phase 2 v1 uses a simple case-insensitive whole-word match. The host
+    is the default — addressee flips to guest only when the guest's name
+    appears in the prose AND the host's does not. If both names match
+    or neither matches, the host keeps the floor. This bias keeps the
+    primary speaker stable across ambiguous prose; the interjection
+    branch (later in the turn flow) is how the silent witness gets a word
+    in edgewise when warranted.
+    """
+    if guest_bot is None:
+        return host_bot["id"]
+    host_name = host_bot.get("name") or ""
+    guest_name = guest_bot.get("name") or ""
+    host_match = bool(
+        host_name
+        and re.search(rf"\b{re.escape(host_name)}\b", prose, re.IGNORECASE)
+    )
+    guest_match = bool(
+        guest_name
+        and re.search(rf"\b{re.escape(guest_name)}\b", prose, re.IGNORECASE)
+    )
+    if guest_match and not host_match:
+        return guest_bot["id"]
+    return host_bot["id"]
+
+
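The host-biased rule in `_detect_addressee_id` can be exercised on its own. The sketch below is a simplified restatement under stated assumptions: it takes plain name strings instead of bot dicts, and `name_in_prose` and `pick_addressee` are illustrative names that do not exist in the PR.

```python
import re


def name_in_prose(name: str, prose: str) -> bool:
    """Case-insensitive whole-word match; an empty name never matches."""
    return bool(
        name and re.search(rf"\b{re.escape(name)}\b", prose, re.IGNORECASE)
    )


def pick_addressee(prose: str, host_name: str, guest_name: str) -> str:
    """Return "guest" only when the guest is named and the host is not."""
    host_match = name_in_prose(host_name, prose)
    guest_match = name_in_prose(guest_name, prose)
    return "guest" if guest_match and not host_match else "host"


cases = [
    "mira, pass the salt",      # guest only, lower-case: guest
    "Ann and Mira, listen up",  # both names: host keeps the floor
    "Nobody in particular",     # neither name: host keeps the floor
    "The admiral salutes",      # "mira" inside "admiral" is not a whole word
]
results = [pick_addressee(c, host_name="Ann", guest_name="Mira") for c in cases]
print(results)
```

The last case is the reason for the `\b` anchors: a bare substring test would hand the floor to the guest whenever their name happens to appear inside a longer word.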
+def _gather_state_update_inputs(
+    conn,
+    *,
+    host_bot: dict,
+    guest_bot: dict | None,
+    you_entity: dict,
+) -> tuple[list[str], dict[str, str], dict[str, str], dict[tuple[str, str], dict]]:
+    """Collect ``(present_ids, present_names, personas, prior_edges)`` for
+    a multi-entity state-update pass.
+
+    Phase 2 v1 always pairs ``you`` with the host and (when present) the
+    guest. ``prior_edges`` falls back to the schema default 50/50 baseline
+    when no row exists yet — that mirrors the Phase 1 single-pair flow.
+
+    Order matters: the host comes first so the directed-pair iteration
+    in :func:`compute_state_updates_for_present` matches the Phase 1
+    sequence (host->you, then you->host). Existing tests pin the canned-
+    response queue to that order — keeping it stable means we don't
+    have to reshuffle test fixtures across the Phase 2 cutover.
+    """
+    present_ids: list[str] = [host_bot["id"], "you"]
+    present_names: dict[str, str] = {
+        host_bot["id"]: host_bot["name"],
+        "you": you_entity.get("name") or "you",
+    }
+    personas: dict[str, str] = {
+        host_bot["id"]: host_bot.get("persona") or "",
+        "you": you_entity.get("persona") or "",
+    }
+    if guest_bot is not None:
+        present_ids.append(guest_bot["id"])
+        present_names[guest_bot["id"]] = guest_bot["name"]
+        personas[guest_bot["id"]] = guest_bot.get("persona") or ""
+
+    prior_edges: dict[tuple[str, str], dict] = {}
+    for src in present_ids:
+        for tgt in present_ids:
+            if src == tgt:
+                continue
+            edge = get_edge(conn, src, tgt) or {
+                "affinity": 50,
+                "trust": 50,
+                "summary": "",
+            }
+            prior_edges[(src, tgt)] = edge
+    return present_ids, present_names, personas, prior_edges
+
+
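To make the "2 pairs for 1:1, 6 for three entities" fan-out concrete, here is a small sketch of the directed-pair enumeration and the 50/50 fallback. It assumes an in-memory dict (`stored`) in place of `get_edge`, and the helper names are illustrative; whether `compute_state_updates_for_present` walks pairs in exactly this order is not visible in this diff.

```python
from itertools import permutations

# Baseline used when no edge row exists yet, as in the function above.
DEFAULT_EDGE = {"affinity": 50, "trust": 50, "summary": ""}


def directed_pairs(present_ids):
    """All ordered (source, target) pairs with source != target."""
    return list(permutations(present_ids, 2))


def prior_edges_for(present_ids, stored):
    """Look up each directed edge, falling back to the 50/50 baseline."""
    return {
        pair: stored.get(pair) or dict(DEFAULT_EDGE)
        for pair in directed_pairs(present_ids)
    }


one_on_one = directed_pairs(["host", "you"])
three_way = directed_pairs(["host", "you", "guest"])

# Hypothetical stored state: only host -> you has a row so far.
stored = {("host", "you"): {"affinity": 62, "trust": 55, "summary": "warming up"}}
edges = prior_edges_for(["host", "you", "guest"], stored)
print(len(one_on_one), len(three_way))
```

`permutations(ids, 2)` yields pairs in the same order as the nested `for src / for tgt` loop, so with the host listed first the sequence begins host->you, which is the ordering the docstring says existing tests depend on.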
 @router.post("/chats/{chat_id}/turns")
 async def post_turn(
     chat_id: str,
@@ -137,6 +232,15 @@ async def post_turn(
             detail=f"host bot not found: {chat['host_bot_id']}",
         )

+    guest_bot = None
+    guest_bot_id = chat.get("guest_bot_id")
+    if guest_bot_id is not None:
+        guest_bot = get_bot(conn, guest_bot_id)
+        # If the chat references a deleted guest we degrade to single-bot
+        # rather than 404 — the chat is still usable as a 1:1.
+        if guest_bot is None:
+            guest_bot_id = None
+
     settings = request.app.state.settings

     # 1. Parse turn (classifier).
@@ -156,7 +260,16 @@ async def post_turn(
         },
     )

-    # 3. Append assistant_turn_started placeholder. ``user_turn``,
+    # 3. Determine the addressee. Done before assistant_turn_started so the
+    # placeholder reflects the bot the user is actually talking to (host
+    # in 1:1, host-or-guest in multi-entity).
+    addressee_id = _detect_addressee_id(prose, host_bot, guest_bot)
+    addressee_bot = (
+        guest_bot if (guest_bot is not None and addressee_id == guest_bot["id"])
+        else host_bot
+    )
+
+    # 4. Append assistant_turn_started placeholder. ``user_turn``,
     # ``assistant_turn_started``, and ``assistant_turn`` have no registered
     # projector handlers — they live in the event_log purely for transcript
     # rendering — so we don't call ``project`` here. (Re-projecting now would
@@ -166,12 +279,15 @@ async def post_turn(
         kind="assistant_turn_started",
         payload={
             "chat_id": chat_id,
-            "speaker_id": host_bot["id"],
+            "speaker_id": addressee_bot["id"],
             "user_turn_id": user_turn_event_id,
         },
     )

-    # 4. Build the narrative prompt.
+    # 5. Build the narrative prompt for the addressee. ``guest_id`` is
+    # passed explicitly so the prompt assembler renders the guest's
+    # activity / group-node block when applicable. The assembler is
+    # tolerant of ``guest_id is None`` so this is a no-op for 1:1 chats.
     recent = _read_recent_dialogue(conn, chat_id, limit=20)
     # Drop the just-appended user turn from ``recent`` — it's passed as
     # ``user_turn_prose`` to the assembler and would otherwise duplicate.
@@ -180,189 +296,327 @@ async def post_turn(
     messages = assemble_narrative_prompt(
         conn,
         chat_id=chat_id,
-        speaker_bot_id=host_bot["id"],
+        speaker_bot_id=addressee_bot["id"],
         user_turn_prose=prompt_prose if prompt_prose else None,
         recent_dialogue=recent,
         budget_soft=settings.narrative_budget_soft,
         budget_hard=settings.narrative_budget_hard,
+        guest_id=guest_bot_id,
     )

-    # 5. Stream and accumulate tokens. The stream runs as a Task so the
+    # 6. Stream and accumulate tokens. The stream runs as a Task so the
     # /turns/cancel route can invoke ``Task.cancel()`` to abort it
     # mid-stream. ``accumulated`` is a closure over the inner coroutine,
     # so when the await on ``stream_task`` raises CancelledError below
     # we still see whatever tokens were appended before cancellation.
-    accumulated: list[str] = []
-    truncated = False
+    primary_accumulated: list[str] = []
+    primary_truncated = False
     cancelled = False

-    async def _stream() -> None:
+    async def _stream_primary() -> None:
         async for chunk in client.stream(
             messages,
             model=settings.narrative_model,
             max_tokens=settings.narrative_max_tokens,
             temperature=settings.narrative_temperature,
         ):
-            accumulated.append(chunk)
+            primary_accumulated.append(chunk)
             await publish(
                 chat_id,
                 {
                     "event": "token",
                     "text": chunk,
-                    "speaker_id": host_bot["id"],
+                    "speaker_id": addressee_bot["id"],
                 },
             )

-    stream_task = asyncio.create_task(_stream())
+    stream_task = asyncio.create_task(_stream_primary())
     _in_flight_tasks[chat_id] = stream_task
     try:
         await stream_task
     except asyncio.CancelledError:
         # Preserve the partial output before letting the cancellation
         # propagate so the transcript reflects what the user actually saw.
-        truncated = True
+        primary_truncated = True
         cancelled = True
     except Exception:
         # Surface as a truncated turn rather than losing the partial output.
-        truncated = True
+        primary_truncated = True
     finally:
         # Always unregister so a subsequent turn can register a fresh task.
         _in_flight_tasks.pop(chat_id, None)

-    full_text = "".join(accumulated)
+    primary_text = "".join(primary_accumulated)

-    # 6. Append the assistant_turn with the final text. (See note above on
+    # 7. Append the assistant_turn with the final text. (See note above on
     # why we skip ``project`` for these transcript-only event kinds.)
     append_event(
         conn,
         kind="assistant_turn",
         payload={
             "chat_id": chat_id,
-            "speaker_id": host_bot["id"],
-            "text": full_text,
-            "truncated": truncated,
+            "speaker_id": addressee_bot["id"],
+            "text": primary_text,
+            "truncated": primary_truncated,
             "user_turn_id": user_turn_event_id,
         },
     )

-    # 6a. Per-turn memory write (Plan §11.1, T21). Phase 1 single-bot:
-    # only the host bot has a memory store, witness flags are
-    # ``[you=1, host=1, guest=0]``, and ``pov_summary`` is the raw
-    # narrative text (T27 will rewrite at scene close). Significance
-    # defaults to 1; T22's async classifier pass will overwrite it.
+    # 7a. Per-turn memory write (Plan §11.1, T21 / T41). With a guest
+    # present this fans out to one ``memory_written`` event per witness
+    # (host + guest); without a guest it preserves the Phase 1 single
+    # write keyed on the host. Witness flags are set inside the helper.
     scene = active_scene(conn, chat_id)
-    _event_id, memory_id = record_turn_memory(
+    memory_results = record_turn_memory_for_present(
         conn,
         chat_id=chat_id,
         host_bot_id=host_bot["id"],
-        narrative_text=full_text,
+        guest_bot_id=guest_bot_id,
+        narrative_text=primary_text,
         scene_id=scene["id"] if scene else None,
         chat_clock_at=chat.get("time"),
     )

-    # 6b. Post-turn state-update pass (Requirements §3.4). For Phase 1
-    # the only present entities are ``you`` and ``host_bot`` so we run
-    # two classifier calls — one per directed edge — and append the
-    # resulting ``edge_update`` events. The recent-dialogue slice is
-    # re-read here so the pass sees the just-appended assistant turn.
-    # We use ``append_and_apply`` (vs append + project) because the
-    # edge_update handler is *not* replay-safe: re-projecting prior
-    # events would re-apply their deltas on top of the live row.
-    recent_for_update = _read_recent_dialogue(conn, chat_id, limit=10)
+    # 7b. Post-turn state-update pass (Requirements §3.4 / T40). All
+    # directed pairs over the present entities — 2 pairs for 1:1, 6 for
+    # 3-entity scenes. Run sequentially via the inner helper which honors
+    # the Featherless 2-conn cap.
     you_entity = get_you(conn) or {"name": "you", "persona": ""}
     last_at = chat.get("time")
+    recent_for_update = _read_recent_dialogue(conn, chat_id, limit=10)

-    edge_b2y = get_edge(conn, host_bot["id"], "you") or {
-        "affinity": 50,
-        "trust": 50,
-        "summary": "",
-    }
-    update_b2y = await compute_state_update(
+    present_ids, present_names, personas, prior_edges = (
+        _gather_state_update_inputs(
+            conn,
+            host_bot=host_bot,
+            guest_bot=guest_bot,
+            you_entity=you_entity,
+        )
+    )
+
+    state_updates = await compute_state_updates_for_present(
         client,
-        model=settings.classifier_model,
-        source_id=host_bot["id"],
-        target_id="you",
-        source_name=host_bot["name"],
-        source_persona=host_bot.get("persona", ""),
-        target_name=you_entity.get("name", "you"),
-        prior_affinity=edge_b2y["affinity"],
-        prior_trust=edge_b2y["trust"],
-        prior_summary=edge_b2y.get("summary", "") or "",
+        classifier_model=settings.classifier_model,
+        present_ids=present_ids,
+        present_names=present_names,
+        personas=personas,
+        prior_edges=prior_edges,
         recent_dialogue=recent_for_update,
+        timeout_s=settings.classifier_timeout_s,
     )
-    append_and_apply(
-        conn,
-        kind="edge_update",
-        payload={
-            "source_id": host_bot["id"],
-            "target_id": "you",
-            "chat_id": chat_id,
-            "affinity_delta": update_b2y.affinity_delta,
-            "trust_delta": update_b2y.trust_delta,
-            "knowledge_facts": update_b2y.knowledge_facts,
-            "last_interaction_at": last_at,
-            "last_interaction_chat_id": chat_id,
-        },
-    )
+    for src_id, tgt_id, update in state_updates:
+        append_and_apply(
+            conn,
+            kind="edge_update",
+            payload={
+                "source_id": src_id,
+                "target_id": tgt_id,
+                "chat_id": chat_id,
+                "affinity_delta": update.affinity_delta,
+                "trust_delta": update.trust_delta,
+                "knowledge_facts": update.knowledge_facts,
+                "last_interaction_at": last_at,
+                "last_interaction_chat_id": chat_id,
+            },
+        )

-    edge_y2b = get_edge(conn, "you", host_bot["id"]) or {
-        "affinity": 50,
-        "trust": 50,
-        "summary": "",
-    }
-    update_y2b = await compute_state_update(
-        client,
-        model=settings.classifier_model,
-        source_id="you",
-        target_id=host_bot["id"],
-        source_name=you_entity.get("name", "you"),
-        source_persona=you_entity.get("persona", "") or "",
-        target_name=host_bot["name"],
-        prior_affinity=edge_y2b["affinity"],
-        prior_trust=edge_y2b["trust"],
-        prior_summary=edge_y2b.get("summary", "") or "",
-        recent_dialogue=recent_for_update,
-    )
-    append_and_apply(
-        conn,
-        kind="edge_update",
-        payload={
-            "source_id": "you",
-            "target_id": host_bot["id"],
-            "chat_id": chat_id,
-            "affinity_delta": update_y2b.affinity_delta,
-            "trust_delta": update_y2b.trust_delta,
-            "knowledge_facts": update_y2b.knowledge_facts,
-            "last_interaction_at": last_at,
-            "last_interaction_chat_id": chat_id,
-        },
-    )
-
-    # 6c. Enqueue the async significance pass (Plan §11.1, T22). The
+    # 7c. Enqueue the async significance pass (Plan §11.1, T22). The
     # worker scores the just-written memory 0-3, updates significance,
     # and auto-pins on score 3 with the §8.5 soft-cap eviction rule.
     # Enqueued before the broadcast so it's outstanding by the time the
     # client sees ``turn_html`` — but the worker is async, so the user
     # never blocks on it.
+    # Phase 2 picks the host's memory id as the canonical input — guest
+    # POV memories piggyback on the same significance score (the prose
+    # they record is identical for v2; per-POV rewrite happens at scene
+    # close in T45 and downstream-of-significance).
     worker = getattr(request.app.state, "background_worker", None)
-    if worker is not None and memory_id is not None:
+    host_event_memory = memory_results.get(host_bot["id"])
+    host_memory_id = host_event_memory[1] if host_event_memory else None
+    if worker is not None and host_memory_id is not None:
         worker.enqueue(
             SignificanceJob(
-                memory_id=memory_id,
-                narrative_text=full_text,
+                memory_id=host_memory_id,
+                narrative_text=primary_text,
                 prior_dialogue=recent_for_update,
                 host_bot_id=host_bot["id"],
             )
         )

-    # 6d. Scene-close detection (Plan §7.2, T26). Runs AFTER assistant_turn
-    # so the bot's response is the closing scene's final beat — closing
-    # before narrative would force the bot to speak "in no scene", which
-    # is awkward. Hard signals only in Phase 1: container change parsed
-    # from prose, or explicit "fade out" / "we're done here" patterns.
-    # On classifier failure the service returns ``should_close=False``
-    # so the turn flow keeps moving; the manual close button in the
-    # drawer is the always-available fallback.
+    # 8. Interjection branch (T39 / T44). Only fires when the chat has a
+    # guest AND the addressee was the bot we *can* interject for (i.e.
+    # not the lone bot in a 1:1 chat). The silent witness is whichever
+    # bot didn't get the addressee slot. We only run this when the
+    # primary stream actually completed — a cancelled or errored primary
+    # short-circuits the follow-on so we don't classifier-spam against a
+    # half-formed beat.
+    interjection_text: str | None = None
+    interjection_speaker_id: str | None = None
+    interjection_truncated = False
+    if (
+        guest_bot is not None
+        and not cancelled
+        and not primary_truncated
+        and primary_text.strip()
+    ):
+        # Identify the silent witness — the bot that is NOT the addressee.
+        if addressee_id == host_bot["id"]:
+            silent_witness = guest_bot
+        else:
+            silent_witness = host_bot
+
+        edge_w_to_addr = get_edge(
+            conn, silent_witness["id"], addressee_bot["id"]
+        ) or {"affinity": 50, "trust": 50, "summary": ""}
+        edge_w_to_you = get_edge(conn, silent_witness["id"], "you") or {
+            "affinity": 50,
+            "trust": 50,
+            "summary": "",
+        }
+
+        decision = await detect_interjection(
+            client,
+            classifier_model=settings.classifier_model,
+            addressee_name=addressee_bot["name"],
+            addressee_just_said=primary_text,
+            silent_witness_name=silent_witness["name"],
+            silent_witness_persona=silent_witness.get("persona") or "",
+            silent_witness_edge_to_addressee=edge_w_to_addr,
+            silent_witness_edge_to_you=edge_w_to_you,
+            you_just_said=prose,
+            timeout_s=settings.classifier_timeout_s,
+        )
+
+        if decision.should_interject:
+            interjection_speaker_id = silent_witness["id"]
+
+            # Re-read recent_dialogue so the just-appended assistant_turn
+            # (the addressee's beat) is in the prompt context.
+            interject_recent = _read_recent_dialogue(conn, chat_id, limit=20)
+            if interject_recent and interject_recent[-1].get("speaker") == "you":
+                interject_recent = interject_recent[:-1]
+            interject_messages = assemble_narrative_prompt(
+                conn,
+                chat_id=chat_id,
+                speaker_bot_id=silent_witness["id"],
+                addressee=addressee_bot["id"],
+                user_turn_prose=prompt_prose if prompt_prose else None,
+                recent_dialogue=interject_recent,
+                budget_soft=settings.narrative_budget_soft,
+                budget_hard=settings.narrative_budget_hard,
+                guest_id=guest_bot_id,
+            )
+
+            interject_accumulated: list[str] = []
+
+            async def _stream_interjection() -> None:
+                async for chunk in client.stream(
+                    interject_messages,
+                    model=settings.narrative_model,
+                    max_tokens=settings.narrative_max_tokens,
+                    temperature=settings.narrative_temperature,
+                ):
+                    interject_accumulated.append(chunk)
+                    await publish(
+                        chat_id,
+                        {
+                            "event": "token",
+                            "text": chunk,
+                            "speaker_id": silent_witness["id"],
+                        },
+                    )
+
+            interject_task = asyncio.create_task(_stream_interjection())
+            _in_flight_tasks[chat_id] = interject_task
+            try:
+                await interject_task
+            except asyncio.CancelledError:
+                interjection_truncated = True
+                cancelled = True
+            except Exception:
+                interjection_truncated = True
+            finally:
+                _in_flight_tasks.pop(chat_id, None)
+
+            interjection_text = "".join(interject_accumulated)
+
+            append_event(
+                conn,
+                kind="assistant_turn",
+                payload={
+                    "chat_id": chat_id,
+                    "speaker_id": silent_witness["id"],
+                    "text": interjection_text,
+                    "truncated": interjection_truncated,
+                    "user_turn_id": user_turn_event_id,
+                    "interjection_of": addressee_bot["id"],
+                },
+            )
+
+            # Skip the downstream classifier passes if the interjection
+            # was cancelled mid-stream — we don't want to score a partial
+            # beat the user never got to read in full.
+            if not interjection_truncated:
+                # Re-run the multi-pair state update — the interjector
+                # adding their voice plausibly shifts edges for everyone
+                # in the room. Idempotent enough for v2 (deltas accumulate;
+                # no stale state). Re-read recent so the just-appended
+                # interjection turn is in scope.
+                recent_post_interject = _read_recent_dialogue(
+                    conn, chat_id, limit=10
+                )
+                # Re-fetch prior edges so deltas land on the post-primary
+                # state rather than the pre-turn baseline.
+                _, _, _, prior_edges_post = _gather_state_update_inputs(
+                    conn,
+                    host_bot=host_bot,
+                    guest_bot=guest_bot,
+                    you_entity=you_entity,
+                )
+                state_updates_post = await compute_state_updates_for_present(
+                    client,
+                    classifier_model=settings.classifier_model,
+                    present_ids=present_ids,
+                    present_names=present_names,
+                    personas=personas,
+                    prior_edges=prior_edges_post,
+                    recent_dialogue=recent_post_interject,
+                    timeout_s=settings.classifier_timeout_s,
+                )
+                for src_id, tgt_id, update in state_updates_post:
+                    append_and_apply(
+                        conn,
+                        kind="edge_update",
+                        payload={
+                            "source_id": src_id,
+                            "target_id": tgt_id,
+                            "chat_id": chat_id,
+                            "affinity_delta": update.affinity_delta,
+                            "trust_delta": update.trust_delta,
+                            "knowledge_facts": update.knowledge_facts,
+                            "last_interaction_at": last_at,
+                            "last_interaction_chat_id": chat_id,
+                        },
+                    )
+
+                # Memory write for the interjection beat — a second pair
+                # of memory_written events (host + guest POVs).
+                record_turn_memory_for_present(
+                    conn,
+                    chat_id=chat_id,
+                    host_bot_id=host_bot["id"],
+                    guest_bot_id=guest_bot_id,
+                    narrative_text=interjection_text,
+                    scene_id=scene["id"] if scene else None,
+                    chat_clock_at=chat.get("time"),
+                )
+
+    # 9. Scene-close detection (Plan §7.2, T26). Runs AFTER assistant_turn
+    # and the optional interjection so the bots' responses are part of
+    # the closing scene's final beat — closing before narrative would
+    # force the bot to speak "in no scene", which is awkward. Hard
+    # signals only in Phase 1: container change parsed from prose, or
+    # explicit "fade out" / "we're done here" patterns. On classifier
+    # failure the service returns ``should_close=False`` so the turn
+    # flow keeps moving; the manual close button in the drawer is the
+    # always-available fallback.
     #
     # Skip empty prose — no signal to classify and no point spending a
     # round-trip. Skip when there's no active scene (e.g. after a prior
@@ -393,11 +647,12 @@ async def post_turn(
                     "significance": 0,
                 },
             )
-            # T27: per-POV summary + edge summary update + knowledge
-            # promotion. Runs synchronously after the close so the
-            # next turn (or a subsequent GET /chats/<id>) sees the
-            # rewritten memories and edge summary. Tolerates classifier
-            # failure (returns the empty default and skips the writes).
+            # T27 / T45: per-POV summary + edge summary update + knowledge
+            # promotion for each present witness (host always; guest when
+            # present). Runs synchronously after the close so the next
+            # turn (or a subsequent GET /chats/<id>) sees the rewritten
+            # memories and edge summaries. Tolerates classifier failure
+            # (returns the empty default and skips the writes).
             await apply_scene_close_summary(
                 conn,
                 client,
@@ -408,24 +663,50 @@ async def post_turn(
                 timeout_s=settings.classifier_timeout_s,
             )

-    # 7. Broadcast a JSON completion event (for JS consumers) and an HTML
-    # fragment event (for HTMX SSE swap-into-timeline).
+    # 10. Broadcast a JSON completion event (for JS consumers) and an HTML
+    # fragment event (for HTMX SSE swap-into-timeline). One pair per
+    # written assistant_turn so the timeline ends up with both the
+    # primary and the interjection beat in the right order.
     await publish(
         chat_id,
         {
             "event": "assistant_turn_complete",
-            "speaker_id": host_bot["id"],
-            "text": full_text,
-            "truncated": truncated,
+            "speaker_id": addressee_bot["id"],
+            "text": primary_text,
+            "truncated": primary_truncated,
         },
     )
-    assistant_html = _render_turn_html(
-        host_bot["name"], full_text, role="bot"
+    primary_html = _render_turn_html(
+        addressee_bot["name"], primary_text, role="bot"
     )
     await publish(
-        chat_id, {"event": "turn_html", "data": assistant_html}
+        chat_id, {"event": "turn_html", "data": primary_html}
     )

+    if interjection_text is not None and interjection_speaker_id is not None:
+        # The interjector's display name is whichever bot wasn't the
+        # addressee — pull it from the in-scope variable directly.
+        interject_speaker_name = (
+            host_bot["name"]
+            if interjection_speaker_id == host_bot["id"]
+            else (guest_bot["name"] if guest_bot is not None else "bot")
+        )
+        await publish(
+            chat_id,
+            {
+                "event": "assistant_turn_complete",
+                "speaker_id": interjection_speaker_id,
+                "text": interjection_text,
+                "truncated": interjection_truncated,
+            },
+        )
+        interject_html = _render_turn_html(
+            interject_speaker_name, interjection_text, role="bot"
+        )
+        await publish(
+            chat_id, {"event": "turn_html", "data": interject_html}
+        )
+
     if cancelled:
         # Re-raise after the partial-turn has been recorded.
         raise asyncio.CancelledError

@@ -499,6 +499,8 @@ Written per witness when a scene closes. Different details, different interpreta
|
||||
|
||||
### Phase 2 — multi-entity
|
||||
|
||||
**Status: shipped 2026-04-26** — multi-entity scene support, guest add/remove drawer UX, guest-aware prompt assembly, multi-entity turn flow with interjection classifier, per-POV scene close summaries for every present witness, group_node initialization/update, and bot reset cascade clearing stale `chats.guest_bot_id` references all landed across the wave5 task series (see `CLAUDE.md` § "Phase 2 status" for the deliverable summary and follow-ups).
|
||||
|
||||
- Guest bot in chat (3-entity scene config).
|
||||
- Interjection classifier call.
|
||||
- Witness filtering across multiple owners.
|
||||
|
||||
@@ -0,0 +1,596 @@
|
||||
# Roleplay Engine — Phase 2.5 Cleanup Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task. Use the parallel-dispatch pattern documented under "Parallel-Execution Strategy" for waves that fan out to multiple subagents.
|
||||
|
||||
**Goal:** Burn down the combined Phase 1.5 + Phase 2.5/3 backlog tracked in [`CLAUDE.md`](../../CLAUDE.md) §"Phase 1.5 cleanup backlog" and §"Phase 2.5 / 3 backlog". 15 follow-up items consolidated into 8 tasks (file-disjoint across waves) so several can run in parallel.
|
||||
|
||||
**Architecture:** No new architecture. Every change here is either a refactor (T68 `open_db`), a polish on an existing service/route (most tasks), or a UI affordance for state that already exists (T72 drawer edits, witness-flag editing). No new tables, no new event kinds, no schema migrations.
|
||||
|
||||
**Tech Stack:** Same as Phase 2. No new dependencies.
|
||||
|
||||
**Source-of-truth references:**
|
||||
|
||||
- Backlog list: [`CLAUDE.md`](../../CLAUDE.md) §"Phase 1.5 cleanup backlog" (5 items) + §"Phase 2.5 / 3 backlog" (10 items) = 15 items total.
|
||||
- Conventions: [`CLAUDE.md`](../../CLAUDE.md) §"Behavioral defaults" + §"Phase 2 status".
|
||||
- Phase 2 plan (style, TDD pattern, parallel-dispatch mechanics): [2026-04-26-v2-phase2-implementation.md](2026-04-26-v2-phase2-implementation.md).
|
||||
- Phase 3 plan (in flight on a separate branch): [2026-04-26-v3-phase3-implementation.md](2026-04-26-v3-phase3-implementation.md).
|
||||
|
||||
When a task says "see §X", that's the requirements doc unless stated otherwise.
|
||||
|
||||
---
|
||||
|
||||
## Pre-flight
|
||||
|
||||
**Branch:** create `phase-2.5` from the latest `main` after Phase 2 has merged. If Phase 2 is still in PR review, branch off `phase-2` directly:
|
||||
|
||||
```bash
|
||||
# Option A: after main has phase-2 merged
|
||||
git checkout main && git pull && git checkout -b phase-2.5
|
||||
|
||||
# Option B: continue from phase-2 directly
|
||||
git checkout phase-2 && git pull && git checkout -b phase-2.5
|
||||
```
|
||||
|
||||
**Schema baseline:** Phase 2 leaves the DB at version 8. Phase 2.5 adds **no migrations**. Schema-version assertion in `tests/test_world.py` stays at 8.
|
||||
|
||||
**Relationship to Phase 3:** Phase 3 (`phase-3` branch, plan committed but not yet executed) uses task ids T49–T67. Phase 2.5 uses **T68–T75** to avoid collision regardless of merge order.
|
||||
|
||||
**Pinned non-negotiables (carried forward from Phases 1 + 2):**
|
||||
|
||||
- State changes go through the event log. Use `append_and_apply(conn, kind, payload)` for the live path; `apply_event` only after a fresh `append_event` returning the new id.
|
||||
- Witness filter every memory read at SQL level (hard `WHERE` constraint; never a soft signal).
|
||||
- Edges are directed; `botA → botB` and `botB → botA` are independent records.
|
||||
- Per-POV scene summaries — never write omniscient narration.
|
||||
- TDD: every task starts with a failing test (or, for refactors that preserve behavior, a regression test that pins the existing contract before any change).
|
||||
- One commit per task minimum. Tasks that bundle 3+ small backlog items SHOULD split commits within the task — one commit per backlog item — so review can bisect cleanly.
|
||||
|
||||
**Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion` — run the test command, paste actual output. Don't assume green.
|
||||
|
||||
---
|
||||
|
||||
## Backlog item → task mapping
|
||||
|
||||
15 items consolidated into 8 tasks by **file ownership** (so each wave's tasks stay file-disjoint). Bundled tasks may split commits internally.
|
||||
|
||||
| # | Backlog item | Source | Task |
|
||||
|---|--------------|--------|------|
|
||||
| 1 | `open_db` refactor with `check_same_thread` parameter | Phase 1.5 | **T68** |
|
||||
| 2 | Regenerate broadcasts `turn_html` over SSE | Phase 1.5 | **T73** |
|
||||
| 3 | `bot_reset` purges orphaned "you" activity rows | Phase 1.5 | **T69** |
|
||||
| 4 | Drawer edits for deferred v1 fields (edge_trust, edge_summary, memory pov_summary, knowledge_facts) | Phase 1.5 | **T72** |
|
||||
| 5 | NICE trim order in prompt assembly | Phase 1.5 | **T71** |
|
||||
| 6 | Interjection regenerate | Phase 2.5 | **T73** |
|
||||
| 7 | Classifier-based addressee detection | Phase 2.5 | **T74** |
|
||||
| 8 | LLM-merged group meta-summary | Phase 2.5 | **T70** |
|
||||
| 9 | First-meeting gate (drawer "have they met?" toggle) | Phase 2.5 | **T72** |
|
||||
| 10 | Witness flag editing in drawer | Phase 2.5 | **T72** |
|
||||
| 11 | Significance for interjection memories | Phase 2.5 | **T74** |
|
||||
| 12 | Stale guest reference defensive degrade removal | Phase 2.5 | **T73 + T74** (split by file) |
|
||||
| 13 | Scene close on cancel review | Phase 2.5 | **T74** |
|
||||
| 14 | Dual `ACTIVITIES:` block consolidation | Phase 2.5 | **T71** |
|
||||
| 15 | Witness role hardcode in prompt assembly | Phase 2.5 | **T71** |
|
||||
| — | Docs sweep — remove shipped items from CLAUDE.md | (this plan) | **T75** |
|
||||
|
||||
---
|
||||
|
||||
## Parallel-Execution Strategy
|
||||
|
||||
Same pattern as Phases 2 and 3. Five waves: parallel within each wave (file-disjoint), serial across waves. Cross-wave merges keep `phase-2.5` green between dispatches.
|
||||
|
||||
### How to dispatch a wave in parallel
|
||||
|
||||
Use the **Agent tool with `isolation: "worktree"`** so each subagent gets its own git worktree. (If the controlling session's working directory is **not** the chat repo, create worktrees manually with `git worktree add .worktrees/<wave>-<task> -b <wave>/<task> phase-2.5` from inside the chat repo and pass the worktree path explicitly into each subagent prompt — that is the pattern Phase 2 used.)
|
||||
|
||||
In a single message, dispatch all tasks in the wave:
|
||||
|
||||
```
|
||||
Agent({
|
||||
description: "Wave 1 — T68 open_db refactor",
|
||||
subagent_type: "general-purpose",
|
||||
isolation: "worktree",
|
||||
prompt: "<full task text from below>",
|
||||
})
|
||||
Agent({ ...T69... })
|
||||
Agent({ ...T70... })
|
||||
```
|
||||
|
||||
### After a wave completes
|
||||
|
||||
1. Each subagent returns its worktree path and commit SHA(s).
|
||||
2. **Run a spec + code-quality reviewer subagent on each completed task.** Combined review is acceptable for purely mechanical refactors (T68, T69); separate spec + quality reviewers for tasks that bundle multiple backlog items (T71, T72, T74).
|
||||
3. **Merge the wave into `phase-2.5`** in any order (file-disjointness guarantees no conflict). Use `--no-ff`:
|
||||
|
||||
```bash
|
||||
git checkout phase-2.5
|
||||
for branch in <wave-branches>; do
|
||||
git merge --no-ff "$branch" -m "merge: <task description>"
|
||||
done
|
||||
```
|
||||
|
||||
4. **Run the full test suite** on the merged `phase-2.5`. If it's red, the wave's mutual-independence assumption was violated — bisect the offending pair, fix, re-merge.
|
||||
5. **Push `phase-2.5`** to gitea so the work is durable before the next wave starts.
|
||||
6. Optionally clean up worktrees: `git worktree remove .worktrees/<branch>` and `git branch -D <branch>`.
|
||||
|
||||
### Conflict prevention checklist (apply before dispatch)
|
||||
|
||||
For each parallel wave, verify the **Files** sections of all tasks have **no overlapping paths**. The waves below are designed to satisfy this; if you decide to add or merge tasks, re-check.
|
||||
|
||||
The hot files in this plan are: `chat/web/turns.py`, `chat/services/regenerate.py`, `chat/web/drawer.py`, `chat/templates/_drawer.html`, `chat/services/prompt.py`. Each is owned by exactly one task in this plan.
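The disjointness check above can be scripted before each dispatch. This is a minimal sketch, assuming each task's hot files have been transcribed by hand from the wave table in this plan. `find_overlaps`, `WAVE_1`, and `WAVE_4` are hypothetical names, not existing repo code, and test files listed under each task's **Files** section would need to be added to the sets for a complete check.

```python
from itertools import combinations

# Hot files per task, transcribed from the wave table in this plan.
WAVE_1 = {
    "T68": {"chat/db/connection.py", "chat/web/bots.py"},
    "T69": {"chat/state/entities.py"},
    "T70": {"chat/services/scene_summarize.py"},
}
WAVE_4 = {
    "T73": {"chat/services/regenerate.py"},
    "T74": {"chat/web/turns.py", "chat/services/addressee.py"},
}


def find_overlaps(wave):
    """Return (task_a, task_b, shared_paths) for every task pair sharing a file."""
    overlaps = []
    for (task_a, files_a), (task_b, files_b) in combinations(sorted(wave.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps.append((task_a, task_b, sorted(shared)))
    return overlaps


if __name__ == "__main__":
    for name, wave in (("Wave 1", WAVE_1), ("Wave 4", WAVE_4)):
        clashes = find_overlaps(wave)
        print(name, "OK" if not clashes else f"CONFLICT: {clashes}")
```

An empty result for a wave means its tasks can be dispatched in parallel; any tuple in the result names the pair to serialize or re-split.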
|
||||
|
||||
### Failure recovery
|
||||
|
||||
If one subagent fails: cancel it, merge the others' successful work, re-dispatch the failed task as a single follow-up. Don't block the wave.
|
||||
|
||||
If a failure exposes a bad assumption shared by multiple tasks (e.g., a refactor that requires a wider blast radius than the plan accounted for), pause the wave and revisit.
|
||||
|
||||
### Why each wave is parallel-safe
|
||||
|
||||
| Wave | Tasks | Hot files touched | Disjoint? |
|
||||
|------|-------|-------------------|-----------|
|
||||
| 1 | T68, T69, T70 | `chat/db/connection.py` + `chat/web/bots.py` (T68); `chat/state/entities.py` (T69); `chat/services/scene_summarize.py` (T70) | ✅ |
|
||||
| 2 | T71 | `chat/services/prompt.py` | (single task) |
|
||||
| 3 | T72 | `chat/web/drawer.py` + `chat/templates/_drawer.html` | (single task) |
|
||||
| 4 | T73, T74 | `chat/services/regenerate.py` (T73); `chat/web/turns.py` + new `chat/services/addressee.py` (T74) | ✅ |
|
||||
| 5 | T75 | `CLAUDE.md` | (single task) |
|
||||
|
||||
---
|
||||
|
||||
## Task overview
|
||||
|
||||
```
|
||||
Wave 1 ─┬─ T68: open_db refactor with check_same_thread param
|
||||
├─ T69: bot_reset purges orphaned "you" activity rows
|
||||
└─ T70: LLM-merged group meta-summary
|
||||
|
||||
Wave 2 ─── T71: prompt.py polish (NICE trim order + dual ACTIVITIES + witness role parametric)
|
||||
|
||||
Wave 3 ─── T72: drawer.py polish (deferred v1 edits + first-meeting gate + witness flag editing)
|
||||
|
||||
Wave 4 ─┬─ T73: regenerate.py polish (turn_html SSE + interjection regenerate + stale-guest cleanup)
|
||||
└─ T74: turn-flow polish + addressee service (classifier addressee detection +
|
||||
significance for interjection + scene close on cancel + stale-guest cleanup)
|
||||
|
||||
Wave 5 ─── T75: docs sweep — remove shipped items from CLAUDE.md backlogs
|
||||
```
|
||||
|
||||
Critical path: 5 sequential merge points. Total tasks: 8. Wall-clock parallelism advantage: the tasks inside Wave 1 (3 tasks) and inside Wave 4 (2 tasks) dispatch concurrently; Waves 2, 3, and 5 are single-task by file constraint.
|
||||
|
||||
---
|
||||
|
||||
## Wave 1 — Independent small fixes (parallel)
|
||||
|
||||
Three tasks, fully file-disjoint.
|
||||
|
||||
### Task 68: `open_db` refactor with `check_same_thread` parameter
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/db/connection.py` (extend `open_db(path, *, check_same_thread=True)` so callers can opt out of SQLite's same-thread check, which otherwise restricts a connection to the thread that created it)
|
||||
- Modify: `chat/web/bots.py` (use the new parameter in `get_conn` rather than hand-rolling its own context-manager body)
|
||||
- Modify: tests in `tests/test_connection.py` (or wherever `open_db` is tested; add 1 test for the new parameter)
|
||||
|
||||
**Spec:** Currently `chat/web/bots.py:get_conn()` duplicates the body of `open_db` so it can pass `check_same_thread=False`. Extend `open_db` to accept this as a kwarg (default True, preserving existing behavior). Then have `get_conn` call `open_db(...)` directly. The PRAGMA setup (WAL, foreign_keys, synchronous, etc.) stays in one place.
|
||||
|
||||
**Step 1: failing test** — add a regression test that pins the existing contract:
|
||||
|
||||
```python
|
||||
def test_open_db_default_uses_check_same_thread_true(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
# Default is check_same_thread=True; calling from another thread should fail.
|
||||
...
|
||||
|
||||
def test_open_db_can_disable_check_same_thread(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db, check_same_thread=False) as conn:
|
||||
# Same conn callable from another thread now.
|
||||
...
|
||||
```
|
||||
|
||||
**Step 3: implementation** — add `check_same_thread: bool = True` to `open_db`. Pass through to `sqlite3.connect`. Then in `chat/web/bots.py`, replace the duplicated context-manager body with `open_db(path, check_same_thread=False)`.
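To make the intended behavior concrete, here is a minimal sketch of the pass-through. It is an assumption-laden stand-in, not the real `chat/db/connection.py`: the PRAGMA list is abbreviated to two of the settings named in the spec, and `query_from_other_thread` is a hypothetical helper that only exists to show what the two tests above could assert.

```python
import sqlite3
import threading
from contextlib import contextmanager


@contextmanager
def open_db(path, *, check_same_thread=True):
    # The kwarg is forwarded unchanged to sqlite3.connect; the default of
    # True preserves the behavior existing callers rely on.
    conn = sqlite3.connect(str(path), check_same_thread=check_same_thread)
    try:
        # Abbreviated PRAGMA setup (assumption: the real list is longer).
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("PRAGMA foreign_keys=ON")
        yield conn
    finally:
        conn.close()


def query_from_other_thread(conn):
    """Run SELECT 1 on `conn` from a worker thread and report what happened."""
    result = {}

    def worker():
        try:
            result["value"] = conn.execute("SELECT 1").fetchone()[0]
        except sqlite3.ProgrammingError as exc:
            # Raised when check_same_thread=True and the thread differs.
            result["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result
```

With the default, the worker thread records a `sqlite3.ProgrammingError`; with `check_same_thread=False`, the same query succeeds.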
|
||||
|
||||
**Step 5: commit** — `refactor: open_db with check_same_thread parameter (T68)`.
|
||||
|
||||
**Notes for implementer:**
|
||||
|
||||
- This is a refactor — the full test suite must be GREEN before AND after. Run before to baseline, run after to confirm no regressions. Pay special attention to `tests/test_bots.py` if it exercises the `get_conn` path.
|
||||
- Do NOT change the default. Existing callers don't pass `check_same_thread` and must continue to get `True`.
|
||||
|
||||
---
|
||||
|
||||
### Task 69: `bot_reset` purges orphaned "you" activity rows
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/state/entities.py` (extend `_apply_bot_reset` with one more `DELETE` clause for "you" activity rows tied to chats that this bot hosted)
|
||||
- Modify: tests in `tests/test_reset.py` (add 2 tests)
|
||||
|
||||
**Spec:** Currently `_apply_bot_reset` purges the bot's chats, the bot's own activity rows, the bot's memories, and edges involving the bot. Phase 2 T47 added a `chats.guest_bot_id` cascade. Still missing: when bot A's chats are deleted, "you"-owned activity rows that were associated with those chats' containers are not cleaned up. They linger as orphaned activity entries pointing at deleted containers.
|
||||
|
||||
The fix per the existing CLAUDE.md note:
|
||||
|
||||
```sql
|
||||
DELETE FROM activity
|
||||
WHERE entity_id = 'you'
|
||||
AND container_id IN (SELECT id FROM containers WHERE chat_id IN (
|
||||
SELECT id FROM chats WHERE host_bot_id = ?
|
||||
));
|
||||
```
|
||||
|
||||
Order matters: this `DELETE` must run BEFORE the `DELETE FROM containers` and `DELETE FROM chats` clauses — otherwise the subqueries return no rows. Verify ordering in the existing handler before placing the new line.
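The ordering constraint can be illustrated on a toy database. This sketch assumes a three-table schema reduced to the columns the SQL above references; the real tables have more columns, and `purge_bot_chats` and `seed` are hypothetical names rather than the actual `_apply_bot_reset` handler.

```python
import sqlite3

# Assumed minimal schema: only the columns the DELETE statements touch.
SCHEMA = """
CREATE TABLE chats (id TEXT PRIMARY KEY, host_bot_id TEXT);
CREATE TABLE containers (id TEXT PRIMARY KEY, chat_id TEXT);
CREATE TABLE activity (entity_id TEXT, container_id TEXT);
"""


def seed(conn):
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO chats VALUES (?, ?)",
                     [("chat_a", "bot_a"), ("chat_b", "bot_b")])
    conn.executemany("INSERT INTO containers VALUES (?, ?)",
                     [("room_a", "chat_a"), ("room_b", "chat_b")])
    conn.executemany("INSERT INTO activity VALUES (?, ?)",
                     [("you", "room_a"), ("you", "room_b")])


def purge_bot_chats(conn, bot_id):
    # 1. "you" activity first, while the chats/containers rows that the
    #    subqueries depend on still exist.
    conn.execute(
        """
        DELETE FROM activity
        WHERE entity_id = 'you'
          AND container_id IN (
            SELECT id FROM containers WHERE chat_id IN (
              SELECT id FROM chats WHERE host_bot_id = ?
            )
          )
        """,
        (bot_id,),
    )
    # 2. Containers next, then 3. the chats themselves.
    conn.execute(
        "DELETE FROM containers WHERE chat_id IN "
        "(SELECT id FROM chats WHERE host_bot_id = ?)",
        (bot_id,),
    )
    conn.execute("DELETE FROM chats WHERE host_bot_id = ?", (bot_id,))
```

Swapping step 1 after step 3 would leave the `room_a` activity row behind, because the inner `SELECT id FROM chats` would already be empty for that bot.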
|
||||
|
||||
**Tests:** 2 added.
|
||||
|
||||
1. `test_reset_purges_orphaned_you_activity_rows`: seed bot_a, chat_bot_a, a container in chat_bot_a, and a "you" activity row pointing at that container. Reset bot_a. Assert `SELECT COUNT(*) FROM activity WHERE entity_id = 'you'` is 0.
|
||||
2. `test_reset_does_not_purge_you_activity_in_other_chats`: seed bot_a + bot_b, both with chats and "you" activity in each. Reset bot_a. Assert "you" activity in chat_bot_a is gone, but "you" activity in chat_bot_b is preserved.
|
||||
|
||||
**Commit:** `fix: bot_reset purges orphaned 'you' activity rows (T69)`.
|
||||
|
||||
---
|
||||
|
||||
### Task 70: LLM-merged group meta-summary
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/services/scene_summarize.py` (replace the naive `f"{host_name}: {host_summary}\n\n{guest_name}: {guest_summary}"` with an LLM-merged group view via a new classifier wrapper)
|
||||
- Modify: tests in `tests/test_per_pov_summary.py` (replace the regression test for naive concat with one that asserts the merged text uses the classifier output; keep the existing per-POV memory tests intact)
|
||||
|
||||
**Spec:** Phase 2 T45 wrote a stub for `group_node.summary` that just concatenated the two per-POV summaries. Replace it with a small classifier call that produces a coherent group-level summary from both POVs.
|
||||
|
||||
Add a new helper at the bottom of `scene_summarize.py`:
|
||||
|
||||
```python
|
||||
class GroupMetaSummary(BaseModel):
|
||||
summary: str = ""
|
||||
dynamic: str = ""
|
||||
|
||||
async def merge_group_summary(
|
||||
client: LLMClient,
|
||||
*,
|
||||
classifier_model: str,
|
||||
host_name: str,
|
||||
host_pov_summary: str,
|
||||
guest_name: str,
|
||||
guest_pov_summary: str,
|
||||
timeout_s: float = 30.0,
|
||||
) -> GroupMetaSummary:
|
||||
"""Merge two per-POV scene summaries into a coherent group-level
|
||||
summary + group-dynamic note. Falls back to the naive concat on
|
||||
classifier failure."""
|
||||
```
|
||||
|
||||
System prompt: "Given two per-POV scene summaries from a 3-entity scene (you + host + guest), produce a coherent group-level summary capturing the shared events as both witnesses experienced them, plus a brief 'dynamic' note describing the trio's group dynamic during the scene." Output strict JSON matching schema. Default = `GroupMetaSummary(summary=f"{host_name}: {host_pov_summary}\n\n{guest_name}: {guest_pov_summary}", dynamic="")` (the existing naive concat preserved as fallback so a classifier failure doesn't degrade behavior).
|
||||
|
||||
In `apply_scene_close_summary`, replace the naive concat call site (the existing `summary=` kwarg of the `group_node_updated` event) with `await merge_group_summary(...)` and use its `.summary` and `.dynamic` outputs.
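The fallback contract described above can be sketched without the real LLM client. Everything below is illustrative: it uses a stdlib `dataclass` in place of the pydantic model, `call_classifier` is a hypothetical zero-argument async callable that returns raw JSON text, and the retry count of 3 is taken from the failure test described in this task rather than from known code.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class GroupMetaSummary:
    summary: str = ""
    dynamic: str = ""


def naive_concat(host_name, host_pov_summary, guest_name, guest_pov_summary):
    # The Phase 2 stub format, kept as the fallback value.
    return f"{host_name}: {host_pov_summary}\n\n{guest_name}: {guest_pov_summary}"


async def merge_group_summary_sketch(
    call_classifier,
    *,
    host_name,
    host_pov_summary,
    guest_name,
    guest_pov_summary,
    timeout_s=30.0,
    retries=3,
):
    fallback = GroupMetaSummary(
        summary=naive_concat(host_name, host_pov_summary, guest_name, guest_pov_summary),
        dynamic="",
    )
    for _ in range(retries):
        try:
            raw = await asyncio.wait_for(call_classifier(), timeout=timeout_s)
            data = json.loads(raw)
            return GroupMetaSummary(
                summary=str(data["summary"]),
                dynamic=str(data.get("dynamic", "")),
            )
        except (asyncio.TimeoutError, json.JSONDecodeError, KeyError, TypeError, AttributeError):
            # Timeout, malformed JSON, or wrong shape: try again.
            continue
    return fallback
```

Returning the naive concat on exhaustion means a classifier outage produces exactly the Phase 2 output, so the change cannot make the group node worse than it is today.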
|
||||
|
||||
**Tests:** 3 in `tests/test_per_pov_summary.py`.
|
||||
|
||||
1. `test_group_summary_merges_per_pov_via_classifier_when_guest_present`: mock the classifier with `GroupMetaSummary(summary="merged summary", dynamic="warm rapport")`. Close a scene with guest. Assert `get_group_node(...).summary == "merged summary"` and `.dynamic == "warm rapport"`.
|
||||
2. `test_group_summary_falls_back_to_naive_concat_on_classifier_failure`: mock classifier with bad JSON across all 3 retries. Close scene. Assert `summary` matches the old naive concat format. `dynamic` is empty.
|
||||
3. `test_group_summary_skipped_when_no_guest`: no-guest path unchanged — `group_node_updated` not emitted at all (existing behavior).
|
||||
|
||||
**Commit:** `feat: LLM-merged group meta-summary (T70)`.
|
||||
|
||||
---
|
||||
|
||||
## Wave 2 — `prompt.py` polish (single task)
|
||||
|
||||
T71 bundles three prompt-assembly cleanups. All touch `chat/services/prompt.py`. Single task because the file is hot; the implementer SHOULD split into 3 commits within the task for clean review bisection.
|
||||
|
||||
### Task 71: prompt.py polish (NICE trim order + dual ACTIVITIES + witness role parametric)
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/services/prompt.py`
|
||||
- Modify: `tests/test_prompt.py` (add tests; preserve existing 10 tests)
|
||||
|
||||
**Spec:** Three independent cleanups bundled because the file is hot.
|
||||
|
||||
#### 71.1 — Witness role parametric (Phase 2.5 backlog #15)
|
||||
|
||||
`chat/services/prompt.py:436` (or wherever the call site is; verify) calls `search_memories(conn, speaker_bot_id, "host", query, k=4)` with `witness_role="host"` hardcoded. This is wrong when the speaker is the guest: a guest speaker must query with `witness_role="guest"`, which applies a different SQL filter.
|
||||
|
||||
Fix: derive the role from chat membership.
|
||||
|
||||
```python
|
||||
def _witness_role_for(speaker_bot_id: str, host_bot_id: str) -> str:
|
||||
return "host" if speaker_bot_id == host_bot_id else "guest"
|
||||
```
|
||||
|
||||
Apply at the call site. The test contract is already pinned in `tests/test_witness_filter_multi.py` from Phase 2 T46 — those tests will continue to pass; this change unblocks guest-as-speaker in production.
|
||||
|
||||
**Commit:** `fix: witness role parametric in prompt assembly (T71.1)`.
|
||||
|
||||
#### 71.2 — Dual `ACTIVITIES:` block consolidation (Phase 2.5 backlog #14)
|
||||
|
||||
T43 (Phase 2) added a second `ACTIVITIES:` block to render guest activity separately from you+speaker activity (so the trim ladder could drop guest activity first under tight budget). Two consecutive `ACTIVITIES:` headers can read as a duplicate-section bug to the LLM.
|
||||
|
||||
Refactor to a single `ACTIVITIES:` block with three bullets (you, speaker, guest), where each bullet is independently trimmable: under tight budget, drop the guest bullet first, then the you bullet, keeping the speaker bullet (the speaker's own current activity is MUST-tier).
|
||||
|
||||
Implementation: the existing trim machinery uses block-level granularity. Extend it to bullet-level granularity for this block (one new helper or one new tier name like `MUST-bullet` / `SHOULD-bullet` / `NICE-bullet` — pick whichever is least disruptive).
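One possible shape for the bullet-level trim, as a sketch only. It assumes a character budget (the real trim machinery may well count tokens), and `render_activities`, `RENDER_ORDER`, and `DROP_ORDER` are hypothetical names that do not exist in `chat/services/prompt.py`.

```python
HEADER = "ACTIVITIES:"
RENDER_ORDER = ("you", "speaker", "guest")
# Drop order under pressure: guest first, then you. The speaker bullet is
# MUST-tier and is never dropped, even if the block still exceeds the budget.
DROP_ORDER = ("guest", "you")


def _render(activities):
    lines = [HEADER]
    lines += [f"- {who}: {activities[who]}" for who in RENDER_ORDER if who in activities]
    return "\n".join(lines)


def render_activities(activities, budget_chars):
    """Render one ACTIVITIES block, dropping bullets until it fits the budget."""
    kept = dict(activities)
    for who in DROP_ORDER:
        if len(_render(kept)) <= budget_chars:
            break
        kept.pop(who, None)
    return _render(kept)
```

Keeping the drop order in a single tuple makes the policy easy to read in review and easy to pin in `test_tight_budget_drops_guest_activity_bullet_first`.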
|
||||
|
||||
**Commit:** `refactor: single ACTIVITIES: block with bullet-level trim (T71.2)`.
|
||||
|
||||
#### 71.3 — NICE trim order revisit (Phase 1.5 backlog #5)
|
||||
|
||||
Per T18 review: the NICE trim drops previous-scene first instead of last (the spec lists previous-scene last). The current behavior follows a greedy-cuts heuristic rather than the spec's order.
|
||||
|
||||
Revisit: review the trim ordering carefully. If real play surfaces a regression (the previous-scene block is genuinely important to bot continuity), reverse the NICE order so previous-scene drops last. If not, document the intentional deviation in a code comment and call it done.
|
||||
|
||||
**This is a judgment call.** Default action: leave the order as-is and add a comment explaining why (the heuristic is "drop the cheapest-impact thing first; greedy lookahead is more expensive than the marginal narrative loss"). If review feedback during execution disagrees, reverse the order.
|
||||
|
||||
**Commit:** `chore: document NICE trim order rationale (T71.3)` OR `fix: NICE trim order drops previous-scene last (T71.3)`.
|
||||
|
||||
#### Tests for T71
|
||||
|
||||
Add to `tests/test_prompt.py`:
|
||||
|
||||
1. `test_speaker_is_guest_uses_guest_witness_role`: speaker=guest_id. Patch `search_memories` to record its `witness_role` argument. Assert called with `"guest"`, not `"host"`.
|
||||
2. `test_single_activities_block_with_three_bullets_when_3_entities`: 3-entity prompt. Assert exactly one `ACTIVITIES:` header present. Assert bullets for you, speaker, guest.
|
||||
3. `test_tight_budget_drops_guest_activity_bullet_first`: 3-entity prompt with budget tight enough to force trim. Assert speaker activity bullet survives, guest activity bullet is dropped.
|
||||
4. (Optional, depends on 71.3 outcome) `test_nice_trim_order_drops_previous_scene_last`: only add if you choose to fix the order.
|
||||
|
||||
**Verification gates:**
|
||||
|
||||
- `pytest tests/test_prompt.py -v` — 10 existing + 3-4 new all pass.
|
||||
- `pytest tests/test_witness_filter_multi.py -v` — Phase 2 T46 tests still pass (proves the witness-role fix didn't break anything).
|
||||
- Full suite green.
|
||||
|
||||
---
|
||||
|
||||
## Wave 3 — `drawer.py` polish (single task)
|
||||
|
||||
T72 bundles three drawer affordances. All touch `chat/web/drawer.py` and `chat/templates/_drawer.html`. Single task by file constraint; implementer SHOULD split into 3 commits.
|
||||
|
||||
### Task 72: drawer polish (deferred v1 edits + first-meeting gate + witness flag editing)
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/web/drawer.py` (add 4 new POST routes for the deferred v1 edits + 1 GET extension for first-meeting gate + 1 POST for witness flag editing)
|
||||
- Modify: `chat/templates/_drawer.html` (forms for each new edit affordance)
|
||||
- Create: `tests/test_drawer_edits_extended.py` (new tests for the new routes; existing `tests/test_drawer_edits.py` and `tests/test_drawer_guest.py` stay unchanged)
|
||||
|
||||
**Spec:** Three independent backlog items.
|
||||
|
||||
#### 72.1 — Deferred v1 drawer edits (Phase 1.5 backlog #4)
|
||||
|
||||
The `manual_edit` projector already supports `target_kind` values for `edge_trust`, `edge_summary`, `memory_pov_summary`. These work end-to-end at the state layer; only the drawer routes are missing.
|
||||
|
||||
Add 4 new POST routes:
|
||||
|
||||
1. `POST /chats/{chat_id}/drawer/edge/trust` — form `{source_id, target_id, new_value}` (0–100 int). Appends `manual_edit` with `target_kind="edge_trust"`, `prior_value=current_trust`, `new_value=...`. Validate range; 400 on out-of-bounds.
|
||||
2. `POST /chats/{chat_id}/drawer/edge/summary` — form `{source_id, target_id, new_summary}` (text). Appends `manual_edit` with `target_kind="edge_summary"`. No validation beyond non-empty + reasonable length cap (e.g., 2000 chars).
|
||||
3. `POST /chats/{chat_id}/drawer/memory/pov-summary` — form `{memory_id, new_summary}`. Appends `manual_edit` with `target_kind="memory_pov_summary"`. 404 if memory not in this chat or not owned by a present bot.
|
||||
4. `POST /chats/{chat_id}/drawer/edge/knowledge-facts` — form `{source_id, target_id, action: 'add'|'remove', fact: str}`. Knowledge_facts needs a NEW dispatch branch in the `manual_edit` projector — add it as part of this task: `target_kind="edge_knowledge_fact"` with payload action + fact.
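The range validation for the trust route could be factored into a small pure function so it is testable without the web layer. This is a sketch under assumptions: `build_edge_trust_edit` is a hypothetical helper, the payload keys beyond `target_kind`, `prior_value`, and `new_value` are guesses based on the form fields, and the route handler is assumed to translate `ValueError` into a 400 response.

```python
def build_edge_trust_edit(source_id, target_id, raw_value, current_trust):
    """Validate a submitted trust value and build a manual_edit payload.

    Raises ValueError on non-integer or out-of-range input; the route is
    assumed to map that to HTTP 400.
    """
    try:
        new_value = int(raw_value)
    except (TypeError, ValueError):
        raise ValueError("new_value must be an integer") from None
    if not 0 <= new_value <= 100:
        raise ValueError("new_value must be between 0 and 100")
    return {
        "target_kind": "edge_trust",
        "source_id": source_id,   # assumed key name
        "target_id": target_id,   # assumed key name
        "prior_value": current_trust,
        "new_value": new_value,
    }
```

The returned dict would then be handed to `append_and_apply(conn, "manual_edit", payload)` so the change goes through the event log like every other state change.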
|
||||
|
||||
The existing drawer template has read-only renders for these fields. Replace with editable forms (textarea + slider + button).
|
||||
|
||||
Tests in `tests/test_drawer_edits_extended.py`:
|
||||
|
||||
- One test per route (4 tests minimum) asserting: the manual_edit event lands; the projected state changes; the response contains the updated drawer partial.
|
||||
|
||||
**Commit:** `feat: drawer edits for edge_trust / edge_summary / memory_pov_summary / knowledge_facts (T72.1)`.
|
||||
|
||||
#### 72.2 — First-meeting gate (Phase 2.5 backlog #9)
|
||||
|
||||
The "Add guest" form's `relationship_prose` textarea currently fires every time a guest is added. Phase 2 T42's notes call for a narrower rule: "fire it every time a `(host, guest)` pair has no existing `host → guest` edge."
|
||||
|
||||
Implement the gate: when the user opens the Add-guest form, check whether `get_edge(conn, host_bot_id, guest_bot_id)` already exists. If yes:
|
||||
|
||||
- Render the textarea disabled with the message "they already know each other (edge exists from a prior chat)" + a small "re-seed anyway" toggle that re-enables the textarea.
|
||||
- If the user submits without toggling, skip the relationship-seed call (existing edge content stays).
|
||||
- If the user toggles re-seed and submits prose, the existing flow runs — `seed_inter_bot_edges` produces deltas, two `edge_update` events fire on top of the existing edge content.
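The three cases above reduce to a small truth table. A sketch, assuming the edge lookup has already been done by the caller; `AddGuestGate` and `add_guest_gate` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddGuestGate:
    textarea_disabled: bool  # render the prose textarea with `disabled`
    run_seed: bool           # call the relationship-seed classifier on submit


def add_guest_gate(edge_exists: bool, reseed_toggled: bool) -> AddGuestGate:
    # The form is locked only when an edge exists and the user has not
    # asked to re-seed; in every other case the existing flow runs.
    locked = edge_exists and not reseed_toggled
    return AddGuestGate(textarea_disabled=locked, run_seed=not locked)
```

The GET handler would use `textarea_disabled` when rendering, and the POST handler would consult `run_seed` before invoking the seed call.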
|
||||
|
||||
Tests:
|
||||
|
||||
1. `test_add_guest_form_disables_prose_when_edge_exists`: pre-seed a host→guest edge from a prior chat; render the form; assert the textarea has `disabled` attribute AND the "they already know each other" message is in the body.
|
||||
2. `test_add_guest_with_existing_edge_skips_seed_call`: pre-seed edge; submit form without toggling re-seed; assert classifier mock was NOT called (count check on canned-response queue).
|
||||
|
||||
**Commit:** `feat: first-meeting gate on drawer Add-guest form (T72.2)`.
|
||||
|
||||
#### 72.3 — Witness flag editing (Phase 2.5 backlog #10)
|
||||
|
||||
Memories show witness flags `[you, host, guest]` read-only in the drawer. Add an inline-edit affordance: each flag becomes a checkbox; toggling submits a `manual_edit` event with `target_kind="memory_witness"`, payload `{memory_id, flag: 'you'|'host'|'guest', new_value: bool}`.
|
||||
|
||||
The `manual_edit` projector needs a new dispatch branch for `memory_witness` — same as the knowledge_facts branch in 72.1; do them together if cleaner.
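A sketch of what the new `memory_witness` branch might do at the SQL level. It assumes the `memories` table has `witness_you`, `witness_host`, and `witness_guest` integer columns (only `witness_guest` is named in this plan; the other two are inferred), and `apply_memory_witness_edit` is a hypothetical function name.

```python
import sqlite3

# Whitelist mapping from payload flag to column name, so the flag value is
# never interpolated into SQL directly. Column names are assumptions.
WITNESS_COLUMNS = {
    "you": "witness_you",
    "host": "witness_host",
    "guest": "witness_guest",
}


def apply_memory_witness_edit(conn, payload):
    column = WITNESS_COLUMNS.get(payload["flag"])
    if column is None:
        raise ValueError(f"unknown witness flag: {payload['flag']!r}")
    cur = conn.execute(
        f"UPDATE memories SET {column} = ? WHERE id = ?",
        (1 if payload["new_value"] else 0, payload["memory_id"]),
    )
    if cur.rowcount == 0:
        raise LookupError(f"no memory with id {payload['memory_id']!r}")
```

Because witness flags feed the hard SQL-level witness filter, the whitelist matters: an unrecognized flag should fail loudly rather than silently update nothing.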
|
||||
|
||||
Tests: 2.
|
||||
|
||||
1. `test_witness_flag_toggle_updates_memory_row`: seed memory with witness `[1, 1, 0]`. POST toggle on `guest` flag → 1. Project. Assert `memories.witness_guest = 1`.
|
||||
2. `test_witness_flag_toggle_emits_manual_edit_event`: same setup; assert the manual_edit event has the right `target_kind` and `prior_value`/`new_value`.
|
||||
|
||||
**Commit:** `feat: drawer witness flag inline-edit (T72.3)`.
|
||||
|
||||
---
|
||||
|
||||
## Wave 4 — Turn-flow polish (parallel)
|
||||
|
||||
Two tasks, file-disjoint. T73 owns `chat/services/regenerate.py`; T74 owns `chat/web/turns.py` + adds a new addressee-detection service.
|
||||
|
||||
Each task bundles multiple backlog items. Implementer should split commits within each task.
|
||||
|
||||
### Task 73: `regenerate.py` polish
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/services/regenerate.py`
|
||||
- Modify: `tests/test_regenerate.py` (add tests; existing tests preserved)
|
||||
|
||||
**Spec:** Three regenerate-related backlog items.
|
||||
|
||||
#### 73.1 — Regenerate broadcasts `turn_html` over SSE (Phase 1.5 backlog #2)
|
||||
|
||||
After the new `assistant_turn` lands, broadcast a `turn_html` event over the chat's pub/sub channel, mirroring the broadcast logic in `chat/web/turns.py:post_turn`. The existing `post_turn` does this via `publish(chat_id, {"event": "turn_html", "data": <rendered html>})`, preceded by an `assistant_turn_complete` JSON event (verify against the current source). Use the same render path so connected tabs swap the regenerated turn live, no refresh required.
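A sketch of the broadcast pair as a reusable helper, with payload shapes copied from the `publish` calls visible in the `post_turn` diff. `broadcast_regenerated_turn` is a hypothetical name, and `publish` and `render_turn_html` are passed in as parameters here only so the example runs standalone; in the repo they would be the existing `publish` and `_render_turn_html`.

```python
import asyncio


async def broadcast_regenerated_turn(
    publish,
    chat_id,
    *,
    speaker_id,
    speaker_name,
    text,
    truncated,
    render_turn_html,
):
    # JSON completion event for JS consumers.
    await publish(
        chat_id,
        {
            "event": "assistant_turn_complete",
            "speaker_id": speaker_id,
            "text": text,
            "truncated": truncated,
        },
    )
    # HTML fragment event for the HTMX SSE swap into the timeline.
    html = render_turn_html(speaker_name, text, role="bot")
    await publish(chat_id, {"event": "turn_html", "data": html})


if __name__ == "__main__":
    calls = []

    async def fake_publish(chat_id, payload):
        calls.append((chat_id, payload))

    def fake_render(name, text, role):
        return f"<div class='{role}'>{name}: {text}</div>"

    asyncio.run(
        broadcast_regenerated_turn(
            fake_publish, "chat-1", speaker_id="bot-1", speaker_name="Ann",
            text="hello", truncated=False, render_turn_html=fake_render,
        )
    )
    print([payload["event"] for _, payload in calls])
```

Sharing one helper between `post_turn` and regenerate would keep the two event payloads from drifting apart, though T73 and T74 own different files, so extracting it is a follow-up decision rather than part of this task.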
|
||||
|
||||
Test: `test_regenerate_broadcasts_turn_html_over_sse` — mock `publish` and assert it was called with the new `assistant_turn`'s rendered HTML.
|
||||
|
||||
**Commit:** `feat: regenerate broadcasts turn_html over SSE (T73.1)`.
|
||||
|
||||
#### 73.2 — Interjection regenerate (Phase 2.5 backlog #6)
|
||||
|
||||
Phase 2 T44 deferred interjection regenerate: regenerate currently only acts on the addressee turn. Extend so that when a turn group has both a primary `assistant_turn` and an `assistant_turn` flagged as `interjection_of=...`, regenerate redoes BOTH — the primary first, then the interjection (using the same interjection-decision classifier path as `post_turn`). The interjection branch may decide `should_interject=False` on the regenerate, in which case the previous interjection_turn is superseded but no new interjection is appended.
|
||||
|
||||
Test: `test_regenerate_with_interjection_redoes_both_turns` — seed a 3-entity scene with a prior primary + interjection; regenerate; assert two new assistant_turns land (or one new + a supersede-without-replace if the regenerated decision was "no interjection").
|
||||
|
||||
**Commit:** `feat: regenerate covers interjection turns (T73.2)`.
|
||||
|
||||
#### 73.3 — Stale-guest defensive degrade cleanup in regenerate.py (Phase 2.5 backlog #12, partial)
|
||||
|
||||
Phase 2 T44 added a defensive degrade-to-1:1 in `regenerate.py` when `chat.guest_bot_id` points at a deleted bot. T47 fixed the root cause (resets clear the reference). The defensive degrade is now dead code.
|
||||
|
||||
Remove the degrade block; let the function trust that `chat.guest_bot_id` is either valid or NULL. The corresponding existing test for the defensive degrade can be removed (the bot_reset cascade test in `tests/test_reset.py` already covers the root-cause behavior).
|
||||
|
||||
**Commit:** `chore: remove defensive stale-guest degrade in regenerate.py (T73.3)`.
|
||||
|
||||
#### Verification gates
|
||||
|
||||
- `pytest tests/test_regenerate.py -v` — existing + new all pass.
|
||||
- Full suite green.
|
||||
|
||||
---
|
||||
|
||||
### Task 74: turn-flow polish + new addressee-detection service
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/web/turns.py`
|
||||
- Create: `chat/services/addressee.py` (new classifier wrapper for addressee detection)
|
||||
- Create: `tests/test_addressee.py`
|
||||
- Modify: `tests/test_turn_flow.py` (add tests; existing 8 tests preserved)
|
||||
|
||||
**Spec:** Four turn-flow backlog items.
|
||||
|
||||
#### 74.1 — Classifier-based addressee detection (Phase 2.5 backlog #7)
|
||||
|
||||
Phase 2 T44's `_detect_addressee_id` uses a substring whole-word regex match. This is brittle: bot names that are common English words (e.g., a bot named "Sam"), names appearing inside a quoted aside ("Did you see what Sam wrote in his letter?" — addressed to host, not Sam), or fuzzy references all break it.
|
||||
|
||||
Replace with a small classifier call. New module `chat/services/addressee.py`:
|
||||
|
||||
```python
from pydantic import BaseModel

# LLMClient is the project's existing client type (import path not shown in this plan).


class AddresseeDecision(BaseModel):
    addressee_id: str  # bot id, "you", or "host" as fallback
    confidence: str = "medium"  # "high" | "medium" | "low"
    reason: str = ""


async def detect_addressee(
    client: LLMClient,
    *,
    classifier_model: str,
    user_prose: str,
    host_id: str,
    host_name: str,
    guest_id: str | None,
    guest_name: str | None,
    timeout_s: float = 30.0,
) -> AddresseeDecision:
    """Classify which present bot the user is addressing in this turn.
    Defaults to host on failure or low confidence."""
```
|
||||
|
||||
System prompt: "Given a user's turn prose and the names of present bots, decide which bot the user is addressing. If the user is speaking to no specific bot (descriptive narration, action without dialogue), default to the host. Output strict JSON."
|
||||
|
||||
Default fallback (classifier failure) = `AddresseeDecision(addressee_id=host_id, confidence="low", reason="fallback")`.
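One way the fallback behaviour could be wired is sketched below, under loud assumptions: the dataclass stands in for the pydantic model, `client.classify` is a hypothetical method (the real `LLMClient` call is not shown in this plan), and the "no guest" short-circuit returns the host without any LLM call.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class AddresseeDecision:  # dataclass stand-in for the pydantic model
    addressee_id: str
    confidence: str = "medium"
    reason: str = ""


class FakeClient:
    """Test double; returns a canned reply or raises a canned exception."""

    def __init__(self, reply):
        self.reply = reply

    async def classify(self, prose: str) -> str:  # hypothetical method name
        if isinstance(self.reply, Exception):
            raise self.reply
        return self.reply


async def detect_addressee(client, *, user_prose, host_id, guest_id, timeout_s=30.0):
    fallback = AddresseeDecision(addressee_id=host_id, confidence="low", reason="fallback")
    if guest_id is None:
        # Only one bot is present, so no classifier call is needed.
        return AddresseeDecision(addressee_id=host_id, confidence="high", reason="no guest")
    try:
        raw = await asyncio.wait_for(client.classify(user_prose), timeout=timeout_s)
        data = json.loads(raw)
        decision = AddresseeDecision(
            addressee_id=data["addressee_id"],
            confidence=data.get("confidence", "medium"),
            reason=data.get("reason", ""),
        )
    except Exception:
        # Timeout, transport error, refusal, or unparseable JSON: host gets the floor.
        return fallback
    # Unknown ids and low-confidence answers also default to the host.
    if decision.addressee_id not in (host_id, guest_id) or decision.confidence == "low":
        return fallback
    return decision


reply = '{"addressee_id": "guest-1", "confidence": "high"}'
print(asyncio.run(detect_addressee(
    FakeClient(reply), user_prose="Sam, pass the salt.", host_id="host-1", guest_id="guest-1",
)))
```

Catching `Exception` broadly is a deliberate choice in this sketch: any classifier problem should degrade to the host rather than fail the turn.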
|
||||
|
||||
In `chat/web/turns.py`, replace `_detect_addressee_id` with a call to `detect_addressee`. Skip the classifier when no guest is present: with only one bot in the scene the host is the addressee, so no LLM call is needed (this preserves throughput). Keep the substring helper as a low-confidence pre-filter for that no-guest path.
|
||||
|
||||
Tests:
|
||||
|
||||
- `tests/test_addressee.py` (new file): 3 tests — classifier returns guest, classifier returns host, classifier failure falls back to host.
|
||||
- `tests/test_turn_flow.py`: update `test_addressee_detection_routes_to_named_bot` from Phase 2 T44 to use the new classifier path. (Existing test should keep passing with the new mock orchestration; canned-response queue may need an extra slot for the addressee decision.)
|
||||
|
||||
**Commit:** `feat: classifier-based addressee detection (T74.1)`.
|
||||
|
||||
#### 74.2 — Significance for interjection memories (Phase 2.5 backlog #11)
|
||||
|
||||
Phase 2 T44 noted: the interjection branch's `memory_written` event doesn't enqueue a `SignificanceJob`. Wire it in: after the interjection memory write (the `record_turn_memory_for_present` call in the interjection branch), enqueue a `SignificanceJob` with the interjection's host memory id (mirror the primary turn's enqueue at the end of the primary branch).
|
||||
|
||||
If both host and guest memory ids exist for the interjection (as they will when both are present), enqueue once for the host id (the existing pattern for primary turns — the score applies to both POVs since the prose is identical at the time of write).
|
||||
|
||||
Test: `test_interjection_enqueues_significance_job` — mock the worker; trigger an interjection; assert `SignificanceJob` was enqueued with the interjection memory id.
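The "enqueue once, keyed on the host memory id" rule can be sketched as follows. The `SignificanceJob` shape, the list-backed queue, and the `memory_ids` mapping are all hypothetical; the real job class and worker API may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignificanceJob:  # hypothetical shape of the real job class
    memory_id: int


def enqueue_interjection_significance(
    queue: list, memory_ids: dict[str, int], host_id: str
) -> None:
    """Enqueue a single job for the interjection, using the host's memory id.

    `memory_ids` maps witness id -> memory id written for the interjection.
    Guest ids are intentionally skipped: one score covers both POVs.
    """
    host_memory_id = memory_ids.get(host_id)
    if host_memory_id is not None:
        queue.append(SignificanceJob(memory_id=host_memory_id))
```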
|
||||
|
||||
**Commit:** `fix: enqueue significance for interjection memories (T74.2)`.
|
||||
|
||||
#### 74.3 — Scene close on cancel review (Phase 2.5 backlog #13)
|
||||
|
||||
Phase 2 T44 review noted: when a primary turn is cancelled mid-stream, scene close still runs. Behavior may be intentional (close detection looks at user prose, not bot output) or wrong (a cancelled turn is incomplete; closing the scene on it is premature).
|
||||
|
||||
**Decision for this task:** review the call path. If the close detection truly only consults user prose AND the user prose is fully present at the moment of cancel (it is — user prose is appended before the stream starts), the existing behavior is correct: a cancelled turn doesn't invalidate the user's intent to close the scene. Document this in a code comment near the close-detection branch.
|
||||
|
||||
If a play-test surfaces a regression (e.g., a user cancels because the bot misread their close intent), revisit. Default: document and close as a no-op.
|
||||
|
||||
Test: `test_cancelled_turn_still_closes_scene_when_user_prose_signals_close` — pin the existing behavior so a future refactor doesn't quietly change it.
|
||||
|
||||
**Commit:** `chore: pin scene-close-on-cancel behavior + comment rationale (T74.3)`.
|
||||
|
||||
#### 74.4 — Stale-guest defensive degrade cleanup in turns.py (Phase 2.5 backlog #12, partial)
|
||||
|
||||
Same as T73.3 but for `chat/web/turns.py`: T44's defensive degrade-to-1:1 in `post_turn` (lines 235-242 per the T44 implementer note) is dead code now that T47 fixed the root cause. Remove it.
|
||||
|
||||
**Commit:** `chore: remove defensive stale-guest degrade in turns.py (T74.4)`.
|
||||
|
||||
#### Verification gates
|
||||
|
||||
- `pytest tests/test_addressee.py -v` — 3/3 new tests pass.
|
||||
- `pytest tests/test_turn_flow.py -v` — existing 8 + new 2-3 all pass.
|
||||
- `pytest tests/test_reset.py -v` — Phase 2 T47 root-cause cascade still green.
|
||||
- Full suite green.
|
||||
|
||||
---
|
||||
|
||||
## Wave 5 — Docs sweep (single task)
|
||||
|
||||
### Task 75: Remove shipped items from CLAUDE.md backlogs
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `CLAUDE.md`
|
||||
|
||||
**Spec:** Walk through the 15 backlog items in `CLAUDE.md` §"Phase 1.5 cleanup backlog" and §"Phase 2.5 / 3 backlog". For each item shipped during Phase 2.5 (T68–T74), remove it from the backlog list. Add a new section "Phase 2.5 status" near the existing "Phase 2 status" section listing what shipped:
|
||||
|
||||
- `open_db` refactor (T68).
|
||||
- `bot_reset` purges orphaned "you" activity rows (T69).
|
||||
- LLM-merged group meta-summary (T70).
|
||||
- Prompt assembly polish: witness role parametric, single ACTIVITIES block, NICE trim documented (T71).
|
||||
- Drawer edits for deferred v1 fields, first-meeting gate, witness flag editing (T72).
|
||||
- Regenerate over SSE + interjection regenerate + stale-guest cleanup (T73).
|
||||
- Classifier-based addressee detection + significance for interjection + scene-close-on-cancel pinned + stale-guest cleanup (T74).
|
||||
|
||||
If any task during execution chose NOT to ship a sub-item (e.g., T71.3 left NICE trim unchanged with a documented rationale), keep that sub-item in a "Phase 3.5+ deferred" section with the rationale. The goal is for the backlog list to reflect actual repo state, not aspirational scope.
|
||||
|
||||
If any new follow-ups were discovered during T68–T74 reviews, add them to the appropriate backlog section.
|
||||
|
||||
**Commit:** `docs: phase 2.5 status, prune shipped backlog items (T75)`.
|
||||
|
||||
---
|
||||
|
||||
## Wrap-up
|
||||
|
||||
After Wave 5 lands:
|
||||
|
||||
1. **Run full suite** on `phase-2.5`: should be ~225+ tests passing (212 from Phase 2 + ~15 new across the 8 tasks).
2. **Manual smoke** (recommended before opening the PR):
   - Drawer: edit edge_trust on a chat; verify the new value sticks after refresh.
   - Drawer: edit edge_summary on a chat; refresh; verify.
   - Drawer: toggle a memory's witness flag; refresh; verify.
   - Drawer: open Add-guest form for a (host, guest) pair that already shares an edge; verify the gate disables the prose textarea.
   - Drawer: open Add-guest form for a fresh pair; verify the textarea is enabled.
   - Reset a bot; verify "you" activity rows for that bot's chats are gone (run `sqlite3 data/db.sqlite "SELECT * FROM activity WHERE entity_id='you'"` before/after).
   - Multi-tab: open two tabs on the same chat; click Regenerate on one; verify the other tab sees the new turn live (no refresh).
   - Trigger an interjection turn; check the worker queue or `significance_jobs` table; verify a job was enqueued for the interjection memory.
   - Use a bot with a name that's a common word ("Sam"); ask "did you see what Sam wrote?" and verify the host gets the floor (classifier addressee detection, not substring).
3. **Push `phase-2.5`** to gitea.
4. **Open PR** `phase-2.5 → main`.
5. **No new Phase 3+ backlog items expected**; if review surfaces any, add them to CLAUDE.md.
|
||||
|
||||
---
|
||||
|
||||
## Notes for the controller running this plan
|
||||
|
||||
- **Don't dispatch Wave 4 until Wave 3 is merged AND tested green on `phase-2.5`.** T74's new addressee service is stand-alone, but the existing tests in `tests/test_turn_flow.py` may have shifted during Wave 3 if the drawer-test fixtures touch shared state. Verify green before fanning out.
|
||||
- **After each parallel wave**, run a code-review subagent (`subagent-driven-development` skill's two-stage review pattern) on each task. For purely mechanical tasks (T68, T69), combined spec+quality is acceptable. For bundled tasks (T71, T72, T74), use separate spec + quality reviewers — the surface area is larger.
|
||||
- **If Phase 3 (`phase-3` branch) is in flight in parallel**, T75 (the docs sweep) should land on `phase-2.5` only — Phase 3's docs sweep (T67) is independent. Both will resolve when the two branches merge to `main` in some order; expect a small CLAUDE.md merge to reconcile any overlapping backlog edits.
|
||||
- **If a task's "split commits" guidance proves impractical** (e.g., bundling means a test pins 3 fixes at once), one consolidated commit is acceptable. The split is an aid for review bisection, not a hard rule.
|
||||
- **Token-spend rough estimate**: Phase 2.5 should be ~50% the size of Phase 2 (smaller scope, all reuse). Per-task token spend similar to Phase 2's smaller tasks (T36, T37, T47).
|
||||
- **DO NOT break existing v1 / v2 surface contracts.** Every test file that was green at the start of Phase 2.5 must stay green at the end. The `tests/test_witness_filter_multi.py` contracts pinned in Phase 2 T46 are particularly load-bearing for T71.1 — verify them after the witness-role parametric fix lands.
|
||||
@@ -0,0 +1,15 @@
|
||||
{
  "planPath": "docs/plans/2026-04-26-v2.5-phase2.5-cleanup.md",
  "tasks": [
    {"id": 68, "subject": "T68: open_db refactor with check_same_thread parameter", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 69, "subject": "T69: bot_reset purges orphaned 'you' activity rows", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 70, "subject": "T70: LLM-merged group meta-summary", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
    {"id": 71, "subject": "T71: prompt.py polish (NICE trim + dual ACTIVITIES + witness role)", "status": "pending", "wave": 2, "parallelGroup": null},
    {"id": 72, "subject": "T72: drawer polish (deferred v1 edits + first-meeting gate + witness flag editing)", "status": "pending", "wave": 3, "parallelGroup": null},
    {"id": 73, "subject": "T73: regenerate.py polish (turn_html SSE + interjection regenerate + stale-guest cleanup)", "status": "pending", "wave": 4, "parallelGroup": "wave-4", "blockedBy": [72]},
    {"id": 74, "subject": "T74: turn-flow polish + addressee service (classifier addressee + significance interjection + scene close on cancel + stale-guest cleanup)", "status": "pending", "wave": 4, "parallelGroup": "wave-4", "blockedBy": [72]},
    {"id": 75, "subject": "T75: docs sweep, remove shipped items from CLAUDE.md", "status": "pending", "wave": 5, "parallelGroup": null, "blockedBy": [73, 74]}
  ],
  "lastUpdated": "2026-04-26T00:00:00Z",
  "notes": "8 tasks across 5 waves consolidating 15 backlog items (5 from Phase 1.5, 10 from Phase 2.5/3). Waves 1 and 4 are parallel-safe (file-disjoint within each). Waves 2, 3, 5 are single-task by hot-file constraint (prompt.py, drawer.py, CLAUDE.md). Bundled tasks (T71, T72, T74) split into sub-commits per backlog item for clean review bisection. No schema migrations; schema baseline stays at version 8. Phase 3 plan uses T49-T67; this plan uses T68-T75 to avoid id collision regardless of merge order."
}
|
||||
@@ -0,0 +1,891 @@
|
||||
# Roleplay Engine — Phase 3 Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task. Use the parallel-dispatch pattern documented under "Parallel-Execution Strategy" for waves that fan out to multiple subagents.
|
||||
|
||||
**Goal:** Add events with lifecycles, time skips (elision + jump), active threads, significance/retrieval refinements, and "Meanwhile…" scenes (host+guest with no "you" present). All scoped to a single chat; the cross-chat surface remains unchanged.
|
||||
|
||||
**Architecture:** Builds on Phase 2's event-sourced architecture and 3-entity scene support. New event kinds (`event_planned`, `event_started`, `event_completed`, `event_cancelled`, `event_expired`, `time_skip_elision`, `time_skip_jump`, `thread_opened`, `thread_updated`, `thread_closed`, `meanwhile_scene_started`, `meanwhile_scene_closed`, `synthesized_memories`) carry the new state changes. Two new tables (`events`, `threads`) hold lifecycle state. Existing handlers (`memory_written`, `edge_update`) gain new payload sources without changes — promotion logic lives in services, not in projector handlers.
|
||||
|
||||
**Tech Stack:** Same as Phase 2 (Python 3.11+, FastAPI, HTMX, SQLite, Featherless). No new dependencies.
|
||||
|
||||
**Source-of-truth references:**
|
||||
|
||||
- Phase 3 scope: requirements doc §13 "Phase 3 — events, skips, threads"
|
||||
- Behavioral details: §4 (per-chat clocks), §6.3 (prompt assembly), §6.4 (drawer), §8.1 (retrieved-memory inputs), §9 ("Time, Skips, Events — Phase 3 surface"), §11 (significance & compression)
|
||||
- Conventions: [../../CLAUDE.md](../../CLAUDE.md) §"Behavioral defaults" + §"Phase 2 status"
|
||||
- Phase 2 plan (style, TDD pattern, parallel-dispatch mechanics): [2026-04-26-v2-phase2-implementation.md](2026-04-26-v2-phase2-implementation.md)
|
||||
|
||||
When a task says "see §X", that's the requirements doc unless stated otherwise.
|
||||
|
||||
---
|
||||
|
||||
## Pre-flight
|
||||
|
||||
**Branch:** create `phase-3` from the latest `main` after Phase 2 has merged. If Phase 2 is still in PR review, branch off `phase-2` directly:
|
||||
|
||||
```bash
# Option A: after main has phase-2 merged
git checkout main && git pull && git checkout -b phase-3

# Option B: continue from phase-2 directly
git checkout phase-2 && git pull && git checkout -b phase-3
```
|
||||
|
||||
**Schema baseline:** Phase 2 leaves the DB at version 8. Phase 3 adds two migrations: `0009_events.sql` and `0010_threads.sql`. No other migrations expected.
|
||||
|
||||
**Phase 2.5 backlog:** the items in CLAUDE.md §"Phase 2.5 / 3 backlog" are NOT scoped here — they should be cleaned up in a separate branch off `main` (suggested name `phase-2.5`) before or in parallel with Phase 3. None of them blocks Phase 3.
|
||||
|
||||
**Pinned non-negotiables (carried forward):**
|
||||
|
||||
- State changes go through the event log. Use `append_and_apply(conn, kind, payload)` for the live path; `apply_event` only after a fresh `append_event` returning the new id.
|
||||
- Witness filter every memory read at SQL level (hard `WHERE` constraint; never a soft signal).
|
||||
- Edges are directed; `botA → botB` and `botB → botA` are independent records.
|
||||
- Per-POV scene summaries — never write omniscient narration. (Meanwhile scenes write per-POV summaries for both present bots; you receive a digest later, not during the scene.)
|
||||
- TDD: every task starts with a failing test.
|
||||
- One commit per task minimum, more if it splits naturally.
|
||||
|
||||
**Verification before claiming done:** Use `superpowers-extended-cc:verification-before-completion` — run the test command, paste actual output. Don't assume green.
|
||||
|
||||
---
|
||||
|
||||
## Parallel-Execution Strategy
|
||||
|
||||
Same pattern as Phase 2. Eight waves: parallel within each wave (file-disjoint), serial across waves. The controller (you, the controlling Claude session) merges each subagent's commits and verifies the suite stays green before dispatching the next wave.
|
||||
|
||||
### How to dispatch a wave in parallel
|
||||
|
||||
Use the **Agent tool with `isolation: "worktree"`** so each subagent gets its own git worktree. The runtime cleans up the worktree automatically if no changes are made; otherwise it returns the path + branch for the controller to merge. (If the controlling session's working directory is **not** the chat repo, create worktrees manually with `git worktree add .worktrees/<wave>-<task> -b <wave>/<task> phase-3` from inside the chat repo and pass the worktree path explicitly into each subagent prompt — that is the pattern Phase 2 used.)
|
||||
|
||||
In a single message, dispatch all tasks in the wave:
|
||||
|
||||
```
Agent({
  description: "Wave 1: T49 events table + handlers",
  subagent_type: "general-purpose",
  isolation: "worktree",
  prompt: "<full task text from below>",
})
Agent({
  description: "Wave 1: T50 time_skip handlers",
  subagent_type: "general-purpose",
  isolation: "worktree",
  prompt: "<full task text from below>",
})
Agent({
  description: "Wave 1: T51 threads table + handlers",
  subagent_type: "general-purpose",
  isolation: "worktree",
  prompt: "<full task text from below>",
})
```
|
||||
|
||||
All subagents start simultaneously, each working on a private worktree branched off `phase-3`. They cannot see each other's changes (no shared filesystem state) — that's the safety guarantee.
|
||||
|
||||
### After a wave completes
|
||||
|
||||
1. Each subagent returns its worktree path and commit SHA.
2. **Run a spec + code-quality reviewer subagent on each completed task** (combined review is acceptable for purely mechanical schema/handler tasks; large or integration tasks like T62, T63 deserve separate spec + quality reviewers).
3. **Merge the wave into `phase-3`** in any order (file-disjointness guarantees no conflict). Use `--no-ff` so each task's history stays grouped:

   ```bash
   git checkout phase-3
   for branch in <wave-branches>; do
     git merge --no-ff "$branch" -m "merge: <task description>"
   done
   ```

4. **Run the full test suite** on the merged `phase-3`. If it's red, the wave's mutual-independence assumption was violated: bisect to find the offending pair, fix in a follow-up commit, re-merge.
5. **Push `phase-3`** to gitea so the work is durable before the next wave starts.
6. Optionally clean up worktrees: `git worktree remove .worktrees/<branch>` and `git branch -D <branch>`.
|
||||
|
||||
### Conflict prevention checklist (apply before dispatch)
|
||||
|
||||
For each parallel wave, verify the **Files** sections of all tasks have **no overlapping paths**. The waves below are designed to satisfy this; if you decide to add or merge tasks, re-check.
|
||||
|
||||
If a hot file (`chat/web/turns.py`, `chat/services/prompt.py`, `chat/web/drawer.py`, `chat/templates/_drawer.html`, `chat/services/regenerate.py`) needs changes from multiple tasks, do **not** parallelize them — serialize within the wave or split into separate waves.
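The checklist above is mechanical enough to script. The sketch below assumes each task's **Files** section has already been collected into a set of paths; the helper name `find_overlaps` and the path `chat/web/skip.py` are hypothetical (the plan only says T62 adds "a new skip route module").

```python
from itertools import combinations


def find_overlaps(wave: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (task_a, task_b, shared_paths) for every colliding pair in a wave."""
    overlaps = []
    for (a, files_a), (b, files_b) in combinations(sorted(wave.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps.append((a, b, shared))
    return overlaps


# Wave 5a as listed in this plan: the two tasks touch different hot files.
wave_5a = {
    "T60": {"chat/services/prompt.py"},
    "T61": {"chat/web/turns.py"},
}
assert find_overlaps(wave_5a) == []

# T61 and T62 both modify turns.py, so they cannot share a wave.
collision = find_overlaps({
    "T61": {"chat/web/turns.py"},
    "T62": {"chat/web/turns.py", "chat/web/skip.py"},  # skip.py is a made-up path
})
print(collision)
```

An empty result is the go signal for parallel dispatch; any non-empty result means serializing those tasks or moving one to a later wave.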
|
||||
|
||||
### Failure recovery
|
||||
|
||||
If one subagent fails (test failures, blocked, infinite loop):
|
||||
|
||||
- **Do not block the wave on a failure.** Cancel the failed subagent, merge the others' successful work, and re-dispatch the failed task as a single follow-up.
|
||||
- If a failure exposes a bad assumption shared by multiple tasks (e.g. an event-payload schema mismatch), pause the wave and revisit the plan.
|
||||
|
||||
### Why each wave is parallel-safe
|
||||
|
||||
| Wave | Tasks | Hot files touched | Disjoint? |
|------|-------|-------------------|-----------|
| 1 | T49, T50, T51 | new SQL migrations + new state modules; T50 also extends `chat/state/world.py` (additive) | ✅ |
| 2 | T52, T53, T54, T55 | new service modules only | ✅ |
| 3 | T56, T57, T58 | new service module (T56) + `chat/state/memory.py` retrieval extension (T57) + `chat/services/scene_summarize.py` (T58) | ✅ |
| 4 | T59 | `chat/web/drawer.py`, `chat/templates/_drawer.html` | (single task) |
| 5a | T60, T61 | `chat/services/prompt.py` (T60), `chat/web/turns.py` (T61) | ✅ |
| 5b | T62 | `chat/web/turns.py`, plus a new skip route module | (single task; depends on 5a) |
| 6 | T63, T64, T65 | meanwhile is tightly coupled; see Wave 6 sub-structure below | ⚠️ partial |
| 7 | T66, T67 | new test file + docs only | ✅ |
|
||||
|
||||
**Wave 6 sub-structure:** T63 is schema/state (new files); T64 is service + extends `chat/web/turns.py`; T65 is service + extends `chat/services/prompt.py`. T64 and T65 are file-disjoint relative to each other but both depend on T63's schema landing first. Dispatch as: T63 alone → merge → T64+T65 in parallel → merge.
|
||||
|
||||
---
|
||||
|
||||
## Task overview
|
||||
|
||||
```
Wave 1 ─┬─ T49: events table + lifecycle handlers
        ├─ T50: time_skip event kinds + handlers (advance chat clock)
        └─ T51: threads table + open/update/close handlers

Wave 2 ─┬─ T52: event-lifecycle detection service (narrative → state changes)
        ├─ T53: skip narration service (elision + jump prose)
        ├─ T54: synthesized-memories service (jump skip "anything notable?")
        └─ T55: thread-detection service (on scene close, identify open threads)

Wave 3 ─┬─ T56: event-completion promotion (inventory / edges / memories)
        ├─ T57: significance retrieval ranking refinements
        └─ T58: scene compression keeps key quotes when significance ≥ 2

Wave 4 ─── T59: drawer additions: events panel, threads panel, skip controls

Wave 5a ─┬─ T60: prompt assembly includes active events + active threads
         └─ T61: turn flow invokes event-detection + thread-update per turn

Wave 5b ─── T62: skip command surface (parse + route + jump UI prompt)

Wave 6 ─┬─ T63: meanwhile scene config: schema + state + scene-config-4 marker
        └─ (after T63 merges)
            ├─ T64: meanwhile turn flow (host+guest, no "you")
            └─ T65: meanwhile summary digest (briefs you on next active scene)

Wave 7 ─┬─ T66: cross-feature integration tests (events × skips × threads × meanwhile)
        └─ T67: Phase 3 documentation update
```
|
||||
|
||||
Critical path: 9 sequential merge points (Waves 1, 2, 3, 4, 5a, 5b, 6a, 6b, 7). Total tasks: 19. Wall-clock parallelism advantage depends on subagent dispatch overhead, but in principle each wave's tasks can run concurrently in roughly the time of one task.
|
||||
|
||||
---
|
||||
|
||||
## Wave 1 — Schema & state foundation
|
||||
|
||||
These three tasks are **fully independent**: each adds a new SQL migration + new state module. T50 also adds two handlers to `chat/state/world.py` (additive, alongside Phase 2's `_apply_guest_added`).
|
||||
|
||||
### Task 49: Events table + lifecycle handlers
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `chat/db/migrations/0009_events.sql`
|
||||
- Create: `chat/state/events.py`
|
||||
- Create: `tests/test_events_state.py`
|
||||
|
||||
**Spec:** Adds the `events` table and projector handlers for the lifecycle: `event_planned`, `event_started`, `event_completed`, `event_cancelled`, `event_expired`. Each event row carries `chat_id`, `kind` (free-form domain-event tag like `"date_at_park"`), `status` (`planned|active|completed|cancelled|expired`), `props_json` (arbitrary blob), `planned_for` (ISO-8601 chat-clock string, optional), `started_at` / `completed_at` (chat-clock strings).
|
||||
|
||||
**Step 1: failing test** — see pattern in `tests/test_group_node.py` (Phase 2 T36). Three tests minimum:
|
||||
|
||||
1. `test_event_planned_creates_row`: append `event_planned` with `kind`, `props_json`, `planned_for`; project; assert `get_event(conn, event_id)` returns the row with `status="planned"`.
|
||||
2. `test_event_started_then_completed_updates_status`: append `event_planned` → `event_started` → `event_completed`; assert `status` transitions and `completed_at` populated.
|
||||
3. `test_event_cancelled_terminal`: append `event_planned` → `event_cancelled`; assert `status="cancelled"`. A subsequent `event_started` is ignored (handler no-op when status is terminal).
|
||||
|
||||
**Step 3: implementation** — `0009_events.sql`:
|
||||
|
||||
```sql
CREATE TABLE events (
  id TEXT PRIMARY KEY,  -- caller-allocated event_id (string), so not an INTEGER rowid alias
  chat_id TEXT NOT NULL,
  kind TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'planned',
  props_json TEXT NOT NULL DEFAULT '{}',
  planned_for TEXT,
  started_at TEXT,
  completed_at TEXT,
  created_at TEXT NOT NULL DEFAULT (datetime('now')),
  updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX events_chat_idx ON events(chat_id, status);
```
|
||||
|
||||
`chat/state/events.py`:
|
||||
|
||||
- `@on("event_planned")` inserts a new row with status `planned`. Payload provides a stable `event_id` (caller-allocated UUID) so the projector is idempotent.
|
||||
- `@on("event_started")` updates status to `active` and sets `started_at` from payload (or current chat clock).
|
||||
- `@on("event_completed")`, `@on("event_cancelled")`, `@on("event_expired")` each move to the named terminal state and stamp `completed_at` (the column doubles as "ended at").
|
||||
- `get_event(conn, event_id)`, `list_active_events(conn, chat_id)`, `list_events_in_status(conn, chat_id, status)` readers.
|
||||
- All handlers no-op when the row is already in a terminal state (idempotent re-projection safety).
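The handler rules above can be exercised against an in-memory SQLite database. This is a sketch under stated assumptions, not the project's projector: the `on` registry is a stand-in for the real `@on` decorator, the `at` payload key is made up, and the table uses a TEXT id so the caller-allocated `evt_` string can be stored.

```python
import json
import sqlite3
import uuid

HANDLERS = {}


def on(kind):
    """Register a handler for one event kind (stand-in for the project's @on)."""
    def register(fn):
        HANDLERS[kind] = fn
        return fn
    return register


@on("event_planned")
def apply_planned(conn, p):
    # INSERT OR IGNORE: replaying the same event_id never creates a second row.
    conn.execute(
        "INSERT OR IGNORE INTO events (id, chat_id, kind, props_json, planned_for) "
        "VALUES (?, ?, ?, ?, ?)",
        (p["event_id"], p["chat_id"], p["kind"],
         json.dumps(p.get("props", {})), p.get("planned_for")),
    )


def transition(conn, p, status, stamp_column):
    # The status guard turns any transition on a terminal row into a no-op.
    conn.execute(
        f"UPDATE events SET status = ?, {stamp_column} = ? "
        "WHERE id = ? AND status IN ('planned', 'active')",
        (status, p.get("at"), p["event_id"]),
    )


@on("event_started")
def apply_started(conn, p):
    transition(conn, p, "active", "started_at")


@on("event_completed")
def apply_completed(conn, p):
    transition(conn, p, "completed", "completed_at")


@on("event_cancelled")
def apply_cancelled(conn, p):
    transition(conn, p, "cancelled", "completed_at")


@on("event_expired")
def apply_expired(conn, p):
    transition(conn, p, "expired", "completed_at")


def get_event(conn, event_id):
    cols = ("id", "status", "started_at", "completed_at")
    row = conn.execute(
        f"SELECT {', '.join(cols)} FROM events WHERE id = ?", (event_id,)
    ).fetchone()
    return None if row is None else dict(zip(cols, row))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, chat_id TEXT NOT NULL, kind TEXT NOT NULL, "
    "status TEXT NOT NULL DEFAULT 'planned', props_json TEXT NOT NULL DEFAULT '{}', "
    "planned_for TEXT, started_at TEXT, completed_at TEXT)"
)
event_id = f"evt_{uuid.uuid4().hex[:12]}"  # allocated by the caller, not the projector
for kind, payload in [
    ("event_planned", {"event_id": event_id, "chat_id": "chat-1", "kind": "date_at_park"}),
    ("event_cancelled", {"event_id": event_id, "at": "day 2, 18:00"}),
    ("event_started", {"event_id": event_id, "at": "day 2, 19:00"}),  # ignored: terminal
]:
    HANDLERS[kind](conn, payload)
print(get_event(conn, event_id)["status"])  # prints "cancelled"
```

Putting the terminal-state guard in the `WHERE` clause, rather than in a read-then-write check, keeps each handler a single statement and makes re-projection safe by construction.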
|
||||
|
||||
**Step 5: commit** — `feat: events table + lifecycle handlers (T49)`.
|
||||
|
||||
**Notes for the implementer:**
|
||||
|
||||
- Use UUID-style ids (e.g., `f"evt_{uuid.uuid4().hex[:12]}"`) created by the caller; pass as `event_id` in payload. Don't auto-generate inside the projector.
|
||||
- Schema version after this migration alone: 9. The full Phase 3 baseline is 10 (T51 adds 0010_threads.sql).
|
||||
- `tests/test_world.py::test_schema_version_after_migration_is_8` will need its expected version bumped after Wave 1 merges; handle this in the wave-merge step (mirrors Phase 2 T36's pattern).
|
||||
|
||||
---
|
||||
|
||||
### Task 50: Time-skip event kinds + chat-clock handlers
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `chat/state/world.py` (add `_apply_time_skip_elision`, `_apply_time_skip_jump`; both update `chats.time` and may reset `activity` rows)
|
||||
- Create: `tests/test_time_skip_handlers.py`
|
||||
|
||||
**Spec:** Two new event kinds.
|
||||
|
||||
- `time_skip_elision` payload: `{chat_id, new_time}`. Handler updates `chats.time = ?`. Activity rows are NOT reset: eliding an activity to its end state is itself the resolution, and the caller passes a follow-up `activity_changed` event when needed.
|
||||
- `time_skip_jump` payload: `{chat_id, new_time, reset_activity: bool}`. Handler updates `chats.time = ?`; if `reset_activity` is true, deletes per-chat `activity` rows for the participants in that chat (a fresh landing state will be set by a follow-up `activity_changed` event from the skip service).
|
||||
|
||||
These are pure state mutations. T54 and T62 fire them via `append_and_apply`.
|
||||
|
||||
**Tests:** 3 minimum.
|
||||
|
||||
1. `test_elision_advances_chat_clock_only`: seed chat at time T0; append `time_skip_elision` with `new_time=T1`; project; assert `get_chat(...)["time"] == T1` and activity unchanged.
|
||||
2. `test_jump_with_reset_clears_activity`: seed chat with one activity row; append `time_skip_jump` with `reset_activity=True`; assert chat clock advanced AND activity table empty for that chat.
|
||||
3. `test_jump_without_reset_preserves_activity`: same seed; `reset_activity=False`; assert activity row still present and clock advanced.
|
||||
|
||||
**Implementation:** new handlers next to `_apply_chat_created` in `chat/state/world.py`. Use the same parameterized SQL patterns. Do NOT add UI here — T62 wires the skip command flow.
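A minimal sketch of the two handlers, runnable against in-memory SQLite. The table shapes are assumptions made for illustration only: `chats` is keyed on `id`, and `activity` carries a direct `chat_id` column, whereas the real schema may reach the chat through containers.

```python
import sqlite3


def apply_time_skip_elision(conn, payload):
    # Elision advances the chat clock and nothing else.
    conn.execute(
        "UPDATE chats SET time = ? WHERE id = ?",
        (payload["new_time"], payload["chat_id"]),
    )


def apply_time_skip_jump(conn, payload):
    conn.execute(
        "UPDATE chats SET time = ? WHERE id = ?",
        (payload["new_time"], payload["chat_id"]),
    )
    if payload.get("reset_activity", False):
        # A follow-up activity_changed event is expected to set the landing state.
        conn.execute("DELETE FROM activity WHERE chat_id = ?", (payload["chat_id"],))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chats (id TEXT PRIMARY KEY, time TEXT NOT NULL);
    CREATE TABLE activity (entity_id TEXT, chat_id TEXT, description TEXT);
    INSERT INTO chats VALUES ('chat-1', 'T0');
    INSERT INTO activity VALUES ('host', 'chat-1', 'cooking');
    """
)
apply_time_skip_elision(conn, {"chat_id": "chat-1", "new_time": "T1"})
apply_time_skip_jump(conn, {"chat_id": "chat-1", "new_time": "T2", "reset_activity": True})
```

Both handlers use parameterized SQL only, matching the pattern the task asks for, and neither touches UI concerns.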
|
||||
|
||||
**Commit:** `feat: time_skip event handlers (T50)`.
|
||||
|
||||
---
|
||||
|
||||
### Task 51: Threads table + open/update/close handlers
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `chat/db/migrations/0010_threads.sql`
|
||||
- Create: `chat/state/threads.py`
|
||||
- Create: `tests/test_threads_state.py`
|
||||
|
||||
**Spec:** Adds the `threads` table and projector handlers for `thread_opened`, `thread_updated`, `thread_closed`. A thread is a per-chat narrative continuity tag — open during scenes, surfaced to prompt assembly so successor scenes can reference unresolved arcs.
|
||||
|
||||
`0010_threads.sql`:
|
||||
|
||||
```sql
CREATE TABLE threads (
  id INTEGER PRIMARY KEY,
  chat_id TEXT NOT NULL,
  title TEXT NOT NULL,
  summary TEXT NOT NULL DEFAULT '',
  status TEXT NOT NULL DEFAULT 'open',  -- open | closed
  opened_at TEXT NOT NULL DEFAULT (datetime('now')),
  closed_at TEXT,
  last_referenced_scene_id INTEGER,
  created_at TEXT NOT NULL DEFAULT (datetime('now')),
  updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX threads_chat_status_idx ON threads(chat_id, status);
```
|
||||
|
||||
`chat/state/threads.py`:
|
||||
|
||||
- `@on("thread_opened")` payload: `{thread_id, chat_id, title, summary?}`. Inserts a new row with `status='open'`.
|
||||
- `@on("thread_updated")` payload: `{thread_id, summary, last_referenced_scene_id?}`. Updates summary + optional last-referenced-scene pointer.
|
||||
- `@on("thread_closed")` payload: `{thread_id, closed_at?}`. Sets status='closed', stamps `closed_at`.
|
||||
- Readers: `get_thread(conn, thread_id)`, `list_open_threads(conn, chat_id)`, `list_threads(conn, chat_id, status=None)`.
|
||||
|
||||
**Tests:** 3 minimum.
|
||||
|
||||
1. `test_thread_opened_creates_row`.
|
||||
2. `test_thread_updated_changes_summary_and_last_referenced`.
|
||||
3. `test_thread_closed_terminal`: subsequent `thread_updated` is ignored (matches the design's "closed threads are kept for replay but don't surface in prompt").
|
||||
|
||||
**Note:** the Phase 2 `group_node.threads_json` column was a placeholder for Phase 3 and is no longer authoritative storage; the `threads` table is the source of truth. The drawer may render either, but from Phase 3 onward treat the table as canonical and `group_node.threads_json` as a deprecated cache that is left alone (or cleared in the next migration).
|
||||
|
||||
**Commit:** `feat: threads table + projector handlers (T51)`.
|
||||
|
||||
---
|
||||
|
||||
## Wave 2 — Classifier services (parallel)
|
||||
|
||||
Four tasks, all new service modules — fully file-disjoint.
|
||||
|
||||
### Task 52: Event-lifecycle detection service

**Files:**

- Create: `chat/services/event_lifecycle.py`
- Create: `tests/test_event_lifecycle.py`

**Spec:** A classifier-wrapped service that inspects a freshly-narrated turn and decides whether any active events transitioned this turn (started, completed, cancelled). Returns a structured `EventLifecycleDecision` with one or more `EventTransition(event_id, new_status, reason)` items, or empty when nothing changed.

Schema:

```python
from pydantic import BaseModel, Field


class EventTransition(BaseModel):
    event_id: str
    new_status: str  # "active" | "completed" | "cancelled"
    reason: str = ""


class EventLifecycleDecision(BaseModel):
    transitions: list[EventTransition] = Field(default_factory=list)
```

Public API:

```python
async def detect_event_transitions(
    client: LLMClient,
    *,
    classifier_model: str,
    narrative_text: str,
    active_events: list[dict],  # [{id, kind, status, props}, ...] from list_active_events
    timeout_s: float = 30.0,
) -> EventLifecycleDecision:
    """Decide whether any active events transitioned this turn. Conservative
    bias: most turns return empty transitions. Trigger only when the
    narrative text clearly resolves or starts a known active event.
    """
```

Caller (T61 turn flow) appends one `event_started` / `event_completed` / `event_cancelled` event per transition via `append_and_apply`.

**Tests:** 3 minimum: happy path with one transition, empty `active_events` short-circuits without a classifier call, classifier failure returns the empty default.

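
The short-circuit and the conservative failure default can be sketched as follows. This is a sketch under assumptions: dataclasses stand in for the pydantic models, `FakeClient` and the keyword arguments passed to `classify` are hypothetical (the real `LLMClient.classify` signature is not shown in this plan), and dropping transitions that name unknown event ids is an extra guard, not something the spec requires.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class EventTransition:  # dataclass stand-in for the pydantic model
    event_id: str
    new_status: str
    reason: str = ""


@dataclass
class EventLifecycleDecision:
    transitions: list = field(default_factory=list)


async def detect_event_transitions(client, *, classifier_model, narrative_text,
                                   active_events, timeout_s=30.0):
    if not active_events:
        return EventLifecycleDecision()  # nothing can transition: skip the LLM call
    known_ids = {e["id"] for e in active_events}
    try:
        # ASSUMPTION: classify() is async and returns {"transitions": [...]}.
        raw = await asyncio.wait_for(
            client.classify(model=classifier_model, text=narrative_text,
                            events=active_events),
            timeout=timeout_s,
        )
        items = [EventTransition(**t) for t in raw["transitions"]]
    except Exception:
        return EventLifecycleDecision()  # conservative default on any failure
    # Extra guard: ignore transitions for events the caller did not pass in.
    return EventLifecycleDecision([t for t in items if t.event_id in known_ids])


class FakeClient:  # hypothetical test double
    def __init__(self, reply=None, fail=False):
        self.reply, self.fail, self.calls = reply, fail, 0

    async def classify(self, **kwargs):
        self.calls += 1
        if self.fail:
            raise RuntimeError("classifier unavailable")
        return self.reply


events = [{"id": "e1", "kind": "picnic", "status": "planned", "props": {}}]
ok = FakeClient({"transitions": [{"event_id": "e1", "new_status": "active"}]})
happy = asyncio.run(detect_event_transitions(
    ok, classifier_model="m", narrative_text="They spread the blanket.",
    active_events=events))
idle = FakeClient()
empty = asyncio.run(detect_event_transitions(
    idle, classifier_model="m", narrative_text="...", active_events=[]))
failed = asyncio.run(detect_event_transitions(
    FakeClient(fail=True), classifier_model="m", narrative_text="...",
    active_events=events))
```
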
**Commit:** `feat: event-lifecycle detection service (T52)`.

---

### Task 53: Skip narration service

**Files:**

- Create: `chat/services/skip_narration.py`
- Create: `tests/test_skip_narration.py`

**Spec:** Generates the brief transition narration that bridges a time skip. Two flavors mirroring §9:

- **Elision:** "skip to when we arrive". Input: current activity ("walking to park"), expected end-state ("at the park, sitting on a bench"). Output: 1-2 sentence transition prose narrated from the host bot's POV. New chat-clock value is provided by the caller.
- **Jump:** "next morning". Input: time delta + landing-state hint (optional). Output: 2-3 sentences setting the scene at the new time.

Public API:

```python
async def narrate_skip(
    client: LLMClient,
    *,
    narrative_model: str,
    skip_kind: str,  # "elision" | "jump"
    speaker_bot: dict,  # {id, name, persona}
    you_name: str,
    current_time: str,
    new_time: str,
    current_activity: str,
    landing_state_hint: str = "",
    timeout_s: float = 60.0,
) -> str:
    """Generate brief transition prose. Returns plain text, not JSON."""
```

Uses `client.generate(...)` (not `classify`) since output is free-form prose. Falls back to a deterministic template string on failure (e.g., `f"({new_time}: {landing_state_hint or current_activity}.)"`). The fallback ensures the skip flow never blocks even when the LLM is down.

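
The fallback path can be sketched like this. It is a sketch under assumptions: `narrate_skip_safe`, `fallback_narration`, and `DownClient` are hypothetical names, and the keyword arguments passed to `client.generate` are guesses at a signature this plan does not spell out. The template string itself is the one quoted above.

```python
import asyncio


def fallback_narration(new_time, landing_state_hint, current_activity):
    # Deterministic template from the spec; the new clock value is always visible.
    return f"({new_time}: {landing_state_hint or current_activity}.)"


async def narrate_skip_safe(client, *, narrative_model, prompt, new_time,
                            current_activity, landing_state_hint="",
                            timeout_s=60.0):
    try:
        # ASSUMPTION: generate() is async and returns plain text.
        text = await asyncio.wait_for(
            client.generate(model=narrative_model, prompt=prompt),
            timeout=timeout_s,
        )
        text = text.strip()
        if text:
            return text
    except Exception:
        pass  # LLM down or timed out: fall through to the template
    return fallback_narration(new_time, landing_state_hint, current_activity)


class DownClient:  # hypothetical test double simulating an outage
    async def generate(self, **kwargs):
        raise ConnectionError("LLM unreachable")


result = asyncio.run(narrate_skip_safe(
    DownClient(), narrative_model="m", prompt="...",
    new_time="2024-05-02 08:00", current_activity="walking to park",
    landing_state_hint="at the park"))
```
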
**Tests:** 3 minimum: happy elision, happy jump, generation failure returns fallback string with the new time visible.

**Commit:** `feat: skip narration service (T53)`.

---

### Task 54: Synthesized-memories service

**Files:**

- Create: `chat/services/synthesized_memories.py`
- Create: `tests/test_synthesized_memories.py`

**Spec:** When the user does a jump skip ("a week later"), they're prompted "anything notable happen?" If they answer with prose, this service parses that prose into 1-N synthesized memories per present bot. Each memory carries `source="synthesized"`, `reliability=0.7`, a witness mask matching the present set (`[1, 1, 0]` without a guest, `[1, 1, 1]` with one), and a one-sentence text body.

Schema:

```python
from pydantic import BaseModel, Field


class SynthesizedMemory(BaseModel):
    text: str
    significance: int = 1  # 0..3, default 1
    affinity_delta: int = 0
    trust_delta: int = 0


class SynthesizedDigest(BaseModel):
    memories: list[SynthesizedMemory] = Field(default_factory=list)
```

Public API:

```python
async def synthesize_memories(
    client: LLMClient,
    *,
    classifier_model: str,
    prose: str,
    bot_name: str,  # which witness's POV
    bot_persona: str,
    you_name: str,
    timeout_s: float = 30.0,
) -> SynthesizedDigest:
    """Parse 'anything notable happen?' prose into structured memories
    from a single bot's POV. Empty/whitespace prose short-circuits."""
```

Caller (T62 skip flow) calls this once per present bot (host always; guest if present), then writes via `record_turn_memory_for_present` with `source="synthesized"` and the synthesized text in place of narrative_text.

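
The per-bot write step can be sketched as a pure function that turns digests into memory payloads. This is illustrative only: `build_memory_payloads`, `witness_mask`, and the payload field names are hypothetical, the mask order `[you, host, guest]` is an assumption, and clamping `significance` to 0..3 is an extra guard based on the range noted in the schema comment.

```python
from dataclasses import dataclass

SYNTHESIZED_RELIABILITY = 0.7  # value given in the spec


@dataclass
class SynthesizedMemory:  # dataclass stand-in for the pydantic model
    text: str
    significance: int = 1


def witness_mask(guest_present):
    # ASSUMPTION: mask order is [you, host, guest].
    return [1, 1, 1 if guest_present else 0]


def build_memory_payloads(digests_by_bot, guest_present):
    """digests_by_bot maps an owner bot id to its list of SynthesizedMemory."""
    mask = witness_mask(guest_present)
    payloads = []
    for owner_id, memories in digests_by_bot.items():
        for memory in memories:
            payloads.append({
                "owner_id": owner_id,
                "text": memory.text,
                # Clamp to the documented 0..3 range in case the classifier drifts.
                "significance": max(0, min(3, memory.significance)),
                "source": "synthesized",
                "reliability": SYNTHESIZED_RELIABILITY,
                "witness": list(mask),
            })
    return payloads


payloads = build_memory_payloads(
    {"host": [SynthesizedMemory("We repainted the kitchen together.", 2)],
     "guest": [SynthesizedMemory("I helped repaint the kitchen.", 7)]},
    guest_present=True,
)
solo = build_memory_payloads({"host": [SynthesizedMemory("A quiet week.")]},
                             guest_present=False)
```
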
**Tests:** 3 minimum: happy path returns parseable memories, empty prose short-circuits, classifier failure returns empty digest.

**Commit:** `feat: synthesized-memories service for jump skips (T54)`.

---

### Task 55: Thread-detection service

**Files:**

- Create: `chat/services/thread_detection.py`
- Create: `tests/test_thread_detection.py`

**Spec:** On scene close, classify the scene transcript to detect open threads (unresolved arcs, dangling questions, promises made). Returns a `ThreadDetectionResult` wrapping a list of `ThreadCandidate(title, summary, action: "open"|"update"|"close", existing_thread_id?)`.

The service receives the current set of open threads so it can decide to **update** an existing thread rather than open a duplicate. It can also signal **close** when the transcript clearly resolves an open thread.

Schema:

```python
from pydantic import BaseModel, Field


class ThreadCandidate(BaseModel):
    action: str  # "open" | "update" | "close"
    title: str = ""  # required for "open"; ignored otherwise
    summary: str = ""
    existing_thread_id: str | None = None  # required for "update"/"close"


class ThreadDetectionResult(BaseModel):
    candidates: list[ThreadCandidate] = Field(default_factory=list)
```

Public API:

```python
async def detect_threads(
    client: LLMClient,
    *,
    classifier_model: str,
    scene_transcript: list[dict],  # [{speaker, text}, ...]
    open_threads: list[dict],  # [{id, title, summary}, ...]
    timeout_s: float = 30.0,
) -> ThreadDetectionResult:
    """Classify scene close into thread open/update/close candidates."""
```

Caller (T58 scene compression, added in Wave 3) loops over candidates and emits one `thread_opened`, `thread_updated`, or `thread_closed` event per candidate.

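
The caller's loop over candidates can be sketched as a mapping from each candidate to an event kind and payload. This is a sketch, not the T58 implementation: `candidate_to_event` is a hypothetical helper, plain dicts stand in for `ThreadCandidate`, how new thread ids are allocated is left to the caller, and silently dropping malformed candidates is an assumption in keeping with the conservative bias used elsewhere.

```python
def candidate_to_event(candidate, *, chat_id, new_thread_id, scene_id=None):
    """Map one thread candidate to (event_kind, payload); None if malformed."""
    action = candidate.get("action")
    if action == "open":
        if not candidate.get("title"):
            return None  # "open" requires a title
        return ("thread_opened", {
            "thread_id": new_thread_id,
            "chat_id": chat_id,
            "title": candidate["title"],
            "summary": candidate.get("summary", ""),
        })
    thread_id = candidate.get("existing_thread_id")
    if thread_id is None:
        return None  # "update"/"close" require an existing thread
    if action == "update":
        return ("thread_updated", {
            "thread_id": thread_id,
            "summary": candidate.get("summary", ""),
            "last_referenced_scene_id": scene_id,
        })
    if action == "close":
        return ("thread_closed", {"thread_id": thread_id})
    return None  # unknown action


candidates = [
    {"action": "open", "title": "The missing key", "summary": "Nobody knows where it went."},
    {"action": "update", "existing_thread_id": "7", "summary": "A promise to visit."},
    {"action": "close", "existing_thread_id": "9"},
    {"action": "update", "summary": "no id, so dropped"},
]
events = [e for c in candidates
          if (e := candidate_to_event(c, chat_id="c1", new_thread_id="10", scene_id=42))]
```
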
**Tests:** 3 minimum: opens a new thread, updates an existing thread (test asserts `existing_thread_id` is honored), classifier failure returns empty.

**Commit:** `feat: thread-detection service (T55)`.

---

## Wave 3: Promotion & retrieval refinements

Three tasks. T56 is a new service module (event-completion promotion). T57 modifies `chat/state/memory.py` to add a significance-aware retrieval rank. T58 modifies `chat/services/scene_summarize.py` to integrate compression hints + the thread-detection service from T55. File-disjoint.

### Task 56: Event-completion promotion

**Files:**

- Create: `chat/services/event_promotion.py`
- Create: `tests/test_event_promotion.py`

**Spec:** When an event reaches `completed` (the only terminal state that promotes; cancelled/expired do NOT promote per §9 last paragraph), the orchestrator promotes any structured artifacts the event carried into the appropriate target store:

- `event.props.acquired_objects: list[str]` → append `inventory_added` events (Phase 4 schema; Phase 3 stub: just append a `manual_edit` with `target_kind="memory_pov_summary"` describing the acquisition into the host's memory).
- `event.props.knowledge_facts: list[{owner_id, target_id, fact}]` → append `edge_update` events with the facts on the named directed edge.
- `event.props.relationship_change: {summary, source_id, target_id}` → append `manual_edit` with `target_kind="edge_summary"` for that pair.
- Everything else stays in the closed event record (the projector kept the row; no further promotion).

Public API:

```python
def promote_completed_event(
    conn,
    *,
    event_id: str,
    chat_id: str,
    chat_clock_at: str | None,
) -> dict:
    """Read the completed event's props_json and emit promotion events.
    Returns a summary dict {inventory: int, knowledge: int, relationship: int}
    of how many promotion events fired. No classifier calls; purely
    structural. Skips if event status isn't 'completed'."""
```

This is **synchronous** (no async, no LLM). It reads a row, parses JSON, emits events via `append_and_apply`.

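
The structural part of promotion can be sketched as a pure planning step that a caller would then feed to `append_and_apply`. This is a sketch under assumptions: `plan_promotions` is a hypothetical helper, the payload field names are illustrative, and the `acquired_objects` branch uses the Phase 3 stub (`manual_edit`) rather than `inventory_added`.

```python
import json


def plan_promotions(event_row):
    """Return (events, counts) for one event row; no-op unless completed."""
    counts = {"inventory": 0, "knowledge": 0, "relationship": 0}
    events = []
    if event_row["status"] != "completed":
        return events, counts  # cancelled/expired events never promote
    props = json.loads(event_row.get("props_json") or "{}")

    for obj in props.get("acquired_objects", []):
        # Phase 3 stub: record the acquisition in the host's memory summary.
        events.append(("manual_edit", {
            "target_kind": "memory_pov_summary",
            "note": f"Acquired: {obj}",
        }))
        counts["inventory"] += 1

    for fact in props.get("knowledge_facts", []):
        events.append(("edge_update", {
            "source_id": fact["owner_id"],
            "target_id": fact["target_id"],
            "fact": fact["fact"],
        }))
        counts["knowledge"] += 1

    change = props.get("relationship_change")
    if change:
        events.append(("manual_edit", {
            "target_kind": "edge_summary",
            "source_id": change["source_id"],
            "target_id": change["target_id"],
            "summary": change["summary"],
        }))
        counts["relationship"] += 1
    return events, counts


row = {"status": "completed", "props_json": json.dumps({
    "acquired_objects": ["brass key"],
    "knowledge_facts": [{"owner_id": "host", "target_id": "you",
                         "fact": "is allergic to cats"}],
    "relationship_change": {"summary": "Closer after the trip.",
                            "source_id": "host", "target_id": "guest"},
})}
events, counts = plan_promotions(row)
skipped = plan_promotions({"status": "cancelled", "props_json": row["props_json"]})
```
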
**Tests:** 4 minimum: empty props no-op, `knowledge_facts` produces `edge_update` events, `relationship_change` produces `manual_edit`, cancelled-event-doesn't-promote.

**Commit:** `feat: event-completion promotion service (T56)`.

---

### Task 57: Significance-aware retrieval ranking

**Files:**

- Modify: `chat/state/memory.py` (extend `search_memories(conn, owner_id, witness_role, query, k)` to add a significance bias to the rank ordering)
- Modify: `tests/test_memory_search.py` (or wherever the existing search tests live; add 2 tests)

**Spec:** Currently `search_memories` orders by FTS rank only. §11.1 says "Retrieval ranking: significance multiplier applied as `score × constant` to FTS / vector rank." Phase 3 implements this for FTS only (vector retrieval is Phase 4).

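Change the SQL `ORDER BY` so that significance biases the ordering, and surface the scaling as a module-level constant `SIGNIFICANCE_RANK_BIAS` (starting value 0.5). Mind the sign convention: if `rank` here is SQLite FTS5's built-in bm25 rank, better matches have lower (more negative) values and the existing `ORDER BY rank` sorts ascending, so the bias has to be subtracted, as in `ORDER BY (rank - significance * 0.5)`. The form `ORDER BY (rank + significance * 0.5) DESC` is only correct for a higher-is-better score. The constant is a tuning knob that may need adjustment after manual play; document the chosen scaling in a comment.
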
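
The effect of the bias can be checked in plain Python before touching the SQL. This sketch assumes `rank` follows the SQLite FTS5 convention (bm25-based, lower is better), so the significance term is subtracted and the sort stays ascending; `biased_rank` and `order_hits` are hypothetical helpers, and the additive form is one possible scaling rather than the only one that fits §11.1.

```python
SIGNIFICANCE_RANK_BIAS = 0.5  # tuning knob; adjust after manual play


def biased_rank(fts_rank, significance, bias=SIGNIFICANCE_RANK_BIAS):
    # ASSUMPTION: fts_rank is "lower is better" (FTS5 bm25 convention),
    # so higher significance must push the value further down.
    return fts_rank - significance * bias


def order_hits(hits):
    # Ascending sort, mirroring ORDER BY (rank - significance * :bias).
    return sorted(hits, key=lambda h: biased_rank(h["rank"], h["significance"]))


hits = [
    {"id": "a", "rank": -1.2, "significance": 0},
    {"id": "b", "rank": -1.2, "significance": 3},  # same text match as "a"
    {"id": "c", "rank": -3.0, "significance": 0},  # much stronger text match
]
ordered = [h["id"] for h in order_hits(hits)]
```

With these values the strong text match `c` still comes first, while `b` outranks `a` on significance alone; a larger constant would let significance override text relevance more often.
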
**Tests:** 2 added.

1. `test_higher_significance_outranks_equal_rank`: seed two memories with identical FTS-matching text but different significance scores; assert the higher-significance row appears first in results.
2. `test_significance_bias_is_constant_module_level`: verify the constant is accessible as `chat.state.memory.SIGNIFICANCE_RANK_BIAS` (so it's tunable without a code change in calling sites).

**Commit:** `feat: significance-aware retrieval ranking (T57)`.

---

### Task 58: Scene compression keeps key quotes when significance ≥ 2

**Files:**

- Modify: `chat/services/scene_summarize.py` (extend `apply_scene_close_summary` to also call `detect_threads` from T55 and emit thread events; extend the per-POV summary to include up to 3 verbatim "key quotes" from the closing scene when scene-max-significance ≥ 2)
- Modify: `tests/test_per_pov_summary.py` (add 3 tests for the new behavior)

**Spec:** §11.1 specifies "Compression: scenes with max-turn-significance ≥ 2 retain key quotes; ≤ 1 collapse fully into the per-POV summary." Implement this:

- Compute scene max significance from `memories.significance` rows in this scene.
- When max < 2: existing behavior unchanged (per-POV summary, no extra quotes).
- When max ≥ 2: include up to 3 verbatim quote spans (each ≤ 200 chars) in the per-POV summary text. Format: append `\n\nKey quotes:\n- "..."\n- "..."` to the summary. The `summarize_scene` classifier already produces the prose; the quote-selection step is a deterministic post-process that picks the top-3 highest-significance turn texts from the scene transcript (truncated).

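
The deterministic quote-selection step described above can be sketched as follows. The helper names and the turn dict shape (`text`, `significance`) are hypothetical; breaking significance ties by transcript order is an assumption that falls out of Python's stable sort.

```python
MAX_QUOTES = 3
MAX_QUOTE_CHARS = 200


def select_key_quotes(turns):
    """turns: list of {"text": str, "significance": int} in transcript order."""
    if not turns or max(t["significance"] for t in turns) < 2:
        return []  # low-significance scene: collapse fully into the summary
    # sorted() is stable, so equal-significance turns keep transcript order.
    top = sorted(turns, key=lambda t: -t["significance"])[:MAX_QUOTES]
    return [t["text"][:MAX_QUOTE_CHARS] for t in top]


def append_key_quotes(summary, turns):
    quotes = select_key_quotes(turns)
    if not quotes:
        return summary
    lines = "\n".join(f'- "{q}"' for q in quotes)
    return f"{summary}\n\nKey quotes:\n{lines}"


turns = [
    {"text": "I never told anyone about the letter.", "significance": 3},
    {"text": "Promise me you'll come back.", "significance": 2},
    {"text": "Nice weather today.", "significance": 1},
    {"text": "I think I trust you now.", "significance": 2},
]
with_quotes = append_key_quotes("They talked by the river.", turns)
without_quotes = append_key_quotes("Small talk.", [turns[2]])
```
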
Additionally, after writing per-POV summaries (existing behavior), call `detect_threads` (from T55) once per close. For each candidate emit the matching `thread_opened` / `thread_updated` / `thread_closed` event via `append_and_apply`. Failures fall back to no thread changes (existing memory + edge updates still land).

**Tests:** 3 added.

1. `test_low_significance_scene_omits_quotes`: max significance = 1; assert summary text contains no "Key quotes:" header.
2. `test_high_significance_scene_includes_top_3_quotes`: seed 4 memories with significance 3, 2, 1, 2; assert summary contains the top-3 (by significance) verbatim turn texts.
3. `test_thread_detection_emits_events`: stub `detect_threads` to return one `ThreadCandidate(action="open", ...)`; assert a `thread_opened` event landed.

**Commit:** `feat: significance-driven quote retention + thread emission on close (T58)`.

---

## Wave 4: Drawer additions (single task)

This wave is one task because all Phase 3 drawer additions touch `chat/web/drawer.py` and `chat/templates/_drawer.html` together; splitting would force serial execution with conflicts.

### Task 59: Drawer events / threads / skip controls

**Files:**

- Modify: `chat/web/drawer.py` (extend `GET /chats/{chat_id}/drawer`; add `POST /chats/{chat_id}/drawer/event/plan`, `/drawer/event/cancel/{event_id}`, `/drawer/skip/elision`, `/drawer/skip/jump`, `/drawer/thread/close/{thread_id}`)
- Modify: `chat/templates/_drawer.html` (3 new sections: Events, Threads, Skip controls)
- Create: `tests/test_drawer_events_threads_skip.py`

**Spec:**

**GET extension:**

- `list_active_events(conn, chat_id)` → render in a new "Events" section.
- `list_open_threads(conn, chat_id)` → render in a new "Threads" section.
- A "Skip" subsection with two buttons: "Elision skip" (opens an inline form taking a `landing_state_hint`) and "Jump skip" (opens an inline form taking `target_time` ISO + optional `notable_prose` for the synthesized-memories prompt).

**POST routes:**

1. `POST /drawer/event/plan`: form `{kind, planned_for, props_json}` → validates `props_json` as JSON (400 on parse failure), appends `event_planned`, returns refreshed drawer.
2. `POST /drawer/event/cancel/{event_id}`: appends `event_cancelled`, returns refreshed drawer.
3. `POST /drawer/skip/elision`: form `{landing_state_hint, new_time}` → calls `narrate_skip` (T53), appends `time_skip_elision` + an `assistant_turn` carrying the narration, returns refreshed drawer + chat partial.
4. `POST /drawer/skip/jump`: form `{new_time, notable_prose, reset_activity}` → calls `narrate_skip` for transition prose, calls `synthesize_memories` (T54) for each present bot, appends `time_skip_jump` + memories + transition turn, returns refreshed drawer + chat partial.
5. `POST /drawer/thread/close/{thread_id}`: appends `thread_closed`, returns refreshed drawer.

**Template additions:**

- "Events" section listing each active event by kind + planned_for + props.
- "Threads" section listing each open thread title + summary + a Close button.
- "Skip" controls under existing Activity section.
- Forms use HTMX (`hx-post`, `hx-target="#drawer"`, `hx-swap="innerHTML"`) consistent with Phase 2 drawer patterns.

**Tests (`tests/test_drawer_events_threads_skip.py`):** 6 minimum.

1. GET drawer with no events/threads → no Events/Threads sections rendered.
2. POST event/plan with valid form → `event_planned` event appended; drawer body now contains the event kind.
3. POST event/cancel → `event_cancelled` appended; drawer no longer lists the event under "Events".
4. POST skip/elision → `time_skip_elision` appended, chat clock advanced, narration `assistant_turn` present in chat history.
5. POST skip/jump with `notable_prose` → `time_skip_jump` + N synthesized `memory_written` events; assert reliability=0.7 on those rows.
6. POST thread/close → `thread_closed` appended; thread no longer in open list.

**Commit:** `feat: drawer events / threads / skip controls (T59)`.

**Notes for implementer:**

- The existing `available_guests` dropdown helper from T42 is the reference for form-population patterns.
- For the Jump skip's `notable_prose` field, treat empty as "no synthesized memories" (just advance the clock); the spec allows this.
- Validate `target_time` ISO format; 400 on parse failure. Do not allow target_time earlier than current chat clock.

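
The `target_time` check in the last note can be sketched with the standard library. `SkipValidationError` and `parse_target_time` are hypothetical names; the route would translate the exception into a 400. Rejecting a mix of naive and timezone-aware timestamps is an extra guard, since Python refuses to compare the two.

```python
from datetime import datetime


class SkipValidationError(ValueError):
    """Raised for input the route should answer with HTTP 400."""


def parse_target_time(target_time, current_clock):
    """Parse the form value and ensure it is not before the chat clock."""
    now = datetime.fromisoformat(current_clock)  # stored value, trusted
    try:
        target = datetime.fromisoformat(target_time)
    except (TypeError, ValueError) as exc:
        raise SkipValidationError(f"not an ISO timestamp: {target_time!r}") from exc
    if (target.tzinfo is None) != (now.tzinfo is None):
        raise SkipValidationError("naive and timezone-aware timestamps mixed")
    if target < now:
        raise SkipValidationError("target_time is earlier than the chat clock")
    return target


parsed = parse_target_time("2024-05-02T08:00:00", "2024-05-01T22:00:00")
```
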
---

## Wave 5a: Prompt + turn-flow integration (parallel)

T60 modifies `chat/services/prompt.py`. T61 modifies `chat/web/turns.py`. File-disjoint.

### Task 60: Prompt assembly includes active events + active threads

**Files:**

- Modify: `chat/services/prompt.py` (extend `assemble_narrative_prompt`)
- Modify: `tests/test_prompt.py` (add 3 tests)

**Spec:** Two new SHOULD-tier blocks added between the existing scene-context block and retrieved-memories block:

1. **Active events**: title `Active events:`. Lists each active event in this chat: `- {kind} (planned for {planned_for})` plus a one-line props excerpt (truncate to ~80 chars). Trim-tier SHOULD; drops before retrieved memories under tight budget.
2. **Active threads**: title `Open threads:`. Lists each open thread: `- {title}: {summary}` (summary truncated to ~120 chars). SHOULD-tier.

Both blocks are omitted entirely when their lists are empty (no header rendered).

Per Phase 2 T43's auto-detection precedent, the function reads `list_active_events(conn, chat_id)` and `list_open_threads(conn, chat_id)` itself; no new parameters.

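
The two blocks can be sketched as small renderers that return an empty string when there is nothing to show. The function names, the `props` key, and the way the props excerpt is attached after a colon are assumptions; the headers, line formats, and truncation limits come from the spec above.

```python
import json


def _truncate(text, limit):
    text = " ".join(text.split())  # collapse to a single line
    return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."


def render_active_events(events):
    if not events:
        return ""  # omit the block entirely, header included
    lines = ["Active events:"]
    for event in events:
        line = f"- {event['kind']} (planned for {event['planned_for']})"
        if event.get("props"):
            # ASSUMPTION: the excerpt is appended after a colon.
            line += ": " + _truncate(json.dumps(event["props"], sort_keys=True), 80)
        lines.append(line)
    return "\n".join(lines)


def render_open_threads(threads):
    if not threads:
        return ""
    lines = ["Open threads:"]
    lines += [f"- {t['title']}: {_truncate(t['summary'], 120)}" for t in threads]
    return "\n".join(lines)


events_block = render_active_events(
    [{"kind": "picnic", "planned_for": "2024-05-02T12:00", "props": {"place": "park"}}])
threads_block = render_open_threads(
    [{"title": "The missing key", "summary": "x" * 200}])
```
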
**Tests:** 3 added.

1. `test_assemble_with_no_events_or_threads_omits_blocks`: regression; no events/threads → assembled prompt has neither block.
2. `test_assemble_with_active_events_renders_block`: seed one event_planned + event_started; assert "Active events:" header and event kind appear in prompt.
3. `test_assemble_with_open_thread_renders_block`: seed one thread_opened; assert "Open threads:" header and thread title appear.

**Commit:** `feat: prompt assembly renders active events + open threads (T60)`.

---

### Task 61: Turn flow invokes event-detection + thread-update per turn

**Files:**

- Modify: `chat/web/turns.py` (after the primary narrative + memory + state-update block, call `detect_event_transitions` from T52; emit `event_started`/`event_completed`/`event_cancelled` events accordingly)
- Modify: `chat/services/regenerate.py` (mirror: regenerate also re-detects event transitions for the regenerated turn)
- Modify: `tests/test_turn_flow.py` (add 3 tests)

**Spec:** After the existing post-turn classifier passes (memory write, state update, interjection check) and BEFORE scene-close detection, call `detect_event_transitions` with `narrative_text=primary_text` and `active_events=list_active_events(conn, chat_id)`.

For each `EventTransition` returned:

- `new_status="active"` → append `event_started` payload `{event_id, started_at: chat.time}`.
- `new_status="completed"` → append `event_completed` payload `{event_id, completed_at: chat.time}` AND THEN call `promote_completed_event` (T56) inline so promotion events emit synchronously after completion.
- `new_status="cancelled"` → append `event_cancelled`. Promotion is skipped.

Empty transitions list = no-op (most turns; no extra events written).

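
The dispatch above can be sketched with the event log and the promotion call injected as callables, which also makes the ordering (completion first, promotion second) easy to test. `apply_transitions` is a hypothetical helper; `append` and `promote` stand in for `append_and_apply` and `promote_completed_event`, whose real signatures take more arguments. Skipping unknown statuses is an assumption.

```python
EVENT_KIND_BY_STATUS = {
    "active": "event_started",
    "completed": "event_completed",
    "cancelled": "event_cancelled",
}


def apply_transitions(transitions, *, chat_time, append, promote):
    """Emit one lifecycle event per transition; promote only completions."""
    for transition in transitions:
        kind = EVENT_KIND_BY_STATUS.get(transition["new_status"])
        if kind is None:
            continue  # unknown status from the classifier: ignore it
        payload = {"event_id": transition["event_id"]}
        if kind == "event_started":
            payload["started_at"] = chat_time
        elif kind == "event_completed":
            payload["completed_at"] = chat_time
        append(kind, payload)
        if kind == "event_completed":
            promote(transition["event_id"])  # runs after the completion lands


log, promoted = [], []
apply_transitions(
    [{"event_id": "e1", "new_status": "active"},
     {"event_id": "e2", "new_status": "completed"},
     {"event_id": "e3", "new_status": "cancelled"},
     {"event_id": "e4", "new_status": "paused"}],
    chat_time="2024-05-01T18:00",
    append=lambda kind, payload: log.append((kind, payload)),
    promote=promoted.append,
)
```
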
`regenerate.py` mirrors the same logic for the regenerated turn. Existing event transitions from the superseded turn are NOT undone; that's a Phase 3.5 follow-up, so document the limitation.

**Tests:** 3 added to `tests/test_turn_flow.py`.

1. `test_turn_with_event_transition_appends_started_event`: mock `detect_event_transitions` to return one transition; assert `event_started` lands in event log; canned-response queue matches.
2. `test_turn_with_event_completion_runs_promotion`: same mock returning `new_status="completed"`; seed a planned event with knowledge_facts in props; assert `event_completed` + `edge_update` (from promotion) both land.
3. `test_turn_with_no_active_events_skips_classifier`: no active events; assert `detect_event_transitions` is never called (its canned response slot would still be in the queue at end of test).

**Commit:** `feat: per-turn event-lifecycle detection + completion promotion (T61)`.

---

## Wave 5b: Skip command flow (single task)

Single task because it modifies `chat/web/turns.py` (which Wave 5a also touched). Run after Wave 5a is merged so the file's recent additions are stable.

### Task 62: Skip command surface

**Files:**

- Modify: `chat/web/turns.py` (extend `parse_turn` to detect natural-language skip commands like "skip to the park", "next morning", "a week later" and route to a skip-handling branch BEFORE the normal narrative flow)
- Create: `chat/web/skip.py` (new module hosting `process_elision_skip(...)` and `process_jump_skip(...)` controllers; called by both turns.py and the drawer skip routes from T59)
- Modify: `tests/test_turn_flow.py` (add 3 tests)

**Spec:** Currently `parse_turn` extracts the user's prose into structured fields (addressee inferred, etc.). Phase 3 adds detection of skip commands as a separate intent.

The classifier-based parse already produces an `intent` field (or similar; verify in code). Extend the schema with `intent="skip_elision"` and `intent="skip_jump"`. When intent is one of these, the turn flow short-circuits the normal narrative path and routes to:

- `process_elision_skip(conn, client, settings, *, chat_id, landing_state_hint=parsed.landing_state)`: calls `narrate_skip(skip_kind="elision")`, appends `time_skip_elision`, `assistant_turn` carrying narration, returns 204.
- `process_jump_skip(conn, client, settings, *, chat_id, target_time=parsed.target_time, notable_prose=parsed.notable_prose)`: appends `time_skip_jump`, calls `synthesize_memories` per present bot, appends synthesized `memory_written` events, calls `narrate_skip(skip_kind="jump")`, appends `assistant_turn` carrying transition prose, returns 204.

The drawer routes from T59 share these functions (don't duplicate the logic across drawer.py and turns.py).

For Phase 3's first cut, JUMP skip's `notable_prose` is NOT collected from natural-language ("a week later, anything notable?" requires a UI prompt). Two options:

- **(simpler)** Drawer-only entry for jump skip; natural-language jump short-circuits to drawer prompt.
- **(better UX)** Natural-language jump returns a 422 with an HTMX-swap that injects the "anything notable?" textarea into the chat surface; user submits prose to a follow-up `/chats/{chat_id}/skip/jump/confirm` endpoint.

Pick the simpler path for Phase 3 (drawer-only jump). Document the second option as a Phase 3.5 polish.

**Tests:** 3 added.

1. `test_elision_skip_via_natural_language`: user prose "skip to when we arrive at the park"; assert `time_skip_elision` event landed and chat clock advanced; an `assistant_turn` carrying transition prose was appended.
2. `test_jump_skip_via_natural_language_redirects_to_drawer`: user prose "next morning"; assert response is 422 with an HTMX swap pointing at the drawer's jump form (or whatever the chosen Phase 3 fallback is).
3. `test_skip_command_does_not_run_narrative_classifier`: same user prose as test 1; assert `assemble_narrative_prompt` was NOT called for a regular bot turn (the skip path bypasses it).

**Commit:** `feat: natural-language skip detection + skip command flow (T62)`.

---

## Wave 6: Meanwhile scenes

Phase 3's capstone feature. Most ambitious: scene config 4 (host + guest, no "you"). Per §13 the cap stays at 2 bots in any scene; meanwhile is two-bot bot↔bot. "You" receives a digest later, not during.

Decomposed into 3 tasks. T63 lands first (schema + state); then T64 + T65 in parallel.

### Task 63: Meanwhile scene config: schema + state

**Files:**

- Create: `chat/db/migrations/0011_meanwhile_scenes.sql`
- Create: `chat/state/meanwhile.py`
- Create: `tests/test_meanwhile_state.py`

**Spec:** A meanwhile scene is a special kind of scene where `present_set = {host_bot_id, guest_bot_id}` (no "you"). The existing `scenes` table can carry it via a new `present_set_kind` column distinguishing `you_host`, `you_host_guest`, `host_guest`. Alternatively, `meanwhile_scenes` could be a sidecar table; pick the lower-disruption option.

**Recommended:** add a `present_set_kind` column to `scenes` (default `'you_host'` for back-compat) via migration `0011_meanwhile_scenes.sql`:

```sql
ALTER TABLE scenes ADD COLUMN present_set_kind TEXT NOT NULL DEFAULT 'you_host';
ALTER TABLE scenes ADD COLUMN parent_scene_id INTEGER;  -- the active you-scene this meanwhile branched off from
CREATE INDEX scenes_present_set_idx ON scenes(chat_id, present_set_kind, status);
```

New event kinds with `chat/state/meanwhile.py` handlers:

- `@on("meanwhile_scene_started")` payload: `{chat_id, scene_id, host_bot_id, guest_bot_id, parent_scene_id, started_at}`. Inserts a new scene row with `present_set_kind="host_guest"`, links to parent.
- `@on("meanwhile_scene_closed")` payload: `{scene_id, closed_at}`. Updates status to `closed`; subsequent per-POV summary writes for both bots happen via existing scene-close path (host + guest are the "present witnesses"; "you" is excluded).

Readers: `list_meanwhile_scenes(conn, chat_id, status='active')`, `get_parent_scene(conn, scene_id)`.

**Tests:** 3 minimum.

1. `test_meanwhile_started_creates_scene_with_correct_present_set_kind`.
2. `test_meanwhile_closed_marks_scene_closed`.
3. `test_active_you_scene_can_coexist_with_active_meanwhile_scene` (one chat, two active scenes: meanwhile + the main you-scene that spawned it).

**Commit:** `feat: meanwhile scene schema + state (T63)`.

---

### Task 64: Meanwhile turn flow

**Files:**

- Modify: `chat/web/turns.py` (add meanwhile-mode detection at the start of `post_turn`; if active meanwhile scene exists for this chat, route to `process_meanwhile_turn`)
- Create: `chat/web/meanwhile.py` (new module hosting `process_meanwhile_turn(...)` controller; mirrors post_turn but with no "you" in present_set)
- Modify: `chat/services/prompt.py` (small addition: when `present_set_kind="host_guest"`, exclude "you" from edges + activity blocks; addressee is always the other bot)
- Create: `tests/test_meanwhile_turn_flow.py`

**Spec:** A meanwhile scene runs entirely between two bots. The user can advance it manually via a meanwhile-mode chat surface (T65 wires the UI), but turn-flow logic is:

1. Read active meanwhile scene; identify `speaker_bot_id` (alternates each turn: host first, then guest, and so on) and `addressee_bot_id` (the other one).
2. Assemble narrative prompt with `speaker_bot_id`, `addressee=addressee_bot.name`, `present_set_kind="host_guest"` (so "you" is omitted from edges/activities).
3. Stream narrative; commit `assistant_turn` event with `present_set_kind="host_guest"` and `meanwhile_scene_id` populated.
4. Memory writes: BOTH host and guest get a `memory_written` with witness `[0, 1, 1]` (you=0, because "you" wasn't present). Use `record_turn_memory_for_present` adapted to the no-you case (or extend it with a `you_present: bool = True` parameter).
5. State updates: 2 directed pairs (host→guest and guest→host only). Skip you-related pairs.
6. Scene close detection: same path as regular scenes; on close, per-POV summaries fire for both bots; group_node updates if applicable.

Addressee alternation is simple: each turn alternates speaker. (Phase 3.5 may add classifier-driven turn-taking with refusals.)

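
Speaker alternation and the reduced fan-out can be sketched as two pure helpers. `next_speaker` and `directed_pairs` are hypothetical names, and deriving the speaker from a count of turns already committed in the meanwhile scene is one assumed way to implement "alternates each turn".

```python
MEANWHILE_WITNESS = [0, 1, 1]  # [you, host, guest]: "you" is absent


def next_speaker(host_bot_id, guest_bot_id, prior_turns):
    """Return (speaker, addressee); host speaks on even turn counts."""
    if prior_turns % 2 == 0:
        return host_bot_id, guest_bot_id
    return guest_bot_id, host_bot_id


def directed_pairs(host_bot_id, guest_bot_id):
    # Only the two bot-to-bot edges update; every you-related pair is skipped.
    return [(host_bot_id, guest_bot_id), (guest_bot_id, host_bot_id)]


order = [next_speaker("host", "guest", n)[0] for n in range(4)]
pairs = directed_pairs("host", "guest")
```
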
**Tests:** 4 minimum.

1. `test_meanwhile_turn_writes_memories_with_witness_0_1_1`.
2. `test_meanwhile_turn_emits_2_edge_updates_only` (host→guest, guest→host).
3. `test_meanwhile_turn_alternates_speaker` (turn 1: host speaks; turn 2: guest speaks).
4. `test_meanwhile_scene_close_writes_per_pov_for_both_bots_only` (no "you" memory; existing T45 path is hit but with `you_present=False`).

**Commit:** `feat: meanwhile turn flow (host+guest, no you) (T64)`.

---

### Task 65: Meanwhile summary digest

**Files:**

- Modify: `chat/services/scene_summarize.py` (when a meanwhile scene closes, ALSO generate a "you-facing digest", a brief narrated summary that will surface to "you" the next time the main you-scene resumes)
- Modify: `chat/services/prompt.py` (when assembling for a regular you-scene and any closed-but-not-yet-surfaced meanwhile digests exist, include them as a SHOULD-tier block titled "Meanwhile while you were away:")
- Create: `chat/state/meanwhile_digest.py` (a small state module: `meanwhile_digest_pending` table; handlers for `meanwhile_digest_created` / `meanwhile_digest_consumed`)
- Modify: `tests/test_per_pov_summary.py` and `tests/test_prompt.py` (add tests)

**Spec:** When a meanwhile scene closes (T64's path), also append `meanwhile_digest_created` with `{chat_id, scene_id, summary}`. The summary is generated via a fresh `summarize_scene` call with `bot_persona="omniscient narrator briefing the absent player"`; output is a 2-3 sentence neutral summary of what happened.

When the next you-scene starts (or the prompt is assembled for the next active you-scene's turn), `assemble_narrative_prompt` queries `list_pending_meanwhile_digests(conn, chat_id)` and:

- Includes them as a SHOULD-tier block: `"Meanwhile while you were away:\n- {summary}\n- {summary}"`.
- After they're surfaced once, the caller (T64 in the post-meanwhile turn or the first you-turn after meanwhile-close) appends `meanwhile_digest_consumed` per digest, marking them as surfaced.

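
The surface-once behavior can be sketched with a renderer for the block and a helper that produces the follow-up consumption events. This is a sketch: the helper names and the `meanwhile_digest_consumed` payload shape are hypothetical (the plan does not define that payload), and the header string is the one used in the spec.

```python
DIGEST_HEADER = "Meanwhile while you were away:"


def render_meanwhile_block(pending):
    """pending: list of {"chat_id", "scene_id", "summary"} digest rows."""
    if not pending:
        return ""  # nothing pending: no header rendered
    return "\n".join([DIGEST_HEADER] + [f"- {d['summary']}" for d in pending])


def consumption_events(pending):
    # ASSUMPTION: digests are identified by (chat_id, scene_id).
    return [("meanwhile_digest_consumed",
             {"chat_id": d["chat_id"], "scene_id": d["scene_id"]})
            for d in pending]


pending = [
    {"chat_id": "c1", "scene_id": 12, "summary": "Ana and Ben argued about the map."},
    {"chat_id": "c1", "scene_id": 13, "summary": "They agreed to leave at dawn."},
]
block = render_meanwhile_block(pending)
consumed = consumption_events(pending)
consumed_scene_ids = {payload["scene_id"] for _, payload in consumed}
# After the consumption events apply, nothing is left to render.
still_pending = [d for d in pending if d["scene_id"] not in consumed_scene_ids]
```
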
The `meanwhile_digest_pending` table can go either into T63's migration `0011_meanwhile_scenes.sql` or into a thin `0012_meanwhile_digest.sql` added by T65. Pick the lower-disruption option (likely T63's migration, for simplicity) and document the choice. The schema-version assertion bump in `tests/test_world.py` happens once, after Wave 6 merges.

**Tests:** 3 added.
|
||||
|
||||
1. `test_meanwhile_close_creates_digest`: close a meanwhile scene; assert `meanwhile_digest_pending` row exists with non-empty summary.
|
||||
2. `test_pending_digest_renders_in_you_scene_prompt`: seed a pending digest; assemble prompt for a you-host scene; assert the "Meanwhile while you were away:" header and summary appear.
|
||||
3. `test_consumed_digest_does_not_render_again`: append `meanwhile_digest_consumed`; reassemble prompt; digest no longer appears.
|
||||
|
||||
**Commit:** `feat: meanwhile summary digest surfaces to next you-scene (T65)`.
|
||||
|
||||
---
|
||||
|
||||
## Wave 7 — Polish (parallel)
|
||||
|
||||
Two independent tasks. New test file (T66) + docs only (T67). Dispatch in parallel after Wave 6 merges.
|
||||
|
||||
### Task 66: Cross-feature integration tests
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `tests/test_phase3_integration.py`
|
||||
|
||||
**Spec:** Phase 3 introduces a lot of cross-feature interaction surfaces. This task adds tests that exercise multi-feature flows end-to-end:
|
||||
|
||||
1. Plan an event → play turns → event_started detected → event_completed detected → promotion fires → memory + edge updates land.
|
||||
2. Open a thread on close → next scene's prompt includes the open thread → close thread via drawer → next scene's prompt no longer includes it.
|
||||
3. Jump skip → synthesized memories land per present bot → next turn's prompt retrieves them via search.
|
||||
4. Meanwhile scene → close → digest pending → first you-turn prompt includes digest → after that turn, digest is consumed.
|
||||
5. Run a meanwhile scene while a regular you-scene is active → both scenes have memories; querying memories for either bot in the post-meanwhile main scene returns both sets, witness-filtered.
|
||||
|
||||
5 tests minimum.
|
||||
|
||||
**Commit:** `test: phase 3 cross-feature integration coverage (T66)`.
|
||||
|
||||
---
|
||||
|
||||
### Task 67: Phase 3 documentation update
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `CLAUDE.md` (add "Phase 3 status" section; update "Behavioral defaults"; add "Phase 3.5 / 4 backlog" with carry-overs from review feedback during execution)
|
||||
- Modify: `docs/plans/2026-04-26-v1-requirements-design.md` (annotate §13 "Phase 3 — events, skips, threads" as **Status: shipped <date>**)
|
||||
|
||||
**Spec:** Documentation-only. Run last so it captures any deviations and review-noted follow-ups discovered during execution. Reflect:
|
||||
|
||||
- Events with full lifecycle (planned → active → completed/cancelled/expired).
|
||||
- Time skips: elision (immediate end-state) + jump (synthesized memories from "anything notable?").
|
||||
- Threads opened/updated/closed; surfaced in prompt assembly + drawer.
|
||||
- Significance retrieval bias + key-quote retention at significance ≥ 2.
|
||||
- Meanwhile scenes: bot+bot without "you"; per-POV summaries for both bots; you-facing digest on next you-scene.
|
||||
- Phase 3 known limitations / 3.5 backlog candidates:
|
||||
  - Natural-language jump skip falls back to drawer form (no inline "anything notable?" prompt).
|
||||
  - Regenerate doesn't undo prior event transitions from the superseded turn.
|
||||
  - Meanwhile turn-taking is alternation (no classifier-driven refusals or initiative).
|
||||
  - Vector retrieval is still Phase 4.
|
||||
|
||||
**Commit:** `docs: phase 3 status, behavioral defaults, deferred items (T67)`.
|
||||
|
||||
---
|
||||
|
||||
## Wrap-up
|
||||
|
||||
After Wave 7 lands:
|
||||
|
||||
1. **Run full suite** on `phase-3`: should be ~260+ tests passing (212 from Phase 2 + ~50 new).
|
||||
2. **Manual smoke** (recommended before opening the PR):
|
||||
   - Plan an event from the drawer; play turns until it completes; verify promotion landed (drawer shows updated edges / memories).
|
||||
   - Use elision and jump skips both via natural language and the drawer.
|
||||
   - Close a scene that opened a thread; verify the thread renders in the next scene's prompt.
|
||||
   - Trigger a meanwhile scene from the drawer; play 2 turns; close it; resume the main you-scene; verify the digest renders once and not again.
|
||||
3. **Push `phase-3`** to gitea.
|
||||
4. **Open PR** `phase-3 → main`.
|
||||
5. **Phase 3.5 backlog candidates** (track in CLAUDE.md): inline natural-language jump prompt UI, regenerate-aware event-transition undo, classifier-driven meanwhile turn-taking, drawer surface for closed-event browsing, event template library (kind presets with default props).
|
||||
|
||||
---
|
||||
|
||||
## Notes for the controller running this plan
|
||||
|
||||
- **Don't dispatch Wave 5b until Wave 5a is merged AND green on `phase-3`.** Wave 5b's `turns.py` modifications layer on top of T61's recent additions; missing that produces merge conflicts or import-time failures.
|
||||
- **Don't dispatch T64+T65 until T63 merges.** Both depend on the new `present_set_kind` column and the meanwhile event kinds.
|
||||
- **After each parallel wave**, run a code-review subagent (`subagent-driven-development` skill's two-stage review pattern) on each task before merging to `phase-3`. For purely mechanical tasks (schema migrations, projector handlers), a combined spec+quality review is acceptable. For T62, T64, T65 (large or integration tasks), use separate spec + quality reviewers.
|
||||
- **If a parallel wave's merge produces a conflict**, the wave's file-disjointness assumption was violated. Bisect the affected pair, fix the offending task in a follow-up commit on `phase-3`, and proceed.
|
||||
- **Schema-version test bumps** happen at Wave 1 merge (8 → 10) and Wave 6 merge (10 → 11 or 12 depending on T65's migration choice). Update `tests/test_world.py` once per affected merge — same pattern as Phase 2 T36.
|
||||
- **Token-spend rough estimate**: Phase 3 should be larger than Phase 2 (~1.5×) — events / skips / meanwhile each carry their own state + service + UI surfaces. Per-task token spend similar to Phase 2's larger tasks (T42, T44).
|
||||
- **DO NOT modify Phase 1 / 2 code paths** unless explicitly required (e.g., T58 modifies `scene_summarize.py` because the new behavior is genuinely additive). Existing 1- and 2-entity flows must continue to work end-to-end after each wave.
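The dispatch gating described in these notes can be checked mechanically. The following is a minimal sketch, assuming a task list whose entries carry `id` and an optional `blockedBy` list; the helper name `ready_tasks` is hypothetical. It models only explicit `blockedBy` dependencies; wave ordering (for example, Wave 6 waiting on Wave 5b) and the "merged AND green" condition are not modelled and still need the controller's judgment.

```python
def ready_tasks(tasks: list[dict], merged: set[int]) -> list[int]:
    """Return ids of unmerged tasks whose blockers have all merged."""
    return sorted(
        task["id"]
        for task in tasks
        if task["id"] not in merged
        and set(task.get("blockedBy", [])) <= merged
    )


# Example shaped like the Wave 6 entries: T64 and T65 both wait on T63.
WAVE_6 = [
    {"id": 63},
    {"id": 64, "blockedBy": [63]},
    {"id": 65, "blockedBy": [63]},
]
```

Calling `ready_tasks(WAVE_6, set())` yields only T63; once 63 is in the merged set, T64 and T65 become dispatchable together.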
|
||||
@@ -0,0 +1,26 @@
|
||||
{
|
||||
"planPath": "docs/plans/2026-04-26-v3-phase3-implementation.md",
|
||||
"tasks": [
|
||||
{"id": 49, "subject": "T49: events table + lifecycle handlers", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||
{"id": 50, "subject": "T50: time_skip event kinds + chat-clock handlers", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||
{"id": 51, "subject": "T51: threads table + open/update/close handlers", "status": "pending", "wave": 1, "parallelGroup": "wave-1"},
|
||||
{"id": 52, "subject": "T52: event-lifecycle detection service", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [49]},
|
||||
{"id": 53, "subject": "T53: skip narration service (elision + jump)", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [50]},
|
||||
{"id": 54, "subject": "T54: synthesized-memories service for jump skips", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [50]},
|
||||
{"id": 55, "subject": "T55: thread-detection service", "status": "pending", "wave": 2, "parallelGroup": "wave-2", "blockedBy": [51]},
|
||||
{"id": 56, "subject": "T56: event-completion promotion service", "status": "pending", "wave": 3, "parallelGroup": "wave-3", "blockedBy": [49, 52]},
|
||||
{"id": 57, "subject": "T57: significance-aware retrieval ranking", "status": "pending", "wave": 3, "parallelGroup": "wave-3"},
|
||||
{"id": 58, "subject": "T58: scene compression keeps key quotes + emits thread events", "status": "pending", "wave": 3, "parallelGroup": "wave-3", "blockedBy": [55]},
|
||||
{"id": 59, "subject": "T59: drawer events / threads / skip controls", "status": "pending", "wave": 4, "parallelGroup": null, "blockedBy": [49, 50, 51, 53, 54]},
|
||||
{"id": 60, "subject": "T60: prompt assembly includes active events + open threads", "status": "pending", "wave": 5, "parallelGroup": "wave-5a", "blockedBy": [49, 51]},
|
||||
{"id": 61, "subject": "T61: turn flow invokes event-detection + completion promotion", "status": "pending", "wave": 5, "parallelGroup": "wave-5a", "blockedBy": [52, 56]},
|
||||
{"id": 62, "subject": "T62: skip command surface (parse + route + jump UI)", "status": "pending", "wave": 5, "parallelGroup": null, "blockedBy": [50, 53, 54, 60, 61]},
|
||||
{"id": 63, "subject": "T63: meanwhile scene config — schema + state", "status": "pending", "wave": 6, "parallelGroup": null},
|
||||
{"id": 64, "subject": "T64: meanwhile turn flow (host+guest, no you)", "status": "pending", "wave": 6, "parallelGroup": "wave-6b", "blockedBy": [63]},
|
||||
{"id": 65, "subject": "T65: meanwhile summary digest surfaces to next you-scene", "status": "pending", "wave": 6, "parallelGroup": "wave-6b", "blockedBy": [63]},
|
||||
{"id": 66, "subject": "T66: cross-feature integration tests", "status": "pending", "wave": 7, "parallelGroup": "wave-7", "blockedBy": [62, 64, 65]},
|
||||
{"id": 67, "subject": "T67: Phase 3 documentation update", "status": "pending", "wave": 7, "parallelGroup": "wave-7", "blockedBy": [62, 64, 65]}
|
||||
],
|
||||
"lastUpdated": "2026-04-26T00:00:00Z",
|
||||
"notes": "19 tasks across 8 waves (1, 2, 3, 4, 5a, 5b, 6a, 6b, 7). Waves 1, 2, 3, 5a, and 7 are fully parallel-safe (file-disjoint within each). Waves 4, 5b, and 6a are single-task. Wave 6b is parallel after 6a (T63) merges. Use Agent tool with isolation: 'worktree' to dispatch parallel tasks. Merge each wave's worktrees back into phase-3 before dispatching the next wave. See plan §Parallel-Execution Strategy for full guidance. Schema baseline: Phase 2 ends at version 8; Phase 3 adds 0009_events.sql, 0010_threads.sql, 0011_meanwhile_scenes.sql (final version 11)."
|
||||
}
|
||||
@@ -0,0 +1,322 @@
|
||||
"""T42: drawer guest add/remove + render.
|
||||
|
||||
The drawer grows a "Guest" section (when a guest bot is present in the
|
||||
chat), a "Group" section sourced from the ``group_node`` row, an
|
||||
"Add guest" form (visible while no guest is present), and a "Remove
|
||||
guest" button (visible while one is). The two new POST endpoints emit
|
||||
``guest_added`` / ``guest_removed`` events plus ancillary updates:
|
||||
|
||||
* ``POST /chats/{chat_id}/drawer/guest/add`` runs the relationship-seed
|
||||
classifier (T38) over the user-supplied prose and emits an
|
||||
``edge_update`` per direction when the seed comes back non-default.
|
||||
It also seeds a ``group_node_initialized`` row when none exists yet.
|
||||
* ``POST /chats/{chat_id}/drawer/guest/remove`` first emits
|
||||
``scene_closed`` for the active scene so the host -> you scene closes
|
||||
cleanly before the guest leaves.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
from chat.app import app
|
||||
from chat.db.connection import open_db
|
||||
from chat.eventlog.log import append_event
|
||||
from chat.eventlog.projector import project
|
||||
from chat.llm.mock import MockLLMClient
|
||||
|
||||
|
||||
@pytest.fixture
def client(tmp_path, monkeypatch):
    cfg = tmp_path / "config.toml"
    cfg.write_text('featherless_api_key = "test"\n')
    monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
    db = tmp_path / "test.db"
    monkeypatch.setenv("CHAT_DB_PATH", str(db))
    with TestClient(app) as c:
        if hasattr(app.state, "background_worker"):
            app.state.background_worker.enabled = False
        yield c
|
||||
|
||||
|
||||
def _bot_payload(bot_id: str, name: str) -> dict:
|
||||
return {
|
||||
"id": bot_id,
|
||||
"name": name,
|
||||
"persona": "...",
|
||||
"voice_samples": [],
|
||||
"traits": [],
|
||||
"backstory": "",
|
||||
"initial_relationship_to_you": "",
|
||||
"kickoff_prose": "",
|
||||
}
|
||||
|
||||
|
||||
def _seed_chat(db: Path, *, with_scene: bool = True) -> None:
|
||||
"""Seed a chat hosted by ``bot_a`` (with ``bot_b`` authored as a
|
||||
candidate guest) and, by default, an open scene so the
|
||||
``guest_removed`` flow has something to close.
|
||||
"""
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(
|
||||
conn,
|
||||
kind="you_authored",
|
||||
payload={"name": "Me", "pronouns": "they/them", "persona": ""},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="chat_created",
|
||||
payload={
|
||||
"id": "chat_bot_a",
|
||||
"host_bot_id": "bot_a",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1",
|
||||
"weather": "",
|
||||
},
|
||||
)
|
||||
if with_scene:
|
||||
append_event(
|
||||
conn,
|
||||
kind="scene_opened",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"container_id": None,
|
||||
"started_at": "2026-04-26T20:00:00+00:00",
|
||||
"participants": ["you", "bot_a"],
|
||||
},
|
||||
)
|
||||
project(conn)
|
||||
|
||||
|
||||
def _override_llm(canned: list[str]):
    """Wire a ``MockLLMClient`` into the drawer's LLM dependency."""
    from chat.web.kickoff import get_llm_client

    app.dependency_overrides[get_llm_client] = lambda: MockLLMClient(
        canned=list(canned)
    )
|
||||
|
||||
|
||||
def test_drawer_no_guest_omits_guest_section(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
response = client.get("/chats/chat_bot_a/drawer")
|
||||
assert response.status_code == 200
|
||||
body = response.text
|
||||
# No guest-section header; the "Add guest" form should be visible instead.
|
||||
assert "<h3>Guest</h3>" not in body
|
||||
assert "Add guest" in body
|
||||
|
||||
|
||||
def test_drawer_add_guest_seeds_edges_and_group_node(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
canned = json.dumps(
|
||||
{
|
||||
"a_to_b_summary": "old college friend",
|
||||
"a_to_b_knowledge_facts": ["studied physics together"],
|
||||
"a_to_b_affinity_delta": 4,
|
||||
"a_to_b_trust_delta": -1,
|
||||
"b_to_a_summary": "former roommate",
|
||||
"b_to_a_knowledge_facts": ["lived together junior year"],
|
||||
"b_to_a_affinity_delta": 3,
|
||||
"b_to_a_trust_delta": 0,
|
||||
}
|
||||
)
|
||||
_override_llm([canned])
|
||||
try:
|
||||
response = client.post(
|
||||
"/chats/chat_bot_a/drawer/guest/add",
|
||||
data={
|
||||
"guest_bot_id": "bot_b",
|
||||
"relationship_prose": (
|
||||
"Alice and Bob met in college and studied physics together."
|
||||
),
|
||||
},
|
||||
)
|
||||
assert response.status_code == 200
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
from chat.state.edges import get_edge
|
||||
from chat.state.group_node import get_group_node
|
||||
from chat.state.world import get_chat
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat["guest_bot_id"] == "bot_b"
|
||||
|
||||
edge_a_to_b = get_edge(conn, "bot_a", "bot_b")
|
||||
edge_b_to_a = get_edge(conn, "bot_b", "bot_a")
|
||||
# Seed deltas applied around the 50/50 default.
|
||||
assert edge_a_to_b["affinity"] == 54
|
||||
assert edge_a_to_b["trust"] == 49
|
||||
assert "studied physics together" in edge_a_to_b["knowledge"]
|
||||
assert edge_b_to_a["affinity"] == 53
|
||||
assert edge_b_to_a["trust"] == 50
|
||||
assert "lived together junior year" in edge_b_to_a["knowledge"]
|
||||
|
||||
group = get_group_node(conn, "chat_bot_a")
|
||||
assert group is not None
|
||||
assert set(group["members"]) == {"you", "bot_a", "bot_b"}
|
||||
|
||||
|
||||
def test_drawer_add_guest_empty_prose_skips_edge_update(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
# No canned responses: the seed function short-circuits on empty prose
|
||||
# so no LLM call should happen.
|
||||
_override_llm([])
|
||||
try:
|
||||
response = client.post(
|
||||
"/chats/chat_bot_a/drawer/guest/add",
|
||||
data={"guest_bot_id": "bot_b", "relationship_prose": " "},
|
||||
)
|
||||
assert response.status_code == 200
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
from chat.state.world import get_chat
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat["guest_bot_id"] == "bot_b"
|
||||
|
||||
# guest_added fires but no edge_update events between bot_a and bot_b.
|
||||
added = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'guest_added'"
|
||||
).fetchone()[0]
|
||||
assert added == 1
|
||||
|
||||
edge_updates = conn.execute(
|
||||
"SELECT payload_json FROM event_log WHERE kind = 'edge_update'"
|
||||
).fetchall()
|
||||
for (payload_json,) in edge_updates:
|
||||
payload = json.loads(payload_json)
|
||||
pair = {payload.get("source_id"), payload.get("target_id")}
|
||||
assert pair != {"bot_a", "bot_b"}, (
|
||||
"no edge_update should be emitted between host and guest "
|
||||
"when prose is empty"
|
||||
)
|
||||
|
||||
|
||||
def test_drawer_add_guest_when_already_present_returns_400(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
# Pre-attach a guest directly via append_and_apply so we don't replay
|
||||
# the prior chat_created (which would violate UNIQUE on chats.id).
|
||||
from chat.eventlog.log import append_and_apply
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="bot_authored",
|
||||
payload=_bot_payload("bot_c", "BotC"),
|
||||
)
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="guest_added",
|
||||
payload={"chat_id": "chat_bot_a", "guest_bot_id": "bot_b"},
|
||||
)
|
||||
|
||||
_override_llm([])
|
||||
try:
|
||||
response = client.post(
|
||||
"/chats/chat_bot_a/drawer/guest/add",
|
||||
data={"guest_bot_id": "bot_c", "relationship_prose": ""},
|
||||
)
|
||||
assert response.status_code == 400
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
|
||||
|
||||
def test_drawer_remove_guest_clears_and_closes_scene(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
from chat.eventlog.log import append_and_apply
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="guest_added",
|
||||
payload={"chat_id": "chat_bot_a", "guest_bot_id": "bot_b"},
|
||||
)
|
||||
|
||||
response = client.post("/chats/chat_bot_a/drawer/guest/remove")
|
||||
assert response.status_code == 200
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
from chat.state.world import active_scene, get_chat
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat["guest_bot_id"] is None
|
||||
assert active_scene(conn, "chat_bot_a") is None
|
||||
|
||||
kinds = [
|
||||
row[0]
|
||||
for row in conn.execute(
|
||||
"SELECT kind FROM event_log ORDER BY id"
|
||||
).fetchall()
|
||||
]
|
||||
# scene_closed must precede guest_removed in the log.
|
||||
assert "scene_closed" in kinds
|
||||
assert "guest_removed" in kinds
|
||||
assert kinds.index("scene_closed") < kinds.index("guest_removed")
|
||||
|
||||
|
||||
def test_drawer_with_guest_renders_guest_and_group_sections(client, tmp_path):
|
||||
_seed_chat(tmp_path / "test.db")
|
||||
from chat.eventlog.log import append_and_apply
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="guest_added",
|
||||
payload={"chat_id": "chat_bot_a", "guest_bot_id": "bot_b"},
|
||||
)
|
||||
# Activity for the guest so the section has content to render.
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="activity_change",
|
||||
payload={
|
||||
"entity_id": "bot_b",
|
||||
"posture": "leaning",
|
||||
"action": {"verb": "smirking"},
|
||||
"attention": "BotA",
|
||||
},
|
||||
)
|
||||
# Edges in all four directions involving the guest.
|
||||
for src, tgt in (("bot_a", "bot_b"), ("bot_b", "bot_a"), ("you", "bot_b"), ("bot_b", "you")):
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="edge_update",
|
||||
payload={
|
||||
"source_id": src,
|
||||
"target_id": tgt,
|
||||
"chat_id": "chat_bot_a",
|
||||
"affinity_delta": 1,
|
||||
},
|
||||
)
|
||||
append_and_apply(
|
||||
conn,
|
||||
kind="group_node_initialized",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"members": ["you", "bot_a", "bot_b"],
|
||||
"summary": "Three friends catching up over drinks.",
|
||||
"dynamic": "warm and conspiratorial",
|
||||
},
|
||||
)
|
||||
|
||||
response = client.get("/chats/chat_bot_a/drawer")
|
||||
assert response.status_code == 200
|
||||
body = response.text
|
||||
assert "<h3>Guest</h3>" in body
|
||||
assert "BotB" in body
|
||||
assert "smirking" in body
|
||||
assert "<h3>Group</h3>" in body
|
||||
assert "Three friends catching up over drinks." in body
|
||||
assert "warm and conspiratorial" in body
|
||||
# "Remove guest" button is visible when a guest is present.
|
||||
assert "Remove guest" in body
|
||||
@@ -0,0 +1,101 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from chat.db.connection import open_db
|
||||
from chat.db.migrate import apply_migrations
|
||||
from chat.eventlog.log import append_event
|
||||
from chat.eventlog.projector import project
|
||||
import chat.state.entities # registers handlers
|
||||
import chat.state.world # registers handlers
|
||||
import chat.state.group_node # registers handlers
|
||||
from chat.state.group_node import get_group_node
|
||||
|
||||
|
||||
def _bot_payload(bot_id: str, name: str) -> dict:
|
||||
return {
|
||||
"id": bot_id,
|
||||
"name": name,
|
||||
"persona": "thoughtful, observant",
|
||||
"voice_samples": [],
|
||||
"traits": [],
|
||||
"backstory": "",
|
||||
"initial_relationship_to_you": "coworker",
|
||||
"kickoff_prose": "",
|
||||
}
|
||||
|
||||
|
||||
def _chat_payload(chat_id: str = "chat_bot_a") -> dict:
|
||||
return {
|
||||
"id": chat_id,
|
||||
"host_bot_id": "bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1 evening",
|
||||
"weather": "clear",
|
||||
}
|
||||
|
||||
|
||||
def test_group_node_initialized_creates_row(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(conn, kind="chat_created", payload=_chat_payload())
|
||||
append_event(
|
||||
conn,
|
||||
kind="group_node_initialized",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"members": ["you", "bot_a", "bot_b"],
|
||||
},
|
||||
)
|
||||
project(conn)
|
||||
|
||||
gn = get_group_node(conn, "chat_bot_a")
|
||||
assert gn is not None
|
||||
assert gn["chat_id"] == "chat_bot_a"
|
||||
assert gn["members"] == ["you", "bot_a", "bot_b"]
|
||||
assert gn["summary"] == ""
|
||||
assert gn["dynamic"] == ""
|
||||
assert gn["threads"] == []
|
||||
|
||||
|
||||
def test_group_node_updated_changes_summary_and_dynamic(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(conn, kind="chat_created", payload=_chat_payload())
|
||||
append_event(
|
||||
conn,
|
||||
kind="group_node_initialized",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"members": ["you", "bot_a", "bot_b"],
|
||||
},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="group_node_updated",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"summary": "Three coworkers chatting about the project.",
|
||||
"dynamic": "Tense but cordial.",
|
||||
},
|
||||
)
|
||||
project(conn)
|
||||
|
||||
gn = get_group_node(conn, "chat_bot_a")
|
||||
assert gn is not None
|
||||
assert gn["summary"] == "Three coworkers chatting about the project."
|
||||
assert gn["dynamic"] == "Tense but cordial."
|
||||
# Members preserved across update
|
||||
assert gn["members"] == ["you", "bot_a", "bot_b"]
|
||||
|
||||
|
||||
def test_get_group_node_returns_none_for_missing_chat(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
assert get_group_node(conn, "chat_missing") is None
|
||||
@@ -0,0 +1,96 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from chat.db.connection import open_db
|
||||
from chat.db.migrate import apply_migrations
|
||||
from chat.eventlog.log import append_event
|
||||
from chat.eventlog.projector import project
|
||||
import chat.state.entities # registers bot_authored handler
|
||||
import chat.state.world # registers chat_created / guest_added / guest_removed
|
||||
from chat.state.world import get_chat
|
||||
|
||||
|
||||
def _bot_payload(bot_id: str, name: str) -> dict:
|
||||
return {
|
||||
"id": bot_id,
|
||||
"name": name,
|
||||
"persona": "...",
|
||||
"voice_samples": ["sample"],
|
||||
"traits": ["shy"],
|
||||
"backstory": "...",
|
||||
"initial_relationship_to_you": "coworker",
|
||||
"kickoff_prose": "you stay late",
|
||||
}
|
||||
|
||||
|
||||
def _chat_payload(**overrides) -> dict:
|
||||
payload = {
|
||||
"id": "chat_bot_a",
|
||||
"host_bot_id": "bot_a",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1 evening",
|
||||
"weather": "clear",
|
||||
}
|
||||
payload.update(overrides)
|
||||
return payload
|
||||
|
||||
|
||||
def test_guest_added_sets_guest_bot_id(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(conn, kind="chat_created", payload=_chat_payload())
|
||||
append_event(conn, kind="guest_added", payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
})
|
||||
project(conn)
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat is not None
|
||||
assert chat["guest_bot_id"] == "bot_b"
|
||||
|
||||
|
||||
def test_guest_removed_clears_guest_bot_id(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(conn, kind="chat_created", payload=_chat_payload())
|
||||
append_event(conn, kind="guest_added", payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
})
|
||||
append_event(conn, kind="guest_removed", payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
})
|
||||
project(conn)
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat is not None
|
||||
assert chat["guest_bot_id"] is None
|
||||
|
||||
|
||||
def test_guest_added_idempotent_overwrite(tmp_path):
|
||||
db = tmp_path / "t.db"
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_c", "BotC"))
|
||||
append_event(conn, kind="chat_created", payload=_chat_payload())
|
||||
append_event(conn, kind="guest_added", payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
})
|
||||
append_event(conn, kind="guest_added", payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"guest_bot_id": "bot_c",
|
||||
})
|
||||
project(conn)
|
||||
|
||||
chat = get_chat(conn, "chat_bot_a")
|
||||
assert chat is not None
|
||||
assert chat["guest_bot_id"] == "bot_c"
|
||||
@@ -0,0 +1,89 @@
|
||||
"""Tests for the interjection classifier service (T39).
|
||||
|
||||
Per Requirements §6.2, when a guest is present and the addressee bot has
|
||||
just spoken, the *non-addressee* bot may interject with a brief follow-on
|
||||
beat. The classifier wrapped here decides whether that interjection
|
||||
should fire. The default bias is strongly toward False — the addressee
|
||||
has the floor — so an interjection only fires when the silent witness's
|
||||
character would plausibly speak up.
|
||||
|
||||
These tests cover:
|
||||
|
||||
* The classifier returning ``should_interject=True`` is honored.
|
||||
* The classifier returning ``should_interject=False`` is honored.
|
||||
* Repeated invalid JSON exhausts the classifier retries and falls back
|
||||
to ``should_interject=False`` with ``reason="fallback"``.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
|
||||
import pytest
|
||||
|
||||
from chat.llm.mock import MockLLMClient
|
||||
from chat.services.interjection import (
|
||||
InterjectionDecision,
|
||||
detect_interjection,
|
||||
)
|
||||
|
||||
|
||||
def _kwargs() -> dict:
|
||||
"""Reasonable, non-empty kwargs for ``detect_interjection``."""
|
||||
return dict(
|
||||
classifier_model="x",
|
||||
addressee_name="Alice",
|
||||
addressee_just_said="I think we should leave now.",
|
||||
silent_witness_name="Bob",
|
||||
silent_witness_persona="Skeptical engineer, blunt, protective of the user.",
|
||||
silent_witness_edge_to_addressee={
|
||||
"affinity": 40,
|
||||
"trust": 30,
|
||||
"summary": "old rival; mild distrust",
|
||||
},
|
||||
silent_witness_edge_to_you={
|
||||
"affinity": 70,
|
||||
"trust": 80,
|
||||
"summary": "long-time confidant",
|
||||
},
|
||||
you_just_said="Where do you both think we should go?",
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_interjection_returns_true_when_classifier_decides_yes():
|
||||
canned = json.dumps({"should_interject": True, "reason": "jealousy"})
|
||||
mock = MockLLMClient(canned=[canned])
|
||||
result = await detect_interjection(mock, **_kwargs())
|
||||
assert isinstance(result, InterjectionDecision)
|
||||
assert result.should_interject is True
|
||||
assert result.reason == "jealousy"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_interjection_returns_false_when_classifier_decides_no():
|
||||
canned = json.dumps(
|
||||
{"should_interject": False, "reason": "addressee has the floor"}
|
||||
)
|
||||
mock = MockLLMClient(canned=[canned])
|
||||
result = await detect_interjection(mock, **_kwargs())
|
||||
assert isinstance(result, InterjectionDecision)
|
||||
assert result.should_interject is False
|
||||
assert result.reason == "addressee has the floor"
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_interjection_falls_back_to_false_on_classifier_failure():
|
||||
"""``classify`` retries 3 times; after all fail it returns the default.
|
||||
|
||||
The default carries ``should_interject=False`` and
|
||||
``reason="fallback"`` so callers can tell a real "no" from a
|
||||
classifier-degraded "no" if they ever care to.
|
||||
"""
|
||||
mock = MockLLMClient(
|
||||
canned=["this is not json", "still not json", "still not json"]
|
||||
)
|
||||
result = await detect_interjection(mock, **_kwargs())
|
||||
assert isinstance(result, InterjectionDecision)
|
||||
assert result.should_interject is False
|
||||
assert result.reason == "fallback"
|
||||
+150
-1
@@ -22,7 +22,7 @@ from chat.db.migrate import apply_migrations
|
||||
from chat.eventlog.log import append_event
|
||||
from chat.eventlog.projector import project
|
||||
from chat.llm.mock import MockLLMClient
|
||||
-from chat.services.memory_write import record_turn_memory
|
||||
+from chat.services.memory_write import record_turn_memory, record_turn_memory_for_present
|
||||
import chat.state.entities # noqa: F401 - register handlers
|
||||
import chat.state.memory # noqa: F401
|
||||
import chat.state.world # noqa: F401
|
||||
@@ -295,3 +295,152 @@ def test_post_turn_writes_memory_for_host_bot(client, tmp_path):
        assert w_guest == 0
        assert source == "direct"
        assert sig == 1


# ---------------------------------------------------------------------------
# T41: record_turn_memory_for_present — multi-witness helper.
# ---------------------------------------------------------------------------


def _seed_two_bots(db_path: Path) -> None:
    """Author host + guest bots and create a two-bot chat."""
    with open_db(db_path) as conn:
        for bot_id, name in (("bot_a", "BotA"), ("bot_b", "BotB")):
            append_event(
                conn,
                kind="bot_authored",
                payload={
                    "id": bot_id,
                    "name": name,
                    "persona": "...",
                    "voice_samples": [],
                    "traits": [],
                    "backstory": "",
                    "initial_relationship_to_you": "",
                    "kickoff_prose": "",
                },
            )
        append_event(
            conn,
            kind="chat_created",
            payload={
                "id": "chat_ab",
                "host_bot_id": "bot_a",
                "guest_bot_id": "bot_b",
                "initial_time": "2026-04-26T20:00:00+00:00",
                "narrative_anchor": "Day 1",
                "weather": "",
            },
        )
        project(conn)


def test_record_for_present_no_guest_writes_single_memory_with_witness_1_1_0(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    _seed_minimal(db)
    with open_db(db) as conn:
        result = record_turn_memory_for_present(
            conn,
            chat_id="chat_bot_a",
            host_bot_id="bot_a",
            guest_bot_id=None,
            narrative_text="BotA glances out the window.",
            scene_id=None,
            chat_clock_at="2026-04-26T20:00:00+00:00",
        )

        # Returned dict has only the host key.
        assert set(result.keys()) == {"bot_a"}
        eid_h, mid_h = result["bot_a"]
        assert eid_h > 0
        assert mid_h is not None and mid_h > 0

        rows = conn.execute(
            "SELECT owner_id, witness_you, witness_host, witness_guest "
            "FROM memories"
        ).fetchall()
        assert len(rows) == 1
        owner_id, w_you, w_host, w_guest = rows[0]
        assert owner_id == "bot_a"
        assert w_you == 1
        assert w_host == 1
        assert w_guest == 0

        # Exactly one memory_written event was appended.
        cur = conn.execute(
            "SELECT COUNT(*) FROM event_log WHERE kind = 'memory_written'"
        )
        assert cur.fetchone()[0] == 1


def test_record_for_present_with_guest_writes_two_memories_with_witness_1_1_1(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    _seed_two_bots(db)
    with open_db(db) as conn:
        result = record_turn_memory_for_present(
            conn,
            chat_id="chat_ab",
            host_bot_id="bot_a",
            guest_bot_id="bot_b",
            narrative_text="BotA and BotB share a glance.",
            scene_id=None,
            chat_clock_at="2026-04-26T20:00:00+00:00",
        )

        # Returned dict has both keys.
        assert set(result.keys()) == {"bot_a", "bot_b"}
        eid_h, mid_h = result["bot_a"]
        eid_g, mid_g = result["bot_b"]
        assert eid_h > 0 and eid_g > 0
        assert mid_h is not None and mid_h > 0
        assert mid_g is not None and mid_g > 0
        # Distinct event ids and memory ids.
        assert eid_h != eid_g
        assert mid_h != mid_g

        rows = conn.execute(
            "SELECT owner_id, witness_you, witness_host, witness_guest "
            "FROM memories ORDER BY owner_id"
        ).fetchall()
        assert len(rows) == 2
        owners = {r[0] for r in rows}
        assert owners == {"bot_a", "bot_b"}
        # All rows should have witness mask [1, 1, 1].
        for _owner, w_you, w_host, w_guest in rows:
            assert w_you == 1
            assert w_host == 1
            assert w_guest == 1

        # Two memory_written events were appended.
        cur = conn.execute(
            "SELECT COUNT(*) FROM event_log WHERE kind = 'memory_written'"
        )
        assert cur.fetchone()[0] == 2


def test_record_for_present_dict_keys_match(tmp_path):
    db = tmp_path / "t.db"
    apply_migrations(db)
    _seed_two_bots(db)
    with open_db(db) as conn:
        # No guest: keys == {host_bot_id}.
        result_no_guest = record_turn_memory_for_present(
            conn,
            chat_id="chat_ab",
            host_bot_id="bot_a",
            guest_bot_id=None,
            narrative_text="Just BotA's POV.",
        )
        assert set(result_no_guest.keys()) == {"bot_a"}

        # With guest: keys == {host_bot_id, guest_bot_id}.
        result_with_guest = record_turn_memory_for_present(
            conn,
            chat_id="chat_ab",
            host_bot_id="bot_a",
            guest_bot_id="bot_b",
            narrative_text="Both bots witness this.",
        )
        assert set(result_with_guest.keys()) == {"bot_a", "bot_b"}
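
Reviewer note: the T41 tests fix two invariants. Each bot present gets its own memory row, and the witness mask `(witness_you, witness_host, witness_guest)` has the guest bit set only when a guest is in the chat. A rough sketch of those two rules is below. `present_owners` and `witness_mask` are hypothetical helper names used for illustration only; they are not taken from `record_turn_memory_for_present`, whose body is not part of this diff excerpt.

```python
from typing import Optional


def present_owners(host_bot_id: str, guest_bot_id: Optional[str]) -> list[str]:
    """Bots that receive their own memory row for the turn."""
    if guest_bot_id is None:
        return [host_bot_id]
    return [host_bot_id, guest_bot_id]


def witness_mask(guest_bot_id: Optional[str]) -> tuple[int, int, int]:
    """(witness_you, witness_host, witness_guest) for a turn seen by everyone present."""
    return (1, 1, 0 if guest_bot_id is None else 1)
```

With no guest this yields one owner and mask `(1, 1, 0)`, which is the Phase 1 shape; with a guest it yields two owners that share mask `(1, 1, 1)`.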
@@ -0,0 +1,147 @@
"""Multi-entity state-update coordinator (T40).

Wraps the single-pair :func:`compute_state_update` to run state updates
for ALL directed pairs of present entities. With 3 present entities
(you, host, guest) that's 6 directed pairs; with 2 (you, host) it's 2.

Calls run sequentially to respect Featherless's 2-connection cap.
"""

from __future__ import annotations

import json

import pytest

from chat.llm.mock import MockLLMClient
from chat.services.multi_state_update import compute_state_updates_for_present
from chat.services.state_update import StateUpdate


def _canned_update(affinity: int, trust: int, facts: list[str] | None = None) -> str:
    return json.dumps(
        {
            "affinity_delta": affinity,
            "trust_delta": trust,
            "knowledge_facts": facts or [],
        }
    )


@pytest.mark.asyncio
async def test_two_entities_returns_two_updates():
    """you + bot_a -> 2 directed pairs (you->bot_a, bot_a->you)."""
    canned = [
        _canned_update(2, 1, ["likes coffee"]),  # you -> bot_a
        _canned_update(1, 0, ["greets warmly"]),  # bot_a -> you
    ]
    mock = MockLLMClient(canned=canned)

    results = await compute_state_updates_for_present(
        mock,
        classifier_model="x",
        present_ids=["you", "bot_a"],
        present_names={"you": "Me", "bot_a": "BotA"},
        personas={"you": "", "bot_a": "thoughtful"},
        prior_edges={
            ("you", "bot_a"): {"affinity": 50, "trust": 50, "summary": ""},
            ("bot_a", "you"): {"affinity": 50, "trust": 50, "summary": ""},
        },
        recent_dialogue=[
            {"speaker": "you", "text": "hi"},
            {"speaker": "BotA", "text": "Hello!"},
        ],
    )

    assert len(results) == 2
    assert results[0][0] == "you"
    assert results[0][1] == "bot_a"
    assert isinstance(results[0][2], StateUpdate)
    assert results[0][2].affinity_delta == 2
    assert results[0][2].trust_delta == 1
    assert results[0][2].knowledge_facts == ["likes coffee"]

    assert results[1][0] == "bot_a"
    assert results[1][1] == "you"
    assert isinstance(results[1][2], StateUpdate)
    assert results[1][2].affinity_delta == 1
    assert results[1][2].trust_delta == 0
    assert results[1][2].knowledge_facts == ["greets warmly"]


@pytest.mark.asyncio
async def test_three_entities_returns_six_updates():
    """you + bot_a + bot_b -> 6 directed pairs (no self-pairs)."""
    canned = [_canned_update(i, 0) for i in range(6)]
    mock = MockLLMClient(canned=canned)

    results = await compute_state_updates_for_present(
        mock,
        classifier_model="x",
        present_ids=["you", "bot_a", "bot_b"],
        present_names={"you": "Me", "bot_a": "BotA", "bot_b": "BotB"},
        personas={"you": "", "bot_a": "thoughtful", "bot_b": "cheerful"},
        prior_edges={},  # all default to 50/50/""
        recent_dialogue=[{"speaker": "you", "text": "hello all"}],
    )

    assert len(results) == 6

    pairs = [(src, tgt) for src, tgt, _ in results]
    # No self-pairs.
    assert all(src != tgt for src, tgt in pairs)
    # All 6 directed combinations present.
    expected = {
        ("you", "bot_a"),
        ("you", "bot_b"),
        ("bot_a", "you"),
        ("bot_a", "bot_b"),
        ("bot_b", "you"),
        ("bot_b", "bot_a"),
    }
    assert set(pairs) == expected
    # Every entry is a StateUpdate.
    assert all(isinstance(u, StateUpdate) for _, _, u in results)


@pytest.mark.asyncio
async def test_failure_in_one_pair_does_not_kill_batch():
    """First pair fails all 3 classify retries -> default; second parses OK."""
    canned = [
        # Pair 1 (you -> bot_a): 3 malformed responses -> default StateUpdate.
        "bad",
        "still bad",
        "nope",
        # Pair 2 (bot_a -> you): valid JSON.
        _canned_update(3, 2, ["was warm"]),
    ]
    mock = MockLLMClient(canned=canned)

    results = await compute_state_updates_for_present(
        mock,
        classifier_model="x",
        present_ids=["you", "bot_a"],
        present_names={"you": "Me", "bot_a": "BotA"},
        personas={"you": "", "bot_a": "thoughtful"},
        prior_edges={
            ("you", "bot_a"): {"affinity": 60, "trust": 40, "summary": "some prior"},
            ("bot_a", "you"): {"affinity": 50, "trust": 50, "summary": ""},
        },
        recent_dialogue=[{"speaker": "you", "text": "hi"}],
    )

    assert len(results) == 2

    # First pair: default (zero-delta) StateUpdate.
    src1, tgt1, update1 = results[0]
    assert (src1, tgt1) == ("you", "bot_a")
    assert update1.affinity_delta == 0
    assert update1.trust_delta == 0
    assert update1.knowledge_facts == []

    # Second pair: parsed valid JSON.
    src2, tgt2, update2 = results[1]
    assert (src2, tgt2) == ("bot_a", "you")
    assert update2.affinity_delta == 3
    assert update2.trust_delta == 2
    assert update2.knowledge_facts == ["was warm"]
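
Reviewer note: the T40 tests expect one update per directed pair of present entities, in an order where `("you", "bot_a")` precedes `("bot_a", "you")`. One way to enumerate pairs that satisfies both properties is `itertools.permutations`. This is a sketch under that assumption; `directed_pairs` is a hypothetical name, and the actual `compute_state_updates_for_present` may build its pair list differently.

```python
from itertools import permutations


def directed_pairs(present_ids: list[str]) -> list[tuple[str, str]]:
    """All ordered (source, target) pairs, assuming present_ids are unique.

    permutations() works by position, so duplicate ids in the input would
    produce self-pairs; callers are assumed to pass each entity once.
    """
    return list(permutations(present_ids, 2))
```

For `n` present entities this yields `n * (n - 1)` pairs: 2 for `you + host` and 6 for `you + host + guest`, matching the counts asserted above.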
@@ -258,3 +258,425 @@ async def test_apply_scene_close_summary_updates_memories_and_edge(tmp_path):
        # Knowledge fact appended via edge_update.
        assert any("deadline" in fact for fact in edge["knowledge"])


# ---------------------------------------------------------------------------
# T45: per-POV summaries on close for each present witness.
# ---------------------------------------------------------------------------


def _bot_payload(bot_id: str, name: str, persona: str = "thoughtful") -> dict:
    return {
        "id": bot_id,
        "name": name,
        "persona": persona,
        "voice_samples": [],
        "traits": [],
        "backstory": "",
        "initial_relationship_to_you": "",
        "kickoff_prose": "",
    }


def _seed_single_bot_scene(conn) -> None:
    """Seed the canonical Phase 1 single-bot scene used by the regression test."""
    append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
    append_event(
        conn,
        kind="you_authored",
        payload={"name": "Me", "pronouns": "they/them", "persona": "engineer"},
    )
    append_event(
        conn,
        kind="chat_created",
        payload={
            "id": "chat_bot_a",
            "host_bot_id": "bot_a",
            "initial_time": "2026-04-26T20:00:00+00:00",
            "narrative_anchor": "Day 1",
            "weather": "",
        },
    )
    append_event(
        conn,
        kind="container_created",
        payload={
            "chat_id": "chat_bot_a",
            "name": "office",
            "type": "workplace",
            "properties": {},
        },
    )
    append_event(
        conn,
        kind="scene_opened",
        payload={
            "chat_id": "chat_bot_a",
            "container_id": 1,
            "started_at": "2026-04-26T20:00:00+00:00",
            "participants": ["you", "bot_a"],
        },
    )
    append_event(
        conn,
        kind="edge_update",
        payload={
            "source_id": "bot_a",
            "target_id": "you",
            "chat_id": "chat_bot_a",
        },
    )
    append_event(
        conn,
        kind="memory_written",
        payload={
            "owner_id": "bot_a",
            "chat_id": "chat_bot_a",
            "scene_id": 1,
            "pov_summary": "Original raw narrative (host)",
            "witness_you": 1,
            "witness_host": 1,
            "witness_guest": 0,
            "significance": 1,
        },
    )
    append_event(
        conn,
        kind="user_turn",
        payload={
            "chat_id": "chat_bot_a",
            "prose": "Quick chat about the deadline",
            "segments": [],
        },
    )
    append_event(
        conn,
        kind="assistant_turn",
        payload={
            "chat_id": "chat_bot_a",
            "speaker_id": "bot_a",
            "text": "It's going to be okay.",
            "truncated": False,
            "user_turn_id": 1,
        },
    )


def _seed_two_bot_scene(conn, *, with_group_node: bool = False) -> None:
    """Seed a host+guest scene with bot_a (host) and bot_b (guest), plus a
    memory row per bot owner so each per-POV update has something to rewrite,
    and seeded directed edges from each bot to ``you`` so each edge_summary
    update has a row to operate on. Optionally seeds the group_node row too.
    """
    append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
    append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
    append_event(
        conn,
        kind="you_authored",
        payload={"name": "Me", "pronouns": "they/them", "persona": "engineer"},
    )
    append_event(
        conn,
        kind="chat_created",
        payload={
            "id": "chat_bot_a",
            "host_bot_id": "bot_a",
            "guest_bot_id": "bot_b",
            "initial_time": "2026-04-26T20:00:00+00:00",
            "narrative_anchor": "Day 1",
            "weather": "",
        },
    )
    append_event(
        conn,
        kind="container_created",
        payload={
            "chat_id": "chat_bot_a",
            "name": "office",
            "type": "workplace",
            "properties": {},
        },
    )
    append_event(
        conn,
        kind="scene_opened",
        payload={
            "chat_id": "chat_bot_a",
            "container_id": 1,
            "started_at": "2026-04-26T20:00:00+00:00",
            "participants": ["you", "bot_a", "bot_b"],
        },
    )
    # Seed edges in both bot -> you directions so the edge_summary updates
    # have rows to target.
    append_event(
        conn,
        kind="edge_update",
        payload={
            "source_id": "bot_a",
            "target_id": "you",
            "chat_id": "chat_bot_a",
        },
    )
    append_event(
        conn,
        kind="edge_update",
        payload={
            "source_id": "bot_b",
            "target_id": "you",
            "chat_id": "chat_bot_a",
        },
    )
    # One memory per witness, scene 1.
    append_event(
        conn,
        kind="memory_written",
        payload={
            "owner_id": "bot_a",
            "chat_id": "chat_bot_a",
            "scene_id": 1,
            "pov_summary": "Original raw narrative (host)",
            "witness_you": 1,
            "witness_host": 1,
            "witness_guest": 1,
            "significance": 1,
        },
    )
    append_event(
        conn,
        kind="memory_written",
        payload={
            "owner_id": "bot_b",
            "chat_id": "chat_bot_a",
            "scene_id": 1,
            "pov_summary": "Original raw narrative (guest)",
            "witness_you": 1,
            "witness_host": 1,
            "witness_guest": 1,
            "significance": 1,
        },
    )
    append_event(
        conn,
        kind="user_turn",
        payload={
            "chat_id": "chat_bot_a",
            "prose": "Three of us in the office.",
            "segments": [],
        },
    )
    append_event(
        conn,
        kind="assistant_turn",
        payload={
            "chat_id": "chat_bot_a",
            "speaker_id": "bot_a",
            "text": "Glad you're both here.",
            "truncated": False,
            "user_turn_id": 1,
        },
    )
    if with_group_node:
        append_event(
            conn,
            kind="group_node_initialized",
            payload={
                "chat_id": "chat_bot_a",
                "members": ["you", "bot_a", "bot_b"],
                "summary": "",
                "dynamic": "",
                "threads": [],
            },
        )


@pytest.mark.asyncio
async def test_close_with_no_guest_matches_phase1(tmp_path):
    """Regression: when guest_bot_id is None, the close summary path runs
    summarize_scene exactly once and rewrites the host's memory + host->you
    edge in place — same as Phase 1 behavior."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    canned = json.dumps(
        {
            "summary": "BotA helped you talk through the deadline anxiety.",
            "knowledge_facts": ["Deadline next Friday."],
            "relationship_summary": "BotA leaned in supportively.",
        }
    )
    with open_db(db) as conn:
        _seed_single_bot_scene(conn)
        project(conn)

        # canned has 2 entries to detect any over-call; the assertion below
        # confirms only one was consumed.
        client = MockLLMClient(canned=[canned, canned])
        await apply_scene_close_summary(
            conn,
            client,
            classifier_model="x",
            chat_id="chat_bot_a",
            scene_id=1,
            host_bot_id="bot_a",
        )

        # Exactly one classifier call -> exactly one canned entry consumed,
        # leaving the second untouched.
        assert len(client._canned) == 1

        # Host memory rewritten with the per-POV summary content.
        new_pov = conn.execute(
            "SELECT pov_summary FROM memories "
            "WHERE owner_id = 'bot_a' AND scene_id = 1"
        ).fetchone()[0]
        assert "BotA helped" in new_pov

        # host->you edge summary rewritten with the relationship_summary.
        from chat.state.edges import get_edge

        edge = get_edge(conn, "bot_a", "you")
        assert "supportively" in edge["summary"]


@pytest.mark.asyncio
async def test_close_with_guest_calls_summarize_twice(tmp_path):
    """When a guest is present, summarize_scene runs once per witness
    (host + guest) and each bot's memory rewrite uses its own POV summary."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    host_canned = json.dumps(
        {
            "summary": "BotA noticed BotB warming up to you.",
            "knowledge_facts": ["You sketched on the whiteboard."],
            "relationship_summary": "BotA felt steady around you.",
        }
    )
    guest_canned = json.dumps(
        {
            "summary": "BotB found the office quieter than expected.",
            "knowledge_facts": ["You prefer black coffee."],
            "relationship_summary": "BotB warmed up to you a little.",
        }
    )
    with open_db(db) as conn:
        _seed_two_bot_scene(conn)
        project(conn)

        client = MockLLMClient(canned=[host_canned, guest_canned])
        await apply_scene_close_summary(
            conn,
            client,
            classifier_model="x",
            chat_id="chat_bot_a",
            scene_id=1,
            host_bot_id="bot_a",
        )

        # Both canned entries consumed -> classifier ran twice.
        assert client._canned == []

        # Host memory carries the host's per-POV summary; guest memory
        # carries the guest's.
        host_pov = conn.execute(
            "SELECT pov_summary FROM memories "
            "WHERE owner_id = 'bot_a' AND scene_id = 1"
        ).fetchone()[0]
        guest_pov = conn.execute(
            "SELECT pov_summary FROM memories "
            "WHERE owner_id = 'bot_b' AND scene_id = 1"
        ).fetchone()[0]
        assert "BotA noticed" in host_pov
        assert "BotB found" in guest_pov
        assert host_pov != guest_pov


@pytest.mark.asyncio
async def test_close_with_guest_updates_both_edges(tmp_path):
    """Both bot->you edges receive their own relationship_summary on close."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    host_canned = json.dumps(
        {
            "summary": "BotA noticed BotB warming up.",
            "knowledge_facts": [],
            "relationship_summary": "BotA felt steady around you.",
        }
    )
    guest_canned = json.dumps(
        {
            "summary": "BotB warmed to the office.",
            "knowledge_facts": [],
            "relationship_summary": "BotB warmed up to you a little.",
        }
    )
    with open_db(db) as conn:
        _seed_two_bot_scene(conn)
        project(conn)

        client = MockLLMClient(canned=[host_canned, guest_canned])
        await apply_scene_close_summary(
            conn,
            client,
            classifier_model="x",
            chat_id="chat_bot_a",
            scene_id=1,
            host_bot_id="bot_a",
        )

        from chat.state.edges import get_edge

        edge_h2y = get_edge(conn, "bot_a", "you")
        edge_g2y = get_edge(conn, "bot_b", "you")
        assert "steady" in edge_h2y["summary"]
        assert "warmed up" in edge_g2y["summary"]
        # Per-POV; the two edges did not collapse onto the same text.
        assert edge_h2y["summary"] != edge_g2y["summary"]


@pytest.mark.asyncio
async def test_close_with_group_node_updates_group_summary(tmp_path):
    """When a group_node row exists, scene close emits group_node_updated
    with a non-empty summary that mentions both bots' names (v2 naive
    concat of per-POV summaries)."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    import chat.state.group_node  # noqa: F401 -- register handlers

    host_canned = json.dumps(
        {
            "summary": "BotA appreciated the calm.",
            "knowledge_facts": [],
            "relationship_summary": "BotA felt steady.",
        }
    )
    guest_canned = json.dumps(
        {
            "summary": "BotB found the room friendly.",
            "knowledge_facts": [],
            "relationship_summary": "BotB warmed up.",
        }
    )
    with open_db(db) as conn:
        _seed_two_bot_scene(conn, with_group_node=True)
        project(conn)

        client = MockLLMClient(canned=[host_canned, guest_canned])
        await apply_scene_close_summary(
            conn,
            client,
            classifier_model="x",
            chat_id="chat_bot_a",
            scene_id=1,
            host_bot_id="bot_a",
        )

        from chat.state.group_node import get_group_node

        gn = get_group_node(conn, "chat_bot_a")
        assert gn is not None
        assert gn["summary"]  # non-empty
        # Naive concat surfaces both bot names in the group summary.
        assert "BotA" in gn["summary"]
        assert "BotB" in gn["summary"]
        # Phase 2 v2 keeps dynamic empty (Phase 3 polishes).
        assert gn["dynamic"] == ""
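
Reviewer note: the last test describes the group summary as a naive concatenation of the per-POV summaries. One way that could look is sketched below. `concat_group_summary` is a hypothetical helper, and the name-prefixed, space-joined format is an assumption; the separator and format actually used on scene close are not visible in this diff.

```python
def concat_group_summary(per_pov: dict[str, str]) -> str:
    """Join per-POV summaries in insertion order, skipping empty ones."""
    parts = [
        f"{name}: {text.strip()}"
        for name, text in per_pov.items()
        if text.strip()
    ]
    return " ".join(parts)
```

Prefixing each fragment with the bot's name keeps both names present in the result even when a summary happens not to mention its own author.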
@@ -253,3 +253,244 @@ def test_must_exceeds_budget_hard_raises_value_error(tmp_path):
                budget_soft=5,
                budget_hard=10,
            )


# ---------------------------------------------------------------------------
# Task 43: multi-entity prompt assembly (guest_id support)
# ---------------------------------------------------------------------------


def _seed_with_guest(conn) -> None:
    """Seed a 3-entity scene: you (Sam) + host (Aria, bot_a) + guest (Iris, bot_b).

    Group node row is initialized with summary + dynamic, edges in all
    relevant directions are seeded, and activities are recorded for all
    three entities.
    """
    append_event(conn, kind="bot_authored", payload={
        "id": "bot_a",
        "name": "Aria",
        "persona": "reserved coworker who notices things",
        "voice_samples": ["I — sorry, I didn't mean to.", "Right. Of course."],
        "traits": ["introverted", "observant"],
        "backstory": "An archivist who joined the firm last spring.",
        "initial_relationship_to_you": "coworker; mild crush; never voiced",
        "kickoff_prose": "you stay late at the office",
    })
    append_event(conn, kind="bot_authored", payload={
        "id": "bot_b",
        "name": "Iris",
        "persona": "wry transplant from the Boston office",
        "voice_samples": ["Oh, please.", "Don't make me say it twice."],
        "traits": ["sardonic", "loyal"],
        "backstory": "Met Aria at a conference two years back.",
        "initial_relationship_to_you": "stranger; curious",
        "kickoff_prose": "",
    })
    append_event(conn, kind="you_authored", payload={
        "name": "Sam",
        "pronouns": "they/them",
        "persona": "tired analyst",
    })
    append_event(conn, kind="chat_created", payload={
        "id": "chat_bot_a",
        "host_bot_id": "bot_a",
        "guest_bot_id": "bot_b",
        "initial_time": "2026-04-26T20:00:00+00:00",
        "narrative_anchor": "Day 1 evening",
        "weather": "clear",
    })
    append_event(conn, kind="container_created", payload={
        "chat_id": "chat_bot_a",
        "name": "office bullpen",
        "type": "workplace",
        "properties": {"public": False, "moving": False, "audible_range": "room"},
    })
    # Edges: host -> you, guest -> you, host -> guest, guest -> host.
    append_event(conn, kind="edge_update", payload={
        "source_id": "bot_a",
        "target_id": "you",
        "affinity_delta": 12,
        "trust_delta": 5,
        "knowledge_facts": ["they work on the same floor"],
    })
    append_event(conn, kind="edge_update", payload={
        "source_id": "bot_a",
        "target_id": "bot_b",
        "affinity_delta": 20,
        "trust_delta": 15,
        "knowledge_facts": ["studied physics together"],
    })
    append_event(conn, kind="edge_update", payload={
        "source_id": "bot_b",
        "target_id": "you",
        "affinity_delta": 4,
        "trust_delta": 0,
        "knowledge_facts": ["Aria's coworker"],
    })
    append_event(conn, kind="edge_update", payload={
        "source_id": "bot_b",
        "target_id": "bot_a",
        "affinity_delta": 18,
        "trust_delta": 12,
        "knowledge_facts": ["former roommate"],
    })
    # Activity for all three entities — note distinct verbs so we can
    # check whose activity got dropped under tight budget.
    append_event(conn, kind="activity_change", payload={
        "entity_id": "you",
        "container_id": 1,
        "posture": "sitting at your desk",
        "action": {"verb": "finishing emails"},
        "attention": "the screen",
        "holding": ["coffee mug"],
    })
    append_event(conn, kind="activity_change", payload={
        "entity_id": "bot_a",
        "container_id": 1,
        "posture": "sitting at her desk",
        "action": {"verb": "pretending to work"},
        "attention": "you, in glances",
    })
    append_event(conn, kind="activity_change", payload={
        "entity_id": "bot_b",
        "container_id": 1,
        "posture": "leaning against the doorframe",
        "action": {"verb": "smirking-distinctively"},
        "attention": "Aria",
    })
    append_event(conn, kind="scene_opened", payload={
        "chat_id": "chat_bot_a",
        "container_id": 1,
        "started_at": "2026-04-26T20:00:00+00:00",
        "participants": ["you", "bot_a", "bot_b"],
    })
    append_event(conn, kind="group_node_initialized", payload={
        "chat_id": "chat_bot_a",
        "members": ["you", "bot_a", "bot_b"],
        "summary": "Three coworkers catching up after hours UNIQUE-GROUP-SUMMARY.",
        "dynamic": "warm-but-prickly UNIQUE-GROUP-DYNAMIC",
    })
    project(conn)


def test_assemble_with_no_guest_matches_phase1(tmp_path):
    """Regression: 2-entity scenario without guest_id behaves exactly as Phase 1."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        _seed_basic(conn)
        msgs = assemble_narrative_prompt(
            conn,
            chat_id="chat_bot_a",
            speaker_bot_id="bot_a",
            recent_dialogue=[],
            retrieved_memory_summaries=[],
        )
    body = msgs[0].content
    # Phase 1 must blocks present.
    assert "Aria" in body
    assert "PERSONA" in body
    assert "Sam" in body
    assert "ACTIVITIES" in body
    assert "62/100" in body  # speaker → addressee edge intact
    # No guest content leaks in.
    assert "Group dynamic" not in body
    assert "Iris" not in body


def test_assemble_with_guest_includes_group_node_summary(tmp_path):
    """When guest is present (auto-detected via chat.guest_bot_id) and a
    group_node row exists, its summary + dynamic are rendered."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        _seed_with_guest(conn)
        msgs = assemble_narrative_prompt(
            conn,
            chat_id="chat_bot_a",
            speaker_bot_id="bot_a",
            recent_dialogue=[],
            retrieved_memory_summaries=[],
        )
    body = msgs[0].content
    assert "Group dynamic" in body
    assert "UNIQUE-GROUP-SUMMARY" in body
    assert "UNIQUE-GROUP-DYNAMIC" in body
    # Guest activity also present (SHOULD-tier, fits at default budget).
    assert "smirking-distinctively" in body
    # Speaker's other edges include the host -> guest direction.
    assert "Iris" in body


def test_assemble_when_speaker_is_guest_orients_edges_correctly(tmp_path):
    """When the guest is the speaker, identity is the guest, the
    addressee edge is guest → you, and other edges include guest → host."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        _seed_with_guest(conn)
        msgs = assemble_narrative_prompt(
            conn,
            chat_id="chat_bot_a",
            speaker_bot_id="bot_b",  # guest as speaker
            recent_dialogue=[],
            retrieved_memory_summaries=[],
        )
    body = msgs[0].content
    # Speaker identity is the guest's persona.
    assert "You are Iris." in body
    assert "wry transplant from the Boston office" in body
    # Edge to addressee is guest → you (Sam) with the seeded values
    # (default 50 + 4 affinity = 54).
    assert "YOUR EDGE TO Sam" in body
    assert "54/100" in body
    # Other edges include guest → host (Aria) with seeded value
    # (default 50 + 18 = 68).
    assert "OTHER EDGES" in body
    assert "Aria" in body
    assert "68/100" in body


def test_assemble_with_tight_budget_drops_guest_activity_first(tmp_path):
    """Under tight budget MUST blocks survive but SHOULD-tier guest
    activity is dropped first."""
    db = tmp_path / "t.db"
    apply_migrations(db)
    with open_db(db) as conn:
        _seed_with_guest(conn)
        # Short dialogue so MUST core (speaker identity + edge + last 4
        # turns + closing) sits comfortably under the hard budget while
        # SHOULD-tier additions (guest activity, group node, other edges)
        # would push over.
        dialogue = [
            {"speaker": "you", "text": "line-16 hi there"},
            {"speaker": "bot_a", "text": "line-17 hey"},
            {"speaker": "you", "text": "line-18 quiet night"},
            {"speaker": "bot_a", "text": "line-19 indeed"},
        ]
        msgs = assemble_narrative_prompt(
            conn,
            chat_id="chat_bot_a",
            speaker_bot_id="bot_a",
            recent_dialogue=dialogue,
            retrieved_memory_summaries=[],
            # MUST core ~310 tokens; SHOULD additions (guest activity +
            # group node + other edges) push it well over 380. budget_hard
            # is set just above MUST core so SHOULD-tier blocks must be
            # trimmed away.
            budget_soft=250,
            budget_hard=340,
        )
        body = msgs[0].content
        # MUST: speaker identity, edge to addressee, last 4 dialogue turns.
        assert "Aria" in body
        assert "YOUR EDGE TO Sam" in body
        for i in range(16, 20):
            assert f"line-{i:02d}" in body
        # Guest activity (SHOULD-tier) must be dropped under tight budget.
        assert "smirking-distinctively" not in body
        # Token budget honoured.
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        assert len(enc.encode(body)) <= 340
|
||||
|
||||
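The tight-budget test above pins one behaviour: MUST blocks survive and SHOULD-tier guest activity is the first thing to go. A minimal sketch of that trimming rule follows. It is not the real `assemble_narrative_prompt`: the `Block` type, its `priority` field, `trim_to_budget`, and the whitespace token counter (standing in for tiktoken) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    text: str
    must: bool          # MUST blocks are never dropped
    priority: int = 0   # among SHOULD blocks, lowest priority is dropped first


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (whitespace split); the tests use tiktoken instead.
    return len(text.split())


def trim_to_budget(blocks, budget_hard, counter=count_tokens):
    """Drop SHOULD blocks, lowest priority first, until the total fits."""
    kept = list(blocks)
    total = sum(counter(b.text) for b in kept)
    droppable = sorted((b for b in blocks if not b.must), key=lambda b: b.priority)
    for victim in droppable:
        if total <= budget_hard:
            break
        kept.remove(victim)
        total -= counter(victim.text)
    return kept


# Hypothetical blocks loosely modelled on the strings the tests assert on.
blocks = [
    Block("identity", "You are Aria.", must=True),
    Block("edge", "YOUR EDGE TO Sam", must=True),
    Block("guest_activity", "Iris is smirking-distinctively at the window",
          must=False, priority=0),
    Block("other_edges", "OTHER EDGES Iris 68/100", must=False, priority=1),
]
kept = trim_to_budget(blocks, budget_hard=12)
print([b.name for b in kept])
```

In this sketch, if the MUST blocks alone exceed `budget_hard` the result stays over budget; how the real assembler handles that case is not shown in this diff.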
@@ -0,0 +1,109 @@
|
||||
"""Tests for the relationship-seed service (T38).
|
||||
|
||||
Per Requirements §5.2, when two bots first co-appear in a chat, the user
|
||||
is prompted with "Have they met before? If yes, write a short prose
|
||||
seed." The prose is parsed via classifier into structured directed-edge
|
||||
content for the ``botA -> botB`` and ``botB -> botA`` edges.
|
||||
|
||||
These tests cover:
|
||||
|
||||
* The happy path: a canned classifier response parses cleanly into a
|
||||
populated :class:`RelationshipSeed` with both directions filled.
|
||||
* Empty prose short-circuits before any classifier call (mock has no
|
||||
canned responses; an accidental call would raise ``IndexError``).
|
||||
* Whitespace-only prose has the same short-circuit behavior.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
|
||||
import pytest
|
||||
|
||||
from chat.llm.mock import MockLLMClient
|
||||
from chat.services.relationship_seed import (
|
||||
RelationshipSeed,
|
||||
seed_inter_bot_edges,
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_seed_parses_canned_prose():
|
||||
canned = json.dumps(
|
||||
{
|
||||
"a_to_b_summary": "old college friend who now distrusts him slightly",
|
||||
"a_to_b_knowledge_facts": [
|
||||
"studied physics together",
|
||||
"lost touch after a falling out",
|
||||
],
|
||||
"a_to_b_affinity_delta": 2,
|
||||
"a_to_b_trust_delta": -1,
|
||||
"b_to_a_summary": "former roommate; warm memories, mild resentment",
|
||||
"b_to_a_knowledge_facts": ["lived together junior year"],
|
||||
"b_to_a_affinity_delta": 3,
|
||||
"b_to_a_trust_delta": 0,
|
||||
}
|
||||
)
|
||||
mock = MockLLMClient(canned=[canned])
|
||||
result = await seed_inter_bot_edges(
|
||||
mock,
|
||||
classifier_model="x",
|
||||
bot_a_id="bot_a",
|
||||
bot_a_name="Alice",
|
||||
bot_b_id="bot_b",
|
||||
bot_b_name="Bob",
|
||||
relationship_prose=(
|
||||
"Alice and Bob met in college. They studied physics together and "
|
||||
"lived as roommates junior year, but drifted apart after a fight."
|
||||
),
|
||||
)
|
||||
assert isinstance(result, RelationshipSeed)
|
||||
assert (
|
||||
result.a_to_b_summary
|
||||
== "old college friend who now distrusts him slightly"
|
||||
)
|
||||
assert result.a_to_b_knowledge_facts == [
|
||||
"studied physics together",
|
||||
"lost touch after a falling out",
|
||||
]
|
||||
assert result.a_to_b_affinity_delta == 2
|
||||
assert result.a_to_b_trust_delta == -1
|
||||
assert (
|
||||
result.b_to_a_summary
|
||||
== "former roommate; warm memories, mild resentment"
|
||||
)
|
||||
assert result.b_to_a_knowledge_facts == ["lived together junior year"]
|
||||
assert result.b_to_a_affinity_delta == 3
|
||||
assert result.b_to_a_trust_delta == 0
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_seed_empty_prose_returns_empty():
|
||||
"""Empty prose short-circuits — classifier must not be called."""
|
||||
mock = MockLLMClient(canned=[])
|
||||
result = await seed_inter_bot_edges(
|
||||
mock,
|
||||
classifier_model="x",
|
||||
bot_a_id="bot_a",
|
||||
bot_a_name="Alice",
|
||||
bot_b_id="bot_b",
|
||||
bot_b_name="Bob",
|
||||
relationship_prose="",
|
||||
)
|
||||
assert result == RelationshipSeed()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_seed_whitespace_only_prose_returns_empty():
|
||||
"""Whitespace-only prose is treated the same as empty."""
|
||||
mock = MockLLMClient(canned=[])
|
||||
result = await seed_inter_bot_edges(
|
||||
mock,
|
||||
classifier_model="x",
|
||||
bot_a_id="bot_a",
|
||||
bot_a_name="Alice",
|
||||
bot_b_id="bot_b",
|
||||
bot_b_name="Bob",
|
||||
relationship_prose=" \n ",
|
||||
)
|
||||
assert result == RelationshipSeed()
|
||||
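The three seed tests above describe a contract: blank or whitespace-only prose returns an empty `RelationshipSeed` without touching the classifier, and a well-formed JSON reply fills both directions. The sketch below illustrates that contract only. `seed_edges_sketch`, `FakeClassifier`, and its `complete` method are hypothetical names, the dataclass defaults are assumed, and falling back to an empty seed on a JSON parse error is an assumption rather than confirmed behaviour of `seed_inter_bot_edges`.

```python
import asyncio
import json
from dataclasses import dataclass, field, fields


@dataclass
class RelationshipSeed:
    # Field names mirror the keys asserted on in the tests; defaults are assumed.
    a_to_b_summary: str = ""
    a_to_b_knowledge_facts: list = field(default_factory=list)
    a_to_b_affinity_delta: int = 0
    a_to_b_trust_delta: int = 0
    b_to_a_summary: str = ""
    b_to_a_knowledge_facts: list = field(default_factory=list)
    b_to_a_affinity_delta: int = 0
    b_to_a_trust_delta: int = 0


class FakeClassifier:
    """Hypothetical stand-in for MockLLMClient: pops canned replies in order."""

    def __init__(self, canned):
        self._canned = list(canned)

    async def complete(self, prompt: str) -> str:
        # Raises IndexError when the queue is empty, so an accidental call is loud.
        return self._canned.pop(0)


async def seed_edges_sketch(client, relationship_prose: str) -> RelationshipSeed:
    # Short-circuit: empty or whitespace-only prose never reaches the classifier.
    if not relationship_prose.strip():
        return RelationshipSeed()
    raw = await client.complete(relationship_prose)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return RelationshipSeed()  # assumed conservative fallback
    known = {f.name for f in fields(RelationshipSeed)}
    return RelationshipSeed(**{k: v for k, v in data.items() if k in known})


empty = asyncio.run(seed_edges_sketch(FakeClassifier([]), "  \n  "))
print(empty == RelationshipSeed())
```

Giving the fake client an empty queue is what makes the short-circuit checkable: any classifier call would raise instead of silently returning a default.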
@@ -183,3 +183,167 @@ def test_bot_list_renders_reset_form(client, tmp_path):
|
||||
assert response.status_code == 200
|
||||
assert "Reset" in response.text
|
||||
assert "confirm_name" in response.text
|
||||
|
||||
|
||||
def _seed_two_bots_with_guest_link(
|
||||
db: Path, *, extra_events: list[dict] | None = None
|
||||
) -> None:
|
||||
"""Seed bot_a + bot_b, each hosting their own chat, with bot_b a guest in chat_bot_a.
|
||||
|
||||
``extra_events`` is appended after the guest_added event and projected
|
||||
together with the rest of the seed (so handlers run only once per event).
|
||||
"""
|
||||
with open_db(db) as conn:
|
||||
# bot_a + its chat
|
||||
append_event(
|
||||
conn,
|
||||
kind="bot_authored",
|
||||
payload={
|
||||
"id": "bot_a",
|
||||
"name": "BotA",
|
||||
"persona": "thoughtful",
|
||||
"voice_samples": [],
|
||||
"traits": [],
|
||||
"backstory": "",
|
||||
"initial_relationship_to_you": "coworker",
|
||||
"kickoff_prose": "",
|
||||
},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="chat_created",
|
||||
payload={
|
||||
"id": "chat_bot_a",
|
||||
"host_bot_id": "bot_a",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1",
|
||||
"weather": "",
|
||||
},
|
||||
)
|
||||
# bot_b + its own chat
|
||||
append_event(
|
||||
conn,
|
||||
kind="bot_authored",
|
||||
payload={
|
||||
"id": "bot_b",
|
||||
"name": "BotB",
|
||||
"persona": "curious",
|
||||
"voice_samples": [],
|
||||
"traits": [],
|
||||
"backstory": "",
|
||||
"initial_relationship_to_you": "friend",
|
||||
"kickoff_prose": "",
|
||||
},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="chat_created",
|
||||
payload={
|
||||
"id": "chat_bot_b",
|
||||
"host_bot_id": "bot_b",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1",
|
||||
"weather": "",
|
||||
},
|
||||
)
|
||||
# bot_b joins chat_bot_a as a guest.
|
||||
append_event(
|
||||
conn,
|
||||
kind="guest_added",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
},
|
||||
)
|
||||
for ev in extra_events or []:
|
||||
append_event(conn, kind=ev["kind"], payload=ev["payload"])
|
||||
project(conn)
|
||||
|
||||
|
||||
def test_reset_clears_guest_reference_in_other_chats(client, tmp_path):
|
||||
db = tmp_path / "test.db"
|
||||
_seed_two_bots_with_guest_link(db)
|
||||
|
||||
# Sanity-check the seed: bot_b is the guest in bot_a's chat.
|
||||
from chat.state.world import get_chat
|
||||
with open_db(db) as conn:
|
||||
assert get_chat(conn, "chat_bot_a")["guest_bot_id"] == "bot_b"
|
||||
assert get_chat(conn, "chat_bot_b") is not None
|
||||
|
||||
response = client.post(
|
||||
"/bots/bot_b/reset",
|
||||
data={"confirm_name": "BotB"},
|
||||
follow_redirects=False,
|
||||
)
|
||||
assert response.status_code == 303
|
||||
|
||||
with open_db(db) as conn:
|
||||
# The guest reference in bot_a's chat is cleared.
|
||||
chat_a = get_chat(conn, "chat_bot_a")
|
||||
assert chat_a is not None
|
||||
assert chat_a["guest_bot_id"] is None
|
||||
|
||||
# bot_b's own chat is gone (Phase 1 host purge behavior).
|
||||
assert get_chat(conn, "chat_bot_b") is None
|
||||
|
||||
# bot_a is untouched.
|
||||
assert conn.execute(
|
||||
"SELECT COUNT(*) FROM bots WHERE id = 'bot_a'"
|
||||
).fetchone()[0] == 1
|
||||
|
||||
|
||||
def test_reset_purges_guest_memories_from_other_chats(client, tmp_path):
|
||||
db = tmp_path / "test.db"
|
||||
_seed_two_bots_with_guest_link(
|
||||
db,
|
||||
extra_events=[
|
||||
# bot_b is a guest in chat_bot_a and remembers things from there.
|
||||
{
|
||||
"kind": "memory_written",
|
||||
"payload": {
|
||||
"owner_id": "bot_b",
|
||||
"chat_id": "chat_bot_a",
|
||||
"pov_summary": "Met BotA; she was tense.",
|
||||
"witness_you": 1,
|
||||
"witness_host": 1,
|
||||
"witness_guest": 1,
|
||||
"significance": 3,
|
||||
},
|
||||
},
|
||||
# And a memory from bot_b's own chat for good measure.
|
||||
{
|
||||
"kind": "memory_written",
|
||||
"payload": {
|
||||
"owner_id": "bot_b",
|
||||
"chat_id": "chat_bot_b",
|
||||
"pov_summary": "A quiet evening at home.",
|
||||
"witness_you": 1,
|
||||
"witness_host": 1,
|
||||
"witness_guest": 0,
|
||||
"significance": 1,
|
||||
},
|
||||
},
|
||||
],
|
||||
)
|
||||
|
||||
with open_db(db) as conn:
|
||||
# Sanity: bot_b owns 2 memories pre-reset, one in each chat.
|
||||
assert conn.execute(
|
||||
"SELECT COUNT(*) FROM memories WHERE owner_id = 'bot_b'"
|
||||
).fetchone()[0] == 2
|
||||
|
||||
response = client.post(
|
||||
"/bots/bot_b/reset",
|
||||
data={"confirm_name": "BotB"},
|
||||
follow_redirects=False,
|
||||
)
|
||||
assert response.status_code == 303
|
||||
|
||||
with open_db(db) as conn:
|
||||
# ALL of bot_b's memories are gone, including the cross-chat one in chat_bot_a.
|
||||
assert conn.execute(
|
||||
"SELECT COUNT(*) FROM memories WHERE owner_id = 'bot_b'"
|
||||
).fetchone()[0] == 0
|
||||
assert conn.execute(
|
||||
"SELECT COUNT(*) FROM memories WHERE owner_id = 'bot_b' AND chat_id = 'chat_bot_a'"
|
||||
).fetchone()[0] == 0
|
||||
|
||||
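The two reset tests above expect three effects when `bot_b` is reset: its guest reference in other chats is cleared, its own hosted chat is removed, and every memory it owns is purged regardless of chat. A rough sketch of those effects against an in-memory SQLite database is below. The three-table schema and `reset_bot_sketch` are assumptions for illustration; the project's real reset is event-sourced and is not shown in this hunk.

```python
import sqlite3

# Hypothetical, simplified schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE bots (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE chats (id TEXT PRIMARY KEY, host_bot_id TEXT, guest_bot_id TEXT);
CREATE TABLE memories (owner_id TEXT, chat_id TEXT, pov_summary TEXT);
"""


def reset_bot_sketch(conn: sqlite3.Connection, bot_id: str) -> None:
    with conn:  # one transaction: commit on success, roll back on error
        # Clear the guest reference wherever this bot was a guest.
        conn.execute(
            "UPDATE chats SET guest_bot_id = NULL WHERE guest_bot_id = ?", (bot_id,)
        )
        # Purge all memories the bot owns, including cross-chat ones.
        conn.execute("DELETE FROM memories WHERE owner_id = ?", (bot_id,))
        # Remove the chat the bot hosts.
        conn.execute("DELETE FROM chats WHERE host_bot_id = ?", (bot_id,))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO bots VALUES (?, ?)",
                 [("bot_a", "BotA"), ("bot_b", "BotB")])
conn.executemany("INSERT INTO chats VALUES (?, ?, ?)",
                 [("chat_bot_a", "bot_a", "bot_b"), ("chat_bot_b", "bot_b", None)])
conn.executemany("INSERT INTO memories VALUES (?, ?, ?)",
                 [("bot_b", "chat_bot_a", "Met BotA; she was tense."),
                  ("bot_b", "chat_bot_b", "A quiet evening at home.")])
reset_bot_sketch(conn, "bot_b")
```

The sketch leaves the `bots` table alone because the tests only assert that `bot_a` is untouched; what happens to the reset bot's own row is not visible here.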
@@ -202,3 +202,487 @@ def test_get_chat_renders_existing_turns(client, tmp_path):
|
||||
body = response.text
|
||||
assert "hello" in body
|
||||
assert "Hi there." in body
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 2 (T44) — multi-entity turn flow.
|
||||
#
|
||||
# These tests cover the post_turn flow when a guest is present: addressee
|
||||
# detection, multi-pair state-update + multi-witness memory writes, and
|
||||
# the optional interjection follow-on. Each test installs its own
|
||||
# MockLLMClient with a canned-response queue tailored to the call shape
|
||||
# of that scenario; the queue is documented at the top of each test so
|
||||
# the orchestration is auditable.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _bot_payload(bot_id: str, name: str, persona: str = "") -> dict:
|
||||
return {
|
||||
"id": bot_id,
|
||||
"name": name,
|
||||
"persona": persona or f"persona for {name}",
|
||||
"voice_samples": [],
|
||||
"traits": [],
|
||||
"backstory": "",
|
||||
"initial_relationship_to_you": "",
|
||||
"kickoff_prose": "...",
|
||||
}
|
||||
|
||||
|
||||
def _seed_chat_with_guest(db_path: Path) -> None:
|
||||
"""Author host BotA + guest BotB, create a chat with both wired in,
|
||||
and seed an open scene plus minimal activity rows so the prompt
|
||||
assembler sees a third party. Edges are seeded for all six directed
|
||||
pairs at the schema-default 50/50 baseline so multi-pair state
|
||||
updates land cleanly."""
|
||||
with open_db(db_path) as conn:
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_a", "BotA"))
|
||||
append_event(conn, kind="bot_authored", payload=_bot_payload("bot_b", "BotB"))
|
||||
append_event(
|
||||
conn,
|
||||
kind="you_authored",
|
||||
payload={"name": "Me", "pronouns": "they/them", "persona": ""},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="chat_created",
|
||||
payload={
|
||||
"id": "chat_bot_a",
|
||||
"host_bot_id": "bot_a",
|
||||
"guest_bot_id": "bot_b",
|
||||
"initial_time": "2026-04-26T20:00:00+00:00",
|
||||
"narrative_anchor": "Day 1",
|
||||
"weather": "",
|
||||
},
|
||||
)
|
||||
# Container + open scene so scene_close detection has something
|
||||
# to act on in the per-POV summary test.
|
||||
append_event(
|
||||
conn,
|
||||
kind="container_created",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"name": "office",
|
||||
"type": "workplace",
|
||||
"properties": {},
|
||||
},
|
||||
)
|
||||
append_event(
|
||||
conn,
|
||||
kind="scene_opened",
|
||||
payload={
|
||||
"chat_id": "chat_bot_a",
|
||||
"container_id": 1,
|
||||
"started_at": "2026-04-26T20:00:00+00:00",
|
||||
"participants": ["you", "bot_a", "bot_b"],
|
||||
},
|
||||
)
|
||||
# Seed all six directed edges so state-update writes land on
|
||||
# initialized rows. Knowledge fact on bot_a -> you exercises
|
||||
# the existing-fact preservation path.
|
||||
for src, tgt, facts in [
|
||||
("bot_a", "you", ["coworker"]),
|
||||
("you", "bot_a", []),
|
||||
("bot_b", "you", []),
|
||||
("you", "bot_b", []),
|
||||
("bot_a", "bot_b", []),
|
||||
("bot_b", "bot_a", []),
|
||||
]:
|
||||
append_event(
|
||||
conn,
|
||||
kind="edge_update",
|
||||
payload={
|
||||
"source_id": src,
|
||||
"target_id": tgt,
|
||||
"chat_id": "chat_bot_a",
|
||||
"knowledge_facts": facts,
|
||||
},
|
||||
)
|
||||
for entity_id, verb in [
|
||||
("you", "talking"),
|
||||
("bot_a", "listening"),
|
||||
("bot_b", "listening"),
|
||||
]:
|
||||
append_event(
|
||||
conn,
|
||||
kind="activity_change",
|
||||
payload={
|
||||
"entity_id": entity_id,
|
||||
"posture": "sitting",
|
||||
"action": {
|
||||
"verb": verb,
|
||||
"interruptible": True,
|
||||
"required_attention": "low",
|
||||
"expected_duration": "ongoing",
|
||||
},
|
||||
"attention": "",
|
||||
"holding": [],
|
||||
"status": {},
|
||||
},
|
||||
)
|
||||
project(conn)
|
||||
|
||||
|
||||
def _override_llm(canned: list[str]) -> MockLLMClient:
|
||||
"""Wire a fresh ``MockLLMClient`` and return it so tests can introspect
|
||||
the residual canned queue after the request."""
|
||||
from chat.web.kickoff import get_llm_client
|
||||
|
||||
mock = MockLLMClient(canned=list(canned))
|
||||
app.dependency_overrides[get_llm_client] = lambda: mock
|
||||
return mock
|
||||
|
||||
|
||||
def _zero_state() -> str:
|
||||
return json.dumps(
|
||||
{"affinity_delta": 0, "trust_delta": 0, "knowledge_facts": []}
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def app_state_setup(tmp_path, monkeypatch):
|
||||
"""Same env wiring as the existing ``client`` fixture but without a
|
||||
pre-installed MockLLMClient — the multi-entity tests pin their own
|
||||
canned queues per scenario.
|
||||
"""
|
||||
cfg = tmp_path / "config.toml"
|
||||
cfg.write_text('featherless_api_key = "test"\n')
|
||||
monkeypatch.setenv("CHAT_CONFIG_PATH", str(cfg))
|
||||
db = tmp_path / "test.db"
|
||||
monkeypatch.setenv("CHAT_DB_PATH", str(db))
|
||||
with TestClient(app) as c:
|
||||
app.state.background_worker.enabled = False
|
||||
yield c
|
||||
app.dependency_overrides.clear()
|
||||
|
||||
|
||||
def test_single_bot_turn_no_guest_regression(app_state_setup, tmp_path):
|
||||
"""No-guest regression: the canned-response queue remains parse +
|
||||
narrative + 2 state-updates. The interjection path is bypassed because
|
||||
the chat has no guest, so ``detect_interjection`` is NOT invoked.
|
||||
Ends with one user_turn, one assistant_turn, two edge_updates, and a
|
||||
single ``memory_written``.
|
||||
"""
|
||||
_seed(tmp_path / "test.db")
|
||||
canned_parse = json.dumps(
|
||||
{"segments": [{"kind": "dialogue", "text": "hello"}]}
|
||||
)
|
||||
mock = _override_llm(
|
||||
[canned_parse, "Hi there.", _zero_state(), _zero_state()]
|
||||
)
|
||||
try:
|
||||
response = app_state_setup.post(
|
||||
"/chats/chat_bot_a/turns", data={"prose": "hello"}
|
||||
)
|
||||
assert response.status_code == 204
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
|
||||
# No guest -> no interjection classifier call -> queue fully drained.
|
||||
assert mock._canned == []
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
cur = conn.execute(
|
||||
"SELECT kind FROM event_log "
|
||||
"WHERE kind IN ('user_turn', 'assistant_turn', 'edge_update', "
|
||||
" 'memory_written') ORDER BY id"
|
||||
)
|
||||
kinds = [r[0] for r in cur.fetchall()]
|
||||
user_turns = [k for k in kinds if k == "user_turn"]
|
||||
assistant_turns = [k for k in kinds if k == "assistant_turn"]
|
||||
edge_updates_after_seed = [k for k in kinds if k == "edge_update"]
|
||||
memory_writes = [k for k in kinds if k == "memory_written"]
|
||||
assert len(user_turns) == 1
|
||||
assert len(assistant_turns) == 1
|
||||
# Seed adds exactly one edge_update (bot_a -> you); the post-turn
|
||||
# pass adds two more for a total of three.
|
||||
assert len(edge_updates_after_seed) == 3
|
||||
assert len(memory_writes) == 1
|
||||
|
||||
|
||||
def test_multi_bot_turn_no_interjection(app_state_setup, tmp_path):
|
||||
"""Chat has a guest; ``detect_interjection`` returns False. Verify:
|
||||
1 user_turn + 1 assistant_turn + 6 *post-turn* edge_updates + 2
|
||||
memory_written events. Single turn_html broadcast.
|
||||
|
||||
Canned queue (10 calls):
|
||||
1. parse_turn
|
||||
2. narrative stream (primary, addressee = host because the prose
|
||||
doesn't name the guest)
|
||||
3-8. 6 state-update calls (one per directed pair across {you,
|
||||
bot_a, bot_b})
|
||||
9. detect_interjection -> should_interject=False
|
||||
10. detect_scene_close -> should_close=False
|
||||
"""
|
||||
_seed_chat_with_guest(tmp_path / "test.db")
|
||||
canned_parse = json.dumps(
|
||||
{"segments": [{"kind": "dialogue", "text": "hello room"}]}
|
||||
)
|
||||
canned = [
|
||||
canned_parse,
|
||||
"Greetings.",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_interject": False, "reason": "calm"}),
|
||||
json.dumps({"should_close": False, "reason": "no signal"}),
|
||||
]
|
||||
mock = _override_llm(canned)
|
||||
try:
|
||||
response = app_state_setup.post(
|
||||
"/chats/chat_bot_a/turns", data={"prose": "hello room"}
|
||||
)
|
||||
assert response.status_code == 204
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
# All 10 canned slots should have been consumed.
|
||||
assert mock._canned == []
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
# Count post-turn edge_updates (i.e. those after the latest
|
||||
# assistant_turn id).
|
||||
max_at = conn.execute(
|
||||
"SELECT MAX(id) FROM event_log WHERE kind = 'assistant_turn'"
|
||||
).fetchone()[0]
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log "
|
||||
"WHERE kind = 'edge_update' AND id > ?",
|
||||
(max_at,),
|
||||
)
|
||||
post_turn_edge_updates = cur.fetchone()[0]
|
||||
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'user_turn'"
|
||||
)
|
||||
user_turn_count = cur.fetchone()[0]
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'assistant_turn'"
|
||||
)
|
||||
assistant_turn_count = cur.fetchone()[0]
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'memory_written'"
|
||||
)
|
||||
memory_count = cur.fetchone()[0]
|
||||
|
||||
assert user_turn_count == 1
|
||||
assert assistant_turn_count == 1
|
||||
assert post_turn_edge_updates == 6
|
||||
assert memory_count == 2
|
||||
|
||||
|
||||
def test_multi_bot_turn_with_interjection(app_state_setup, tmp_path):
|
||||
"""Chat has a guest; ``detect_interjection`` returns True. Verify:
|
||||
1 user_turn + 2 assistant_turns + (6 + 6) post-turn edge_updates +
|
||||
4 memory_written events.
|
||||
|
||||
Canned queue (17 calls):
|
||||
1. parse_turn
|
||||
2. narrative stream (primary)
|
||||
3-8. 6 state-update calls (post-primary)
|
||||
9. detect_interjection -> should_interject=True
|
||||
10. narrative stream (interjection)
|
||||
11-16. 6 state-update calls (post-interjection)
|
||||
17. detect_scene_close -> should_close=False
|
||||
"""
|
||||
_seed_chat_with_guest(tmp_path / "test.db")
|
||||
canned_parse = json.dumps(
|
||||
{"segments": [{"kind": "dialogue", "text": "tell me"}]}
|
||||
)
|
||||
canned = [
|
||||
canned_parse,
|
||||
"Primary beat.",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_interject": True, "reason": "jealous"}),
|
||||
"Interjection beat!",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_close": False, "reason": "no signal"}),
|
||||
]
|
||||
mock = _override_llm(canned)
|
||||
try:
|
||||
response = app_state_setup.post(
|
||||
"/chats/chat_bot_a/turns", data={"prose": "tell me"}
|
||||
)
|
||||
assert response.status_code == 204
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
assert mock._canned == []
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'assistant_turn'"
|
||||
)
|
||||
assistant_count = cur.fetchone()[0]
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'memory_written'"
|
||||
)
|
||||
memory_count = cur.fetchone()[0]
|
||||
# All edge_updates after the FIRST assistant_turn are post-turn.
|
||||
first_at = conn.execute(
|
||||
"SELECT MIN(id) FROM event_log WHERE kind = 'assistant_turn'"
|
||||
).fetchone()[0]
|
||||
post_turn_edges = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log "
|
||||
"WHERE kind = 'edge_update' AND id > ?",
|
||||
(first_at,),
|
||||
).fetchone()[0]
|
||||
|
||||
# Both assistant_turn payloads should reference the same user_turn
|
||||
# and the second one tags ``interjection_of`` the first speaker.
|
||||
rows = conn.execute(
|
||||
"SELECT payload_json FROM event_log "
|
||||
"WHERE kind = 'assistant_turn' ORDER BY id"
|
||||
).fetchall()
|
||||
first_payload = json.loads(rows[0][0])
|
||||
second_payload = json.loads(rows[1][0])
|
||||
|
||||
assert assistant_count == 2
|
||||
assert memory_count == 4
|
||||
assert post_turn_edges == 12
|
||||
assert first_payload["text"] == "Primary beat."
|
||||
assert second_payload["text"] == "Interjection beat!"
|
||||
# The silent witness is the bot that wasn't the primary addressee.
|
||||
assert second_payload["interjection_of"] == first_payload["speaker_id"]
|
||||
assert second_payload["speaker_id"] != first_payload["speaker_id"]
|
||||
assert first_payload["user_turn_id"] == second_payload["user_turn_id"]
|
||||
|
||||
|
||||
def test_multi_bot_turn_scene_close_writes_per_pov_summaries(
|
||||
app_state_setup, tmp_path
|
||||
):
|
||||
"""Chat has a guest, prose hard-signals a scene close, classifier
|
||||
confirms. Verify a ``scene_closed`` event lands and per-POV summary
|
||||
rewrites fire for both bots (memory.pov_summary changes for each).
|
||||
Interjection short-circuits at False so the queue stays compact.
|
||||
|
||||
Canned queue (12 calls):
|
||||
1. parse_turn
|
||||
2. narrative stream (primary)
|
||||
3-8. 6 state-update calls
|
||||
9. detect_interjection -> False (no follow-on stream)
|
||||
10. detect_scene_close -> True
|
||||
11. apply_scene_close_summary host POV
|
||||
12. apply_scene_close_summary guest POV
|
||||
"""
|
||||
_seed_chat_with_guest(tmp_path / "test.db")
|
||||
canned_parse = json.dumps(
|
||||
{
|
||||
"segments": [
|
||||
{"kind": "narration", "text": "we are done here, fade out"}
|
||||
]
|
||||
}
|
||||
)
|
||||
pov_payload = json.dumps(
|
||||
{
|
||||
"summary": "BotA noticed the day winding down.",
|
||||
"knowledge_facts": [],
|
||||
"relationship_summary": "warmer",
|
||||
}
|
||||
)
|
||||
pov_payload_guest = json.dumps(
|
||||
{
|
||||
"summary": "BotB watched the scene close.",
|
||||
"knowledge_facts": [],
|
||||
"relationship_summary": "warmer",
|
||||
}
|
||||
)
|
||||
canned = [
|
||||
canned_parse,
|
||||
"Goodnight.",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_interject": False, "reason": "calm"}),
|
||||
json.dumps({"should_close": True, "reason": "fade out signaled"}),
|
||||
pov_payload,
|
||||
pov_payload_guest,
|
||||
]
|
||||
mock = _override_llm(canned)
|
||||
try:
|
||||
response = app_state_setup.post(
|
||||
"/chats/chat_bot_a/turns", data={"prose": "we are done here, fade out"}
|
||||
)
|
||||
assert response.status_code == 204
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
assert mock._canned == []
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
cur = conn.execute(
|
||||
"SELECT COUNT(*) FROM event_log WHERE kind = 'scene_closed'"
|
||||
)
|
||||
scene_close_count = cur.fetchone()[0]
|
||||
# One memory_pov_summary manual_edit per witness.
|
||||
cur = conn.execute(
|
||||
"SELECT payload_json FROM event_log WHERE kind = 'manual_edit'"
|
||||
)
|
||||
manual_edits = [json.loads(r[0]) for r in cur.fetchall()]
|
||||
pov_edits = [
|
||||
e for e in manual_edits
|
||||
if e.get("target_kind") == "memory_pov_summary"
|
||||
]
|
||||
# After the rewrite, bot_a's scene-1 memory carries the host POV
|
||||
# and bot_b's scene-1 memory carries the guest POV.
|
||||
host_pov = conn.execute(
|
||||
"SELECT pov_summary FROM memories WHERE owner_id = ? AND scene_id = 1",
|
||||
("bot_a",),
|
||||
).fetchone()
|
||||
guest_pov = conn.execute(
|
||||
"SELECT pov_summary FROM memories WHERE owner_id = ? AND scene_id = 1",
|
||||
("bot_b",),
|
||||
).fetchone()
|
||||
|
||||
assert scene_close_count == 1
|
||||
# Two memory rewrites — one per witness.
|
||||
assert len(pov_edits) == 2
|
||||
assert host_pov is not None and "BotA noticed" in host_pov[0]
|
||||
assert guest_pov is not None and "BotB watched" in guest_pov[0]
|
||||
|
||||
|
||||
def test_addressee_detection_routes_to_named_bot(app_state_setup, tmp_path):
|
||||
"""Prose that names the guest by name routes the primary turn to the
|
||||
guest. Interjection (when fired) makes the host the silent witness
|
||||
and the second assistant_turn carries the host as speaker.
|
||||
|
||||
Canned queue: same shape as the with-interjection test (17 calls,
including the trailing scene_close decision).
|
||||
"""
|
||||
_seed_chat_with_guest(tmp_path / "test.db")
|
||||
canned_parse = json.dumps(
|
||||
{"segments": [{"kind": "dialogue", "text": "BotB, what do you think?"}]}
|
||||
)
|
||||
canned = [
|
||||
canned_parse,
|
||||
"BotB pondering.",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_interject": True, "reason": "host wants in"}),
|
||||
"BotA chiming in.",
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
_zero_state(), _zero_state(), _zero_state(),
|
||||
json.dumps({"should_close": False, "reason": "no signal"}),
|
||||
]
|
||||
mock = _override_llm(canned)
|
||||
try:
|
||||
response = app_state_setup.post(
|
||||
"/chats/chat_bot_a/turns",
|
||||
data={"prose": "BotB, what do you think?"},
|
||||
)
|
||||
assert response.status_code == 204
|
||||
finally:
|
||||
app.dependency_overrides.clear()
|
||||
assert mock._canned == []
|
||||
|
||||
with open_db(tmp_path / "test.db") as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT payload_json FROM event_log "
|
||||
"WHERE kind = 'assistant_turn' ORDER BY id"
|
||||
).fetchall()
|
||||
primary_payload = json.loads(rows[0][0])
|
||||
interjection_payload = json.loads(rows[1][0])
|
||||
|
||||
# Primary speaker is the guest because the prose names BotB and not
|
||||
# BotA (case-insensitive whole-word match).
|
||||
assert primary_payload["speaker_id"] == "bot_b"
|
||||
# Interjection follow-on goes to the silent witness — the host.
|
||||
assert interjection_payload["speaker_id"] == "bot_a"
|
||||
assert interjection_payload["interjection_of"] == "bot_b"
|
||||
|
||||
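The multi-entity turn tests above rely on two rules stated in the README change: addressee detection is a whole-word, case-insensitive name match where the host gets the floor if both names or neither match, and state updates fan out over the 6 directed pairs of {you, host, guest}. The sketch below is one way to express both rules, plus the canned-queue arithmetic the docstrings enumerate. `detect_addressee_sketch`, `directed_pairs`, and `expected_llm_calls` are hypothetical helpers, not functions from the codebase.

```python
import re
from itertools import permutations


def _mentions(prose: str, name: str) -> bool:
    # Whole-word, case-insensitive match; re.escape guards odd bot names.
    return re.search(rf"\b{re.escape(name)}\b", prose, flags=re.IGNORECASE) is not None


def detect_addressee_sketch(prose: str, host: tuple, guest: tuple) -> str:
    """host and guest are (bot_id, display_name) tuples; returns a bot_id."""
    host_hit = _mentions(prose, host[1])
    guest_hit = _mentions(prose, guest[1])
    # Only an unambiguous guest mention moves the floor away from the host.
    if guest_hit and not host_hit:
        return guest[0]
    return host[0]


def directed_pairs(entities):
    # Ordered pairs (src, tgt) with src != tgt: n * (n - 1) of them.
    return list(permutations(entities, 2))


def expected_llm_calls(interjects: bool, n_entities: int = 3) -> int:
    """Queue length for a guest-present turn with no scene close."""
    per_beat = 1 + n_entities * (n_entities - 1)  # narrative + state updates
    calls = 1 + per_beat + 1                      # parse, primary beat, detect_interjection
    if interjects:
        calls += per_beat                         # interjection beat
    return calls + 1                              # detect_scene_close


host, guest = ("bot_a", "BotA"), ("bot_b", "BotB")
print(detect_addressee_sketch("BotB, what do you think?", host, guest))
print(len(directed_pairs(["you", "bot_a", "bot_b"])))
print(expected_llm_calls(interjects=False), expected_llm_calls(interjects=True))
```

The call counts line up with the canned lists in the tests: 10 entries without an interjection and 17 with one. A confirmed scene close adds one summary call per witness bot on top of that.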
@@ -0,0 +1,269 @@
|
||||
"""Task 46 — Witness filter coverage for multi-entity scenarios.
|
||||
|
||||
The witness filter is enforced at the SQL layer in
|
||||
``chat.state.memory.search_memories``. Each memory row carries three witness
|
||||
flags ``(witness_you, witness_host, witness_guest)``. A retrieval is scoped
|
||||
to a *bot's own memory store* via ``owner_id`` and a *POV role*
|
||||
(``"you"``/``"host"``/``"guest"``); the SQL filter is
|
||||
``WHERE owner_id = ? AND witness_<role> = 1``.
|
||||
|
||||
This module exercises the cross-witness scenarios called out in §"Witnessed-By
|
||||
Tracking" (rp-engine-design.md L108-L116) — multi-witness masks, secondhand
|
||||
provenance, and the per-owner separation that prevents bleed between bots'
|
||||
private memory stores.
|
||||
|
||||
This module is tests-only: ``search_memories`` already accepts ``witness_role``,
|
||||
so the cases land green without any production-code change. The host-only
|
||||
hardcode in ``chat/services/prompt.py`` is a separate concern (the v1 prompt
|
||||
builder always queries from the host POV); these tests pin the underlying
|
||||
retrieval contract so a future viewer-aware caller has something to lean on.
|
||||
"""
|
||||
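The module docstring above describes the retrieval contract as `WHERE owner_id = ? AND witness_<role> = 1`. The sketch below shows one way such a filter could be built safely with the standard `sqlite3` module. It is not the real `search_memories`: the function name, the reduced `memories` schema, and the `LIKE` match standing in for whatever text search the project uses are all assumptions.

```python
import sqlite3

# Map the POV role to a column name through a whitelist, so the role string
# is never interpolated into SQL directly.
_ROLE_COLUMNS = {
    "you": "witness_you",
    "host": "witness_host",
    "guest": "witness_guest",
}


def search_memories_sketch(conn, owner_id, witness_role, query, k=4):
    column = _ROLE_COLUMNS[witness_role]  # KeyError on an unknown role
    rows = conn.execute(
        "SELECT owner_id, pov_summary FROM memories "
        f"WHERE owner_id = ? AND {column} = 1 AND pov_summary LIKE ? "
        "ORDER BY rowid LIMIT ?",
        (owner_id, f"%{query}%", k),
    ).fetchall()
    return [{"owner_id": r[0], "pov_summary": r[1]} for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (owner_id TEXT, pov_summary TEXT, "
    "witness_you INTEGER, witness_host INTEGER, witness_guest INTEGER)"
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?, ?, ?)",
    [
        # mask [1, 1, 0]: private host moment
        ("bot_a", "BotA quietly noticed the broken vase", 1, 1, 0),
        # mask [0, 1, 1]: bot-only side scene, stored by the guest
        ("bot_b", "the bots whispered about the secret meeting", 0, 1, 1),
    ],
)
print(search_memories_sketch(conn, "bot_a", "host", "vase"))
print(search_memories_sketch(conn, "bot_a", "guest", "vase"))
```

Scoping by `owner_id` first is what keeps one bot's store from bleeding into another's; the witness column then narrows within that store.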
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
from chat.db.connection import open_db
|
||||
from chat.db.migrate import apply_migrations
|
||||
from chat.eventlog.log import append_event
|
||||
from chat.eventlog.projector import project
|
||||
from chat.state.memory import search_memories
|
||||
import chat.state.memory # noqa: F401 (registers memory_written handler)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _seed_memories(db: Path, specs: list[dict]) -> None:
|
||||
"""Apply migrations and project a list of ``memory_written`` events.
|
||||
|
||||
Each spec dict supplies the witness mask + provenance fields explicitly so
|
||||
the test can name the exact mask under test (``[you, host, guest]``).
|
||||
"""
|
||||
apply_migrations(db)
|
||||
with open_db(db) as conn:
|
||||
for spec in specs:
|
||||
payload = {
|
||||
"owner_id": spec["owner_id"],
|
||||
"chat_id": spec.get("chat_id", "chat_ab"),
|
||||
"pov_summary": spec["pov_summary"],
|
||||
"witness_you": spec["witness_you"],
|
||||
"witness_host": spec["witness_host"],
|
||||
"witness_guest": spec["witness_guest"],
|
||||
"source": spec.get("source", "direct"),
|
||||
"reliability": spec.get("reliability", 1.0),
|
||||
"significance": spec.get("significance", 1),
|
||||
"pinned": 0,
|
||||
"auto_pinned": 0,
|
||||
}
|
||||
append_event(conn, kind="memory_written", payload=payload)
|
||||
project(conn)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Scenario 1 — mask [1, 1, 0]: visible to host, NOT to guest.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_witness_1_1_0_visible_to_host_not_guest(tmp_path):
|
||||
"""A private host moment ([you=1, host=1, guest=0]) must surface for the
|
||||
host's own POV query and stay hidden when the guest queries the same
|
||||
memory store."""
|
||||
db = tmp_path / "t.db"
|
||||
_seed_memories(
|
||||
db,
|
||||
[
|
||||
{
|
||||
"owner_id": "bot_a",
|
||||
"pov_summary": "BotA quietly noticed the broken vase",
|
||||
"witness_you": 1,
|
||||
"witness_host": 1,
|
||||
"witness_guest": 0,
|
||||
},
|
||||
],
|
||||
)
|
||||
with open_db(db) as conn:
|
||||
host_hits = search_memories(conn, "bot_a", "host", "vase", k=4)
|
||||
assert len(host_hits) == 1
|
||||
assert host_hits[0]["pov_summary"] == "BotA quietly noticed the broken vase"
|
||||
|
||||
# Same store, guest POV: filtered out (witness_guest = 0).
|
||||
guest_hits = search_memories(conn, "bot_a", "guest", "vase", k=4)
|
||||
assert guest_hits == []
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
# Scenario 2 — mask [0, 1, 1]: visible to BOTH host and guest queries.
# ---------------------------------------------------------------------------


def test_witness_0_1_1_visible_to_both_host_and_guest(tmp_path):
    """A bot-only side scene ([you=0, host=1, guest=1]) must surface from
    *both* POV queries against bot stores that recorded it."""
    db = tmp_path / "t.db"
    _seed_memories(
        db,
        [
            # bot_a recorded the moment from its own (host) POV.
            {
                "owner_id": "bot_a",
                "pov_summary": "the bots whispered about the secret meeting",
                "witness_you": 0,
                "witness_host": 1,
                "witness_guest": 1,
            },
            # bot_b recorded the same moment from its own (guest) POV.
            {
                "owner_id": "bot_b",
                "pov_summary": "the bots whispered about the secret meeting",
                "witness_you": 0,
                "witness_host": 1,
                "witness_guest": 1,
            },
        ],
    )
    with open_db(db) as conn:
        host_hits = search_memories(conn, "bot_a", "host", "secret", k=4)
        assert len(host_hits) == 1
        assert host_hits[0]["owner_id"] == "bot_a"

        guest_hits = search_memories(conn, "bot_b", "guest", "secret", k=4)
        assert len(guest_hits) == 1
        assert guest_hits[0]["owner_id"] == "bot_b"

        # Cross-check the "you" POV doesn't pick it up — witness_you = 0.
        you_hits_a = search_memories(conn, "bot_a", "you", "secret", k=4)
        you_hits_b = search_memories(conn, "bot_b", "you", "secret", k=4)
        assert you_hits_a == []
        assert you_hits_b == []


# ---------------------------------------------------------------------------
# Scenario 3 — mask [1, 0, 0]: degenerate "you-only" memory; filtered out for
# both bot queries because neither host nor guest witness flag is set.
# ---------------------------------------------------------------------------


def test_witness_1_0_0_filtered_out_for_bot_queries(tmp_path):
    """`you` doesn't have a memory store in v1, so a row with only
    ``witness_you = 1`` is degenerate. From either bot POV the filter must
    drop it (it would only ever surface via a ``"you"`` role query, which
    isn't a path the v1 prompt builder uses)."""
    db = tmp_path / "t.db"
    _seed_memories(
        db,
        [
            {
                "owner_id": "bot_a",
                "pov_summary": "you alone caught the slip of the tongue",
                "witness_you": 1,
                "witness_host": 0,
                "witness_guest": 0,
            },
        ],
    )
    with open_db(db) as conn:
        host_hits = search_memories(conn, "bot_a", "host", "tongue", k=4)
        guest_hits = search_memories(conn, "bot_a", "guest", "tongue", k=4)
        assert host_hits == []
        assert guest_hits == []

        # And a ``you`` POV query still finds it — the row exists, just isn't
        # reachable from either of the v1 bot retrieval paths.
        you_hits = search_memories(conn, "bot_a", "you", "tongue", k=4)
        assert len(you_hits) == 1


# ---------------------------------------------------------------------------
# Scenario 4 — secondhand source carries reduced reliability and is still
# witness-filtered. Per design.md L114: "BotA tells BotB about it secondhand:
# creates a new memory in BotB's store flagged [0,0,1] with source: botA".
# We park the mask at [0, 0, 1] (you=0, host=0, guest=1) so that bot_b's
# guest-POV query reaches it, and assert reliability < 1.0 surfaces.
# ---------------------------------------------------------------------------


def test_secondhand_memory_visible_with_reduced_reliability(tmp_path):
    """A secondhand memory ([0, 0, 1] in bot_b's store, ``source = "told_by:bot_a"``)
    must surface for bot_b's guest-POV query and carry ``reliability < 1.0``
    so downstream callers can tag it as hearsay."""
    db = tmp_path / "t.db"
    _seed_memories(
        db,
        [
            {
                "owner_id": "bot_b",
                "pov_summary": "BotA mentioned a fight at the dockyard",
                "witness_you": 0,
                "witness_host": 0,
                "witness_guest": 1,
                "source": "told_by:bot_a",
                "reliability": 0.6,
            },
        ],
    )
    with open_db(db) as conn:
        hits = search_memories(conn, "bot_b", "guest", "dockyard", k=4)
        assert len(hits) == 1
        m = hits[0]
        assert m["source"] == "told_by:bot_a"
        assert m["reliability"] < 1.0
        assert m["reliability"] == 0.6

        # And it's *not* visible from bot_b's host-POV query — bot_b is the
        # guest in this chat, not the host. The mask enforces that.
        host_hits = search_memories(conn, "bot_b", "host", "dockyard", k=4)
        assert host_hits == []


# ---------------------------------------------------------------------------
# Scenario 5 — owner separation. Two bots both have [1, 1, 1] memories about
# the same event, but the queries are scoped per owner store and must not
# bleed across owners.
# ---------------------------------------------------------------------------


def test_owner_separation_no_cross_owner_bleed(tmp_path):
    """Each bot only sees memories it OWNS, regardless of witness flags. A
    fully-witnessed memory in ``bot_a``'s store must not leak into a query
    against ``bot_b``'s store and vice versa."""
    db = tmp_path / "t.db"
    _seed_memories(
        db,
        [
            {
                "owner_id": "bot_a",
                "pov_summary": "the lighthouse beam swept across all three of them",
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 1,
                "significance": 2,
            },
            {
                "owner_id": "bot_b",
                "pov_summary": "the lighthouse beam swept across all three of them",
                "witness_you": 1,
                "witness_host": 1,
                "witness_guest": 1,
                "significance": 2,
            },
        ],
    )
    with open_db(db) as conn:
        # bot_a's host-POV query: only bot_a's row.
        a_hits = search_memories(conn, "bot_a", "host", "lighthouse", k=4)
        assert len(a_hits) == 1
        assert a_hits[0]["owner_id"] == "bot_a"

        # bot_b's guest-POV query: only bot_b's row.
        b_hits = search_memories(conn, "bot_b", "guest", "lighthouse", k=4)
        assert len(b_hits) == 1
        assert b_hits[0]["owner_id"] == "bot_b"

        # Even though bot_a's memory is fully witnessed, switching to bot_b's
        # store with bot_a's POV role still confines us to bot_b's rows.
        cross_hits = search_memories(conn, "bot_b", "host", "lighthouse", k=4)
        assert len(cross_hits) == 1
        assert cross_hits[0]["owner_id"] == "bot_b"
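Reviewer note: the five scenarios above pin down two rules, namely that a query is scoped to one owner's store, and that a row is returned only when the witness flag for the querying POV role is set. The real `search_memories` is not shown in this diff, so the following is only a minimal sketch of that filter under assumptions: the `memories` table layout, the `sketch_search` and `make_demo_db` names, and the `LIKE` text match are all hypothetical stand-ins (the project may well use a different index or ranking).

```python
import sqlite3

# Hypothetical whitelist: POV role -> witness column. Looking the column up
# here (rather than formatting the caller's string into SQL) keeps the
# dynamic column name safe.
_WITNESS_COLUMN = {
    "you": "witness_you",
    "host": "witness_host",
    "guest": "witness_guest",
}


def sketch_search(conn, owner_id, pov_role, needle, k=4):
    """Illustrative stand-in for search_memories; not the project's code."""
    column = _WITNESS_COLUMN[pov_role]  # raises KeyError for an unknown role
    rows = conn.execute(
        "SELECT owner_id, pov_summary, reliability FROM memories "
        f"WHERE owner_id = ? AND {column} = 1 AND pov_summary LIKE ? "
        "ORDER BY id LIMIT ?",
        (owner_id, f"%{needle}%", k),
    ).fetchall()
    return [dict(row) for row in rows]


def make_demo_db():
    """Build an in-memory table with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE memories ("
        "id INTEGER PRIMARY KEY, owner_id TEXT, pov_summary TEXT, "
        "witness_you INTEGER, witness_host INTEGER, witness_guest INTEGER, "
        "reliability REAL)"
    )
    conn.executemany(
        "INSERT INTO memories (owner_id, pov_summary, witness_you, "
        "witness_host, witness_guest, reliability) VALUES (?, ?, ?, ?, ?, ?)",
        [
            # Scenario 1 mask [1, 1, 0]: host sees it, guest does not.
            ("bot_a", "BotA quietly noticed the broken vase", 1, 1, 0, 1.0),
            # Scenario 4 mask [0, 0, 1]: secondhand row in bot_b's store.
            ("bot_b", "BotA mentioned a fight at the dockyard", 0, 0, 1, 0.6),
        ],
    )
    return conn
```

With this sketch, `sketch_search(conn, "bot_a", "guest", "vase")` returns an empty list because `witness_guest` is 0 on that row, which mirrors the Scenario 1 assertion; owner scoping falls out of the `owner_id = ?` clause, mirroring Scenario 5.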
+2
-2
@@ -324,11 +324,11 @@ def test_get_scene_returns_none_for_missing(tmp_path):
         assert active_scene(conn, "chat_missing") is None
 
 
-def test_schema_version_after_migration_is_7(tmp_path):
+def test_schema_version_after_migration_is_8(tmp_path):
     db = tmp_path / "t.db"
     apply_migrations(db)
     with open_db(db) as conn:
         row = conn.execute(
             "SELECT value FROM meta WHERE key = 'schema_version'"
         ).fetchone()
-        assert int(row[0]) == 7
+        assert int(row[0]) == 8