Replace the substring-based _detect_addressee_id helper with a
classifier call in the multi-entity case. The substring helper is kept
as a fast path for the no-guest case: when only one bot is present no
LLM round-trip is needed, which preserves throughput.
- New service chat/services/addressee.py built on the existing
classifier wrapper. AddresseeDecision carries addressee_id,
confidence, and reason; on classifier failure it falls back to the
host with reason="fallback" (graceful degradation, matching the
relationship_seed / interjection pattern).
- chat/web/turns.py post_turn now calls detect_addressee in the
multi-entity branch; 1:1 keeps the substring path.
- tests/test_addressee.py: 3 new tests (guest pick, host pick,
classifier-failure fallback).
- tests/test_turn_flow.py: existing multi-entity tests now enqueue a
canned addressee response. The addressee-routing test is updated to
assert classifier-driven routing rather than substring matching.
Rewrites post_turn for the multi-entity world:
- Addressee detection via case-insensitive whole-word match against the
guest name; defaults to host on no-match or both-match.
- Multi-entity prompt assembly: forwards guest_id so the prompt sees
the third party's activity / edges / group-node.
- Multi-witness memory write: record_turn_memory_for_present writes one
memory per present bot witness when a guest is in the room.
- Multi-pair state-update: compute_state_updates_for_present emits one
edge_update per directed pair (6 with a guest, 2 without).
- Interjection branch (T39): when a guest is present and the primary
beat completes, the silent witness may follow on. detect_interjection
decides; on True we stream a second narrative as the witness, append a
second assistant_turn linked to the same user_turn_id, and re-run the
multi-pair state update + memory write for the follow-on beat. Cancel
collapses both halves; a cancelled interjection skips its downstream
passes so we don't classifier-spam against a half-formed beat.
- Scene-close runs after both beats so apply_scene_close_summary sees
the full closing scene; T45's guest-aware summarizer handles per-POV
rewrites for each present witness.
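
The addressee rule and the pair counts above can be sketched as
follows. This is an illustration under assumptions, not the code in
chat/web/turns.py: pick_addressee and directed_pairs are hypothetical
names, "both-match" is read as the prose naming both host and guest,
and the pair counts assume the user is one of the entities.

```python
import re
from itertools import permutations


def names_whole_word(prose: str, name: str) -> bool:
    """Case-insensitive whole-word match; assumes name starts and ends with a word character."""
    return re.search(rf"\b{re.escape(name)}\b", prose, re.IGNORECASE) is not None


def pick_addressee(
    prose: str, host_id: str, host_name: str, guest_id: str, guest_name: str
) -> str:
    """Route to the guest only when the guest alone is named; otherwise the host."""
    names_guest = names_whole_word(prose, guest_name)
    names_host = names_whole_word(prose, host_name)
    if names_guest and not names_host:
        return guest_id
    return host_id  # no-match or both-match


def directed_pairs(entity_ids: list[str]) -> list[tuple[str, str]]:
    """Every ordered (source, target) pair of distinct entities."""
    return list(permutations(entity_ids, 2))
```

With three entities (user, host, guest) directed_pairs yields 3 * 2 = 6
pairs, and with two it yields 2, which matches the edge_update counts
quoted above.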
regenerate.py mirrors the prompt / memory / state-update changes for
1:1 and multi-entity scenes. Per the Phase 2 spec, interjection
regeneration is deferred to Phase 2.5; for v2, regenerate only
re-streams the addressee turn.
Tests: adds 5 cases to tests/test_turn_flow.py covering the no-guest
regression, multi-bot without interjection, multi-bot with interjection,
scene-close per-POV rewrites, and addressee routing when the prose
names a bot. Each test pins its own canned MockLLMClient queue with the call
shape documented in the docstring.
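
To illustrate the canned-queue pattern, here is a small stand-in. It is
not the repo's MockLLMClient, whose API is not shown in this message.
CannedLLM, complete, and assert_drained are hypothetical names.

```python
from collections import deque


class CannedLLM:
    """Hypothetical stand-in that replays canned responses in FIFO order."""

    def __init__(self, responses: list[str]) -> None:
        self._queue = deque(responses)
        self.calls: list[str] = []

    def complete(self, prompt: str) -> str:
        # Record the prompt so a test can assert on the call shape.
        self.calls.append(prompt)
        if not self._queue:
            raise AssertionError(
                f"unexpected LLM call #{len(self.calls)}: queue exhausted"
            )
        return self._queue.popleft()

    def assert_drained(self) -> None:
        # Leftover responses mean the code made fewer calls than the test expected.
        assert not self._queue, f"{len(self._queue)} canned responses unused"
```

Failing on both an exhausted queue and leftover responses is what lets a
test pin the exact number and order of LLM calls for a turn.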