feat: narrative format — third-person asterisk-action style with concrete-beat example

Rewrites the closing instruction in assemble_narrative_prompt to enforce the asterisk-action / interleaved-beat format: actions wrapped in *asterisks* and written in third person, dialogue as plain text between beats (no quote marks), 2-4 short concrete beats per response, and no inner monologue or stage-direction adverbs. Includes a one-line worked example so the model has a concrete target.

Before this change the model was producing first-person prose blocks like 'I stare at you... "Well, that's direct," I murmur'. The target style is short interleaved beats: '*She turns with soapy hands to cup your face* That's how I know it's real... *She kisses you softly* You love me when I'm messy...'

Drops narrative_max_tokens 400 -> 250 so the model can't drift into multi-paragraph monologue.

Bumps three test budgets to fit the larger closing instruction (it grew ~80 -> ~200 tokens). The tests still exercise the same trim-order behavior, just with proportionally larger budgets.
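
For reviewers who want to spot-check model output against the format rules above, here is a minimal sketch of a checker. It is not part of this commit: `split_beats` and `format_problems` are hypothetical helper names, and the sketch assumes that a "beat" means one asterisk-wrapped action, which the commit message does not spell out. It only covers the mechanically checkable rules (beat count, third-person actions, no quote marks in dialogue), not inner monologue or adverb use.

```python
import re

# Hypothetical helpers for illustration only; nothing here ships in this commit.
BEAT_RE = re.compile(r"\*([^*]+)\*")
FIRST_PERSON_RE = re.compile(r"\b(I|me|my|myself)\b", re.IGNORECASE)


def split_beats(text):
    """Split a response into ('action', ...) and ('dialogue', ...) parts, in order."""
    parts = []
    pos = 0
    for match in BEAT_RE.finditer(text):
        dialogue = text[pos:match.start()].strip()
        if dialogue:
            parts.append(("dialogue", dialogue))
        parts.append(("action", match.group(1).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        parts.append(("dialogue", tail))
    return parts


def format_problems(text, min_beats=2, max_beats=4):
    """Return a list of human-readable format violations (empty if none found)."""
    problems = []
    parts = split_beats(text)
    actions = [content for kind, content in parts if kind == "action"]
    # Assumption: "2-4 beats" counts asterisk-wrapped actions only.
    if not (min_beats <= len(actions) <= max_beats):
        problems.append(
            f"expected {min_beats}-{max_beats} action beats, found {len(actions)}"
        )
    # Actions should be third person; dialogue may freely use "I".
    if any(FIRST_PERSON_RE.search(action) for action in actions):
        problems.append("action beat written in first person")
    # Dialogue should be plain text with no double-quote marks.
    if any('"' in content for kind, content in parts if kind == "dialogue"):
        problems.append("dialogue uses quote marks")
    return problems


good = (
    "*She turns with soapy hands to cup your face* That's how I know it's real... "
    "*She kisses you softly* You love me when I'm messy..."
)
bad = 'I stare at you... "Well, that\'s direct," I murmur'

print(format_problems(good))
print(format_problems(bad))
```

Run against the two samples quoted in this message, the target-style sample yields no problems, while the old first-person sample is flagged for having no action beats and for quoted dialogue.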
2026-04-27 12:21:03 -04:00
parent fe9c497038
commit d656ee8805
3 changed files with 42 additions and 16 deletions
@@ -565,8 +565,12 @@ def test_tight_budget_drops_guest_activity_bullet_first(tmp_path):
            speaker_bot_id="bot_a",
            recent_dialogue=dialogue,
            retrieved_memory_summaries=[],
-            budget_soft=250,
-            budget_hard=340,
+            # Closing instruction grew with the asterisk-format spec
+            # (Phase 4.6 narrative-style fix). Budget bumped enough to
+            # accommodate the larger MUST floor while still exercising
+            # the SHOULD-tier trim path.
+            budget_soft=440,
+            budget_hard=460,
        )
    body = msgs[0].content
    # Speaker bullet survives (MUST-tier floor).
@@ -696,13 +700,15 @@ def test_nice_trim_order_documented(tmp_path):
        # Soft tuned so the all-NICE config (with the heavy previous
        # scene summary) overflows, but dropping just previous-scene
        # fits comfortably. Hard set high so SHOULD-tier never trims.
+        # Soft bumped (was 400) to make room for the larger closing
+        # instruction shipped with the asterisk-format spec.
        msgs = assemble_narrative_prompt(
            conn,
            chat_id="chat_bot_a",
            speaker_bot_id="bot_a",
            recent_dialogue=dialogue,
            retrieved_memory_summaries=memories,
-            budget_soft=400,
+            budget_soft=540,
            budget_hard=8000,
        )
    body = msgs[0].content
@@ -748,8 +754,12 @@ def test_assemble_with_tight_budget_drops_guest_activity_first(tmp_path):
            # group node + other edges) push it well over 380. budget_hard
            # is set just above MUST core so SHOULD-tier blocks must be
            # trimmed away.
-            budget_soft=250,
-            budget_hard=340,
+            # Closing instruction grew with the asterisk-format spec
+            # (Phase 4.6 narrative-style fix). Budget bumped enough to
+            # accommodate the larger MUST floor while still exercising
+            # the SHOULD-tier trim path.
+            budget_soft=440,
+            budget_hard=460,
        )
    body = msgs[0].content
    # MUST: speaker identity, edge to addressee, last 4 dialogue turns.
@@ -759,10 +769,11 @@ def test_assemble_with_tight_budget_drops_guest_activity_first(tmp_path):
        assert f"line-{i:02d}" in body
    # Guest activity (SHOULD-tier) must be dropped under tight budget.
    assert "smirking-distinctively" not in body
-    # Token budget honoured.
+    # Token budget honoured. Bumped (was 340) for the larger closing
+    # instruction that ships the asterisk-format spec.
    import tiktoken
    enc = tiktoken.get_encoding("cl100k_base")
-    assert len(enc.encode(body)) <= 340
+    assert len(enc.encode(body)) <= 460


 # ---------------------------------------------------------------------------