feat: narrative format — third-person asterisk-action style with concrete-beat example

Rewrites the closing instruction in assemble_narrative_prompt to enforce
the asterisk-action / interleaved-beat format: actions wrapped in
*asterisks* in third person, dialogue as plain text between beats (no
quote marks), 2-4 short concrete beats per response, no inner monologue
or stage-direction adverbs. Includes a one-line worked example so the
model has a concrete target.

Previously the model produced first-person prose blocks like 'I stare
at you... "Well, that's direct," I murmur'. The target style is short
interleaved beats:
'*She turns with soapy hands to cup your face* That's how I know it's
real... *She kisses you softly* You love me when I'm messy...'
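
To make the target format concrete, here is a minimal sketch of how a response could be split into beats and checked against the rules above. It is not part of this commit: `split_beats`, `format_problems`, and the first-person word list are hypothetical names and heuristics chosen for illustration, and the check deliberately does not enforce the 2-4 beat count.

```python
import re

# Hypothetical helpers for illustration only; nothing here exists in the repo.
# A beat is either an *asterisk-wrapped* action or the plain text between actions.
BEAT_RE = re.compile(r"\*([^*]+)\*|([^*]+)")

# Rough heuristic for first-person narration inside an action beat.
FIRST_PERSON_RE = re.compile(r"\b(I|me|my|mine|myself)\b", re.IGNORECASE)


def split_beats(response: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs where kind is 'action' or 'speech'."""
    beats = []
    for match in BEAT_RE.finditer(response):
        action, speech = match.group(1), match.group(2)
        if action is not None:
            beats.append(("action", action.strip()))
        elif speech.strip():
            beats.append(("speech", speech.strip()))
    return beats


def format_problems(response: str) -> list[str]:
    """List the ways a response departs from the asterisk-action format."""
    problems = []
    beats = split_beats(response)
    for kind, text in beats:
        if kind == "action" and FIRST_PERSON_RE.search(text):
            problems.append(f"first person inside asterisks: {text!r}")
        if kind == "speech" and '"' in text:
            problems.append(f"quote marks in dialogue: {text!r}")
    if not any(kind == "action" for kind, _ in beats):
        problems.append("no asterisk-wrapped action beats")
    return problems


# The two examples quoted in this commit message.
GOOD = (
    "*She turns with soapy hands to cup your face* That's how I know it's "
    "real... *She kisses you softly* You love me when I'm messy..."
)
BAD = 'I stare at you... "Well, that\'s direct," I murmur'

if __name__ == "__main__":
    print(split_beats(GOOD))
    print(format_problems(BAD))
```

A check like this could only ever be a rough lint on sampled output; the prompt text itself remains the actual enforcement mechanism.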

Drops narrative_max_tokens 400 -> 250 so the model can't drift into
multi-paragraph monologue. Bumps three test budgets to fit the larger
closing (closing grew ~80 -> ~200 tokens; tests still exercise the
same trim-order behavior, just with proportionally larger budgets).
Joseph Doherty
2026-04-27 12:21:03 -04:00
parent fe9c497038
commit d656ee8805
3 changed files with 42 additions and 16 deletions
+7 -3
@@ -23,9 +23,13 @@ class Settings(BaseModel):
     retrieval_k: int = 4
     narrative_budget_hard: int = 8000
     narrative_budget_soft: int = 6000
-    # Cap on each generated bot response. ~400 tokens ≈ 12 short paragraphs.
-    # Bump if you want longer scenes; drop to 200 for terse banter.
-    narrative_max_tokens: int = 400
+    # Cap on each generated bot response. The asterisk-action format
+    # (see ``_closing_instruction`` in chat/services/prompt.py) targets
+    # 2-4 short interleaved action+dialogue beats; ~250 tokens fits that
+    # without leaving room for the model to drift into multi-paragraph
+    # inner-monologue prose. Bump back up if you want longer scenes;
+    # drop to 150 for very terse banter.
+    narrative_max_tokens: int = 250
     # Sampling temperature for narrative generation. 0.7 = grounded /
     # consistent; 0.85 = creative-but-in-character (default); 1.0 = wide
     # variety, can drift; >1.0 = often off-the-rails.
+17 -6
@@ -327,12 +327,23 @@ def _build_open_threads_block(threads: list[dict]) -> str | None:
 def _closing_instruction(speaker_name: str, addressee_name: str) -> str:
     return (
-        f"Continue the scene as {speaker_name}, in their voice, responding "
-        "naturally. Use *asterisks* for actions and quotes for dialogue. "
-        f"Stay in character. Do not narrate {addressee_name}'s actions or "
-        "thoughts. "
-        "Keep your response to a single beat — one or two short paragraphs "
-        "at most. Don't monologue; leave room for the other person to react."
+        f"Continue as {speaker_name}. Format strictly:\n"
+        f"- Wrap actions and gestures in *asterisks*, third person "
+        f"({speaker_name}/she/he/they) — never first person, never inner "
+        "thoughts inside asterisks.\n"
+        "- Speak dialogue as plain text between action beats, no quote "
+        "marks. Keep speech fragmented, not paragraphs.\n"
+        "- Interleave 2-4 short beats (action, brief speech, action, brief "
+        "speech). Each beat is one concrete gesture or sensory image — no "
+        "explanation, no inner monologue, no stage-direction adverbs.\n"
+        "- Trailing ellipses (...) are fine for emotional weight.\n"
+        "Example: *She turns with soapy hands to cup your face* That's how "
+        "I know it's real... *She kisses you softly* You love me when I'm "
+        "messy... *She rests her forehead against yours* ...and every "
+        "moment in between.\n"
+        f"Show only what {addressee_name} could externally observe of "
+        f"{speaker_name}; never narrate {addressee_name}'s actions or "
+        "thoughts. One response — leave room to react."
     )