# Debugging
The turn debugger lets you step through agent calls one at a time, inspect prompts and responses, edit prompts and replay without restarting, and capture test fixtures. It wraps agents directly, calling the same `run_narrator()`, `run_character()`, and related functions that `run_turn()` uses, so there is no separate "debugger mode" that could mask bugs.
## Quick Start
Launch the debugger through `scripts/debug_turn.py`, pointing it at a save. It loads the game state from the specified save, then pauses before each agent call so you can inspect and interact.
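A minimal invocation sketch, assuming the entry point takes the save path as a flag (the flag names here are assumptions; run the script with `--help` to confirm the actual interface):

```shell
# Hypothetical flags -- adjust to the actual CLI of scripts/debug_turn.py
python scripts/debug_turn.py --save saves/my_broken_turn/
```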
## Interactive Commands
| Command | Key | Description |
|---|---|---|
| Step | s | Execute current agent call, pause at next |
| Replay | r | Re-run current agent (same inputs, new LLM call) |
| Edit | e | Reload prompts from disk and replay |
| Inspect | i | Show prompt, response, parsed output, stats, or everything (subcommands `prompt`, `response`, `parsed`, `stats`, `all`) |
| Skip | k | Skip current agent, move to next |
| Continue | c | Run all remaining agents without pausing |
| Fixture | p | Capture prompt/response as test fixture |
| Compare | pa | Compare current response against previous replay |
| Quit | q | Abort and exit |
## Workflow: Fixing a Prompt
```mermaid
flowchart TD
    A["Run debugger with\nproblem input"] --> B["Step to\nfailing agent"]
    B --> C["Inspect prompt\n(i then prompt)"]
    C --> D["Inspect response\n(i then response)"]
    D --> E["Edit prompts.py\nin your editor"]
    E --> F["Press 'e' to\nreload + replay"]
    F --> G{"Output\ncorrect?"}
    G -->|No| E
    G -->|Yes| H["Compare (pa)\nagainst old output"]
    H --> I["Capture fixture (p)\nfor regression test"]
```

## How Edit+Replay Works
The debugger uses `importlib.reload()` to hot-reload prompt changes. It must reload both `prompts.py` and `context.py`, because `context.py` imports constants at import time.

The sequence when you press `e`:
1. User edits `src/theact/agents/prompts.py` in their editor
2. Debugger reloads the prompts module
3. Debugger reloads the context module (picks up new constants)
4. Agent call replays with the updated prompt
This gives a tight edit-test loop without restarting the process or re-running earlier agents.
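The reason both modules must be reloaded can be reproduced with two throwaway modules. This is a toy sketch, not the project's code: `prompts_demo` and `context_demo` stand in for `prompts.py` and `context.py`, with the constant copied at import time just as the document describes.

```python
import importlib
import pathlib
import sys
import tempfile

# Toy reproduction: context_demo copies a constant out of prompts_demo at
# import time, mirroring how context.py imports constants from prompts.py.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "prompts_demo.py").write_text("SYSTEM_PROMPT = 'v1'\n")
(tmp / "context_demo.py").write_text(
    "from prompts_demo import SYSTEM_PROMPT\n"
    "def build_context():\n"
    "    return SYSTEM_PROMPT\n"
)
sys.path.insert(0, str(tmp))

import context_demo
import prompts_demo

assert context_demo.build_context() == "v1"

# Simulate the user editing the prompt on disk, then hot-reload.
(tmp / "prompts_demo.py").write_text("SYSTEM_PROMPT = 'v2 (edited)'\n")
importlib.invalidate_caches()

importlib.reload(prompts_demo)
print(context_demo.build_context())  # still 'v1': the constant was copied at import time

importlib.reload(context_demo)       # re-runs the import, picking up the new value
print(context_demo.build_context())  # 'v2 (edited)'
```

Reloading only `prompts_demo` leaves the stale copy inside `context_demo`, which is exactly why the debugger reloads the context module second.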
## Fixture Capture
Pressing `p` calls `capture_fixture()`, which saves the full `AgentResult` (messages, raw response, parsed data, tokens) as YAML in `tests/fixtures/`.

These fixtures feed into `test_prompt_regression.py` — debug a problem, capture the failing case, fix the prompt, and the captured fixture becomes a regression test.
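A captured fixture might look roughly like the fragment below. This layout is illustrative only; the real keys and values are whatever `AgentResult` serializes.

```yaml
# Hypothetical fixture shape -- field names assumed, not taken from the project.
agent: narrator
messages:
  - role: system
    content: "You are the narrator..."
  - role: user
    content: "The party enters the vault."
raw_response: |
  events:
    - type: narration
      text: "The vault door groans open."
parsed:
  events:
    - type: narration
      text: "The vault door groans open."
tokens:
  prompt: 1843
  completion: 92
```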
## Replay Mode
Walk through historical turns from a save that was run with `debug=True` (uses the diagnostics filesystem):
| Key | Action |
|---|---|
| Enter | Next turn |
| p | Previous turn |
| j N | Jump to turn N |
| d A B | Diff turn A against turn B |
| q | Quit |
## Design
The debugger wraps agents, not the engine. It requires a real game save — loading game state, characters, and conversation history from a save directory. There is no mock mode. This means the debugger exercises the exact same code paths as production, so any fix validated here works in the real game loop.
## Troubleshooting
| Symptom | What to Check | Tool |
|---|---|---|
| Empty narrator response | System prompt too long | Context profiler |
| Narrator outputs prose instead of YAML | YAML hint missing or weak | Inspect prompt (i then prompt) |
| Character breaks voice | Personality too vague | Check character YAML file |
| Memory agent hallucinates facts | Turn events include wrong characters | Inspect memory prompt |
| Game state never completes chapter | Completion condition too strict | Check chapter YAML file |
| Model echoes prompt back | Context overflow | Context profiler |
## Key Files
| File | Contents |
|---|---|
| `src/theact/debugger/debugger.py` | `TurnDebugger` class |
| `src/theact/debugger/types.py` | `AgentResult`, `DebugStep`, `DebugSession` |
| `scripts/debug_turn.py` | CLI entry point |
## See Also
- Observability — the logging infrastructure the debugger builds on
- Prompt Engineering — the broader iteration workflow