Phase 01: Data Model, Project Structure, Save Format, and Versioning
1. Overview
This phase lays the foundation for TheAct by defining everything that exists before a single LLM call is made: how the project code is organized, how games are defined, how saves are structured, how data moves between YAML on disk and Pydantic models in memory, and how git-based versioning gives players unlimited undo.
After this phase is complete, we can:
- Define a game entirely in YAML files (world, characters, chapters)
- Load those files into validated Pydantic models
- Create a new save from a game definition (copy + git init)
- Append to the conversation log
- Commit after every turn, undo to any previous turn, inspect history
- Run tests proving all of the above works
No LLM calls, no CLI, no agents. Just the data layer and the versioning layer.
2. Directory Structure
2.1 Project Source Layout
theact/
    .env  # LLM_API_KEY, model config
    .env.example
    .gitignore
    pyproject.toml
    main.py  # Entry point (Phase 04)
    src/
        theact/
            __init__.py
            models/
                __init__.py
                world.py  # World model
                character.py  # Character model
                chapter.py  # Chapter model
                state.py  # GameState model
                memory.py  # CharacterMemory model
                conversation.py  # ConversationEntry model
                game.py  # Game (aggregate root, loads everything)
            io/
                __init__.py
                yaml_io.py  # Load/dump YAML, model serialization
                save_manager.py  # Create save, load save, list saves
            versioning/
                __init__.py
                git_save.py  # Git init, commit, undo, history
            engine/  # Phase 03
            agents/  # Phase 03
            llm/  # Phase 02
            cli/  # Phase 04
    games/
        lost-island/  # Example game definition
            game.yaml
            world.yaml
            characters/
                maya.yaml
                joaquin.yaml
            chapters/
                01-the-crash.yaml
                02-survival.yaml
    saves/  # Created at runtime, gitignored from project repo
        lost-island-001/  # Each save is its own git repo
            ...
    tests/
        __init__.py
        test_models.py
        test_yaml_io.py
        test_save_manager.py
        test_git_save.py
        conftest.py  # Fixtures: sample game data, tmp dirs
2.2 Game Definition (Template)
A game definition is a read-only template. It lives under games/<game-id>/ and contains everything needed to start a new playthrough.
games/lost-island/
    game.yaml  # Metadata: title, description, character list, chapter order
    world.yaml  # Setting, tone, rules: one short paragraph each
    characters/
        maya.yaml
        joaquin.yaml
    chapters/
        01-the-crash.yaml
        02-survival.yaml
        03-the-discovery.yaml
2.3 Save Directory (Runtime)
A save is a copy of the game definition plus runtime state. It is a git repository. New files are added as play progresses; game definition files are never modified.
saves/lost-island-001/
    .git/  # Git repo, one commit per turn
    game.yaml  # Copied from game definition (immutable)
    world.yaml  # Copied from game definition (immutable)
    characters/
        maya.yaml  # Copied from game definition (immutable)
        joaquin.yaml  # Copied from game definition (immutable)
    chapters/
        01-the-crash.yaml  # Copied from game definition (immutable)
        02-survival.yaml  # Copied from game definition (immutable)
        03-the-discovery.yaml  # Copied from game definition (immutable)
    state.yaml  # Created at save init; mutable, updated each turn
    conversation.yaml  # Created at save init; append-only, one entry per message
    memory/
        maya.yaml  # Created when Maya first acts; rolling memory per character
        joaquin.yaml  # Created when Joaquin first acts
    summaries.yaml  # Chapter summaries, appended as chapters complete
3. Data Models
All models use Pydantic v2 (BaseModel). Field descriptions serve as documentation. Models define model_config = ConfigDict(extra="forbid") to catch typos in YAML.
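As a quick illustration of what extra="forbid" buys us, the sketch below validates a dict with a misspelled key against a model shaped like World from Section 3.2. The typo "tonee" and the sample values are made up for the example; the point is that the typo surfaces as two errors (an unknown key and a missing required field) instead of being silently dropped.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class World(BaseModel):
    """Same fields as the World model in Section 3.2."""

    model_config = ConfigDict(extra="forbid")

    setting: str
    tone: str
    rules: str


# "tonee" is a deliberate typo standing in for a misspelled YAML key.
bad_data = {"setting": "An island.", "tonee": "Tense.", "rules": "No magic."}

try:
    World.model_validate(bad_data)
    error_types = set()
except ValidationError as exc:
    # One error for the unknown key, one for the now-missing "tone" field.
    error_types = {err["type"] for err in exc.errors()}

print(sorted(error_types))
```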
3.1 Game Metadata
# src/theact/models/game.py
from pydantic import BaseModel, ConfigDict


class GameMeta(BaseModel):
    """Top-level game definition metadata. Loaded from game.yaml."""

    model_config = ConfigDict(extra="forbid")

    id: str  # URL-safe slug, e.g. "lost-island"
    title: str  # Display name, e.g. "The Lost Island"
    description: str  # One-sentence pitch
    characters: list[str]  # Character file stems, e.g. ["maya", "joaquin"]
    chapters: list[str]  # Chapter file stems in order, e.g. ["01-the-crash", ...]
3.2 World
# src/theact/models/world.py
from pydantic import BaseModel, ConfigDict


class World(BaseModel):
    """World definition. Loaded from world.yaml. Keep each field to 1-3 sentences."""

    model_config = ConfigDict(extra="forbid")

    setting: str  # Where and when. ~2 sentences.
    tone: str  # Narrative voice and style. ~2 sentences.
    rules: str  # Key constraints the narrator must follow. ~2 sentences.
3.3 Character
# src/theact/models/character.py
from pydantic import BaseModel, ConfigDict


class Character(BaseModel):
    """Character definition. Loaded from characters/<name>.yaml. ~60 words total."""

    model_config = ConfigDict(extra="forbid")

    name: str  # Display name
    role: str  # One-line role in the story
    personality: str  # Core traits, speech patterns. 2-3 sentences.
    secret: str  # Hidden motivation or knowledge. 1 sentence.
    relationships: dict[str, str]  # name -> one-line relationship stance
3.4 Chapter
# src/theact/models/chapter.py
from pydantic import BaseModel, ConfigDict


class Chapter(BaseModel):
    """Chapter definition. Loaded from chapters/<id>.yaml."""

    model_config = ConfigDict(extra="forbid")

    id: str  # e.g. "01-the-crash"
    title: str  # e.g. "The Crash"
    summary: str  # What this chapter is about. 2-3 sentences.
    beats: list[str]  # Key events that should happen. Short phrases.
    completion: str  # What must be true for chapter to end. 1 sentence.
    characters: list[str]  # Which characters are active in this chapter.
    next: str | None = None  # Next chapter id, or None if final chapter.
3.5 Game State
# src/theact/models/state.py
from pydantic import BaseModel, ConfigDict


class GameState(BaseModel):
    """Mutable game state. Written to state.yaml. Updated every turn."""

    model_config = ConfigDict(extra="forbid")

    player_name: str
    current_chapter: str  # Chapter id
    turn: int  # Monotonically increasing turn counter
    beats_hit: list[str]  # Beat phrases from current chapter that have occurred.
                          # Reset to [] when chapter advances.
    flags: dict[str, str]  # Arbitrary key-value pairs set by agents
    chapter_history: list[str]  # List of completed chapter ids
    rolling_summary: str = ""  # Incremental summary of expired conversation turns.
                               # Updated when old turns are trimmed from context window.
                               # Used by context assembly (Phase 03) to give agents
                               # a compressed view of earlier events.
3.6 Conversation Entry
# src/theact/models/conversation.py
from typing import Literal

from pydantic import BaseModel, ConfigDict


class ConversationEntry(BaseModel):
    """Single message in the conversation log. Appended to conversation.yaml."""

    model_config = ConfigDict(extra="forbid")

    turn: int
    role: Literal["narrator", "character", "player"]
    character: str | None = None  # Set when role is "character"
    content: str
3.7 Character Memory
# src/theact/models/memory.py
from pydantic import BaseModel, ConfigDict


class CharacterMemory(BaseModel):
    """Per-character rolling memory. Written to memory/<name>.yaml."""

    model_config = ConfigDict(extra="forbid")

    character: str  # Character name
    summary: str  # Rolling summary of what this character knows/feels.
                  # Updated each turn by merging new info into existing summary.
                  # ~3-5 sentences. Never grows unbounded.
    key_facts: list[str]  # Important discrete facts. Max ~10, oldest pruned.
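The "max ~10, oldest pruned" rule for key_facts can be sketched as a small pure function. add_key_fact is a hypothetical helper, the hard limit of 10 is one reading of "~10", and skipping exact duplicates is an added assumption rather than something this document specifies.

```python
def add_key_fact(facts: list[str], new_fact: str, limit: int = 10) -> list[str]:
    """Return a new list with new_fact appended, keeping only the newest `limit` facts."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if new_fact in facts:
        # Assumption: an exact repeat is not worth a slot.
        return list(facts)
    # Oldest facts sit at the front, so slicing from the end prunes them first.
    return [*facts, new_fact][-limit:]


facts = [f"fact {i}" for i in range(10)]
updated = add_key_fact(facts, "fact 10")
unchanged = add_key_fact(updated, "fact 10")
```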
3.8 Chapter Summary
# src/theact/models/chapter.py (same file, additional model)
class ChapterSummary(BaseModel):
    """Summary of a completed chapter. Appended to summaries.yaml."""

    model_config = ConfigDict(extra="forbid")

    chapter_id: str
    title: str
    summary: str  # 2-3 sentence summary of what happened.
3.9 Loaded Game (Aggregate)
# src/theact/models/game.py (same file, additional model)
from pathlib import Path

from theact.models.chapter import Chapter, ChapterSummary
from theact.models.character import Character
from theact.models.conversation import ConversationEntry
from theact.models.memory import CharacterMemory
from theact.models.state import GameState
from theact.models.world import World


class LoadedGame(BaseModel):
    """A fully loaded game with all data resolved. Not persisted; constructed in memory."""

    model_config = ConfigDict(extra="forbid")

    meta: GameMeta
    world: World
    characters: dict[str, Character]  # keyed by character file stem
    chapters: dict[str, Chapter]  # keyed by chapter id
    state: GameState
    conversation: list[ConversationEntry]
    memories: dict[str, CharacterMemory]  # keyed by character file stem
    chapter_summaries: list[ChapterSummary]
    save_path: Path  # Absolute path to save directory

# Note: characters and memories dicts are keyed by file stem (e.g. "maya"),
# while Character.name and CharacterMemory.character store display names
# (e.g. "Maya Chen"). The save_manager must handle this mapping.
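One possible way for the save manager to handle the stem/display-name mapping mentioned in the note is to invert the stem -> name relation once at load time. The sketch below works on plain dicts so it stands alone; build_name_index is a hypothetical helper, and rejecting duplicate display names is an assumption.

```python
def build_name_index(display_names: dict[str, str]) -> dict[str, str]:
    """Invert a stem -> display-name mapping into display-name -> stem."""
    index: dict[str, str] = {}
    for stem, name in display_names.items():
        if name in index:
            # Assumption: two characters sharing a display name is a data error.
            raise ValueError(f"duplicate display name: {name!r}")
        index[name] = stem
    return index


# In the real loader this would come from {stem: character.name ...}.
display_names = {"maya": "Maya Chen", "joaquin": "Father Joaquin Reyes"}
name_to_stem = build_name_index(display_names)

# A conversation entry stores the display name; the memory file uses the stem.
memory_file = f"memory/{name_to_stem['Maya Chen']}.yaml"
```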
4. File Formats
Every file uses YAML. Examples below show the target size — these are what a 7B model will actually see in its prompt context. They must be small.
4.1 game.yaml
id: lost-island
title: The Lost Island
description: Survivors of a plane crash on an uncharted island discover something ancient beneath the jungle.
characters:
  - maya
  - joaquin
chapters:
  - 01-the-crash
  - 02-survival
  - 03-the-discovery
4.2 world.yaml
setting: >
  Uncharted volcanic island, South Pacific, 2024. Survivors of Flight NZ-417
  crashed into the northern reef. No GPS, no radio, no rescue coming.
tone: >
  Second person, present tense. Sensory-first narration. Slow-burn tension
  through small wrong details. 150-300 words per turn.
rules: >
  The supernatural is always ambiguous — every anomaly has a mundane explanation.
  NPCs remember everything. Never break the second-person frame.
4.3 characters/maya.yaml (~60 words)
name: Maya Chen
role: Fellow crash survivor, pragmatic engineer, player's primary ally.
personality: >
  Direct, sharp, dry humor. Speaks in short declarative sentences.
  Competence is her coping mechanism — idle hands make her anxious.
  Slow to trust, fiercely loyal once earned.
secret: Racing home to her estranged mother who has cancer.
relationships:
  joaquin: "Respects his calm but distrusts his evasiveness."
4.4 characters/joaquin.yaml (~60 words)
name: Father Joaquin Reyes
role: Mysterious priest who has been to this island before.
personality: >
  Calm, cryptic, speaks in parables and questions. Genuinely kind but
  evasive about specifics. Gets quieter when most serious.
  Never raises his voice.
secret: Visited this island 40 years ago. His companions entered the caves and never returned.
relationships:
  maya: "Admires her strength but worries about her refusal to accept mystery."
4.5 chapters/01-the-crash.yaml
id: 01-the-crash
title: The Crash
summary: >
  Player wakes alone on the beach amid wreckage. They explore, find evidence
  of death, and discover Maya — the first sign they are not alone.
beats:
  - Player wakes on the beach, disoriented and injured
  - Explores crash debris and finds supplies
  - Discovers a body — death is real here
  - Finds Maya working alone beyond the headland
  - Together they establish a basic camp
  - First night — strange sounds from the jungle
completion: Player and Maya have established camp and survived the first night.
characters:
  - maya
next: 02-survival
4.6 state.yaml (created at save init)
player_name: ""
current_chapter: 01-the-crash
turn: 0
beats_hit: []
flags: {}
chapter_history: []
rolling_summary: ""
4.7 conversation.yaml (created at save init, append-only)
- turn: 1
  role: narrator
  content: >
    You open your eyes to white light and the taste of salt. Sand grinds
    against your cheek. Your left arm is pinned under something heavy.
    Somewhere beyond the ringing in your ears, waves break against metal.
- turn: 1
  role: player
  content: I try to free my arm and look around.
- turn: 1
  role: narrator
  content: >
    You wrench your arm free — a duffel bag, half-buried. Sitting up sends
    the world tilting. Beach. Wreckage. No engines. No voices. Just the reef
    and the wind and the cry of a single bird circling overhead.
- turn: 2
  role: character
  character: Maya Chen
  content: >
    She looks up sharply, a strip of seatbelt dangling from one hand. Her
    eyes sweep you head to toe — assessing, not greeting. "You're bleeding.
    How many others did you see?"
4.8 memory/maya.yaml (created during play)
character: Maya Chen
summary: >
  Met the player on the beach after the crash. They seem resourceful —
  found supplies before finding her. Worked together to build camp.
  Heard the drums that first night but neither of us spoke about it.
key_facts:
  - Player found a lighter and first-aid kit
  - Camp is in the fuselage section on north beach
  - Strange drumming sound from jungle at nightfall
4.9 summaries.yaml (appended as chapters complete)
- chapter_id: 01-the-crash
  title: The Crash
  summary: >
    Player woke amid the wreckage of Flight NZ-417. Found Maya beyond the
    headland. Together they built a camp in the fuselage. That first night,
    they both heard drums from the jungle but said nothing.
5. YAML Serialization
5.1 yaml_io.py Interface
# src/theact/io/yaml_io.py
from pathlib import Path
from typing import Any, Sequence, TypeVar

import yaml
from pydantic import BaseModel

T = TypeVar("T", bound=BaseModel)


def _write_yaml(path: Path, data: Any) -> None:
    """Write plain data as block-style YAML, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Explicit UTF-8: allow_unicode=True writes non-ASCII text verbatim, which
    # fails on platforms whose default file encoding is not UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        yaml.dump(data, f, default_flow_style=False, allow_unicode=True, sort_keys=False)


def load_yaml(path: Path, model: type[T]) -> T:
    """Load a YAML file and validate it against a Pydantic model."""
    with open(path, encoding="utf-8") as f:
        data = yaml.safe_load(f)
    return model.model_validate(data)


def load_yaml_list(path: Path, model: type[T]) -> list[T]:
    """Load a YAML file containing a list and validate each item."""
    with open(path, encoding="utf-8") as f:
        data = yaml.safe_load(f)
    if data is None:  # Empty file
        return []
    return [model.model_validate(item) for item in data]


def dump_yaml(path: Path, model: BaseModel) -> None:
    """Serialize a Pydantic model to YAML and write to file."""
    _write_yaml(path, model.model_dump(exclude_none=True))


def dump_yaml_list(path: Path, models: Sequence[BaseModel]) -> None:
    """Serialize a list of Pydantic models to YAML and write to file."""
    _write_yaml(path, [m.model_dump(exclude_none=True) for m in models])


def append_yaml_entry(path: Path, model: BaseModel) -> None:
    """Append a single model as a YAML list entry to an existing file.

    If the file doesn't exist or is empty, creates it with a single-item list.
    If the file exists, loads the list, appends, and rewrites.

    Note: For conversation.yaml, this means a full rewrite on each append.
    At 500 turns with ~4 messages per turn, that's ~2000 entries, which is still
    fast since each entry is a few lines of text. If this ever becomes a
    bottleneck, we can switch to streaming YAML output, but premature
    optimization is not warranted here.
    """
    data = []
    if path.exists():
        with open(path, encoding="utf-8") as f:
            data = yaml.safe_load(f) or []
    data.append(model.model_dump(exclude_none=True))
    _write_yaml(path, data)
6. Save Manager
6.1 save_manager.py Interface
# src/theact/io/save_manager.py
from pathlib import Path

from theact.models.chapter import ChapterSummary
from theact.models.conversation import ConversationEntry
from theact.models.game import GameMeta, LoadedGame
from theact.models.memory import CharacterMemory
from theact.models.state import GameState

# Defaults, resolved relative to the project root. Callers (including tests)
# can override them by passing explicit directories.
GAMES_DIR = Path("games")
SAVES_DIR = Path("saves")


def list_games(games_dir: Path = GAMES_DIR) -> list[GameMeta]:
    """List all available game definitions."""
    ...


def list_saves(saves_dir: Path = SAVES_DIR) -> list[dict]:
    """List all existing saves with basic info.

    Returns list of dicts with keys: id, game_title, turn, last_modified.
    """
    ...


def create_save(
    game_id: str,
    save_id: str,
    player_name: str,
    games_dir: Path = GAMES_DIR,
    saves_dir: Path = SAVES_DIR,
) -> Path:
    """Create a new save from a game definition.

    1. Copy game definition files to saves/<save_id>/
    2. Create initial state.yaml with player_name and turn=0
    3. Create empty conversation.yaml
    4. Create empty summaries.yaml
    5. Create memory/ directory
    6. Initialize git repo and make initial commit

    Returns the save directory path.
    """
    ...


def load_save(save_id: str, saves_dir: Path = SAVES_DIR) -> LoadedGame:
    """Load a complete save into memory.

    Reads all YAML files, validates against Pydantic models,
    returns a LoadedGame aggregate.
    """
    ...


def save_state(save_path: Path, state: GameState) -> None:
    """Write updated game state to state.yaml."""
    ...


def append_conversation(save_path: Path, entry: ConversationEntry) -> None:
    """Append a message to conversation.yaml."""
    ...


def save_memory(save_path: Path, memory: CharacterMemory) -> None:
    """Write updated character memory to memory/<name>.yaml."""
    ...


def save_summaries(save_path: Path, summaries: list[ChapterSummary]) -> None:
    """Write the full list of chapter summaries to summaries.yaml (full rewrite).

    Caller must load existing summaries, append, and pass the complete list.
    """
    ...
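The copy step of create_save can be sketched with the standard library alone. Both helpers below are hypothetical: next_save_id assumes the "<game-id>-NNN" naming seen in saves/lost-island-001/ (the interface above takes save_id from the caller, so this numbering scheme is only one option), and copy_game_definition covers steps 1 and 5 only, leaving the YAML files and git init to the real implementation.

```python
import shutil
import tempfile
from pathlib import Path


def next_save_id(game_id: str, saves_dir: Path) -> str:
    """Return '<game_id>-NNN' using the first unused three-digit suffix."""
    n = 1
    while (saves_dir / f"{game_id}-{n:03d}").exists():
        n += 1
    return f"{game_id}-{n:03d}"


def copy_game_definition(game_dir: Path, save_dir: Path) -> Path:
    """Copy the template tree into a fresh save directory and add memory/."""
    if save_dir.exists():
        # Never overwrite an existing save.
        raise FileExistsError(f"save already exists: {save_dir}")
    shutil.copytree(game_dir, save_dir)  # also creates missing parent dirs
    (save_dir / "memory").mkdir()
    return save_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Tiny stand-in for games/lost-island/.
    game_dir = root / "games" / "lost-island"
    (game_dir / "characters").mkdir(parents=True)
    (game_dir / "game.yaml").write_text("id: lost-island\n", encoding="utf-8")
    (game_dir / "characters" / "maya.yaml").write_text("name: Maya Chen\n", encoding="utf-8")

    saves_dir = root / "saves"
    first_id = next_save_id("lost-island", saves_dir)
    save_dir = copy_game_definition(game_dir, saves_dir / first_id)
    copied = sorted(p.relative_to(save_dir).as_posix() for p in save_dir.rglob("*"))
    second_id = next_save_id("lost-island", saves_dir)
```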
7. Save Versioning (Git)
7.1 Design
Each save directory is an independent git repository. One commit is created after each complete turn (narrator response + all character responses + player input + memory updates + state update). This means:
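- Every file change within a turn is captured atomically in one commit
- Undo = git reset --hard HEAD~1 (discard last commit)
- Undo N turns = git reset --hard HEAD~N
- History = git log --oneline gives a turn-by-turn timeline
- Hundreds of turns stay cheap: each commit is a snapshot, but git stores an unchanged file only once, so the immutable game definition files cost nothing after the initial commit
- conversation.yaml grows linearly and every commit stores a new compressed copy of it; once the repository is packed (for example by git gc), delta compression reduces those copies to roughly the appended portion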
7.2 git_save.py Interface
# src/theact/versioning/git_save.py
from dataclasses import dataclass
from pathlib import Path

from git import Repo  # gitpython


@dataclass
class TurnInfo:
    """Summary of a single turn from git history.

    Uses dataclass (not BaseModel) since this is a transient data carrier,
    never serialized to YAML.
    """

    turn: int
    commit_hash: str
    message: str
    timestamp: str


def init_repo(save_path: Path) -> Repo:
    """Initialize a git repo in the save directory and make the initial commit.

    The initial commit contains the game definition files and empty state/conversation.
    Commit message: 'New game: <game title>'
    """
    ...


def commit_turn(save_path: Path, turn: int, summary: str) -> str:
    """Stage all changes and commit.

    Commit message format: 'Turn <N>: <one-line summary>'
    The summary is a brief description derived from the narrator's output.
    Returns the commit hash.
    """
    ...


def undo(save_path: Path, steps: int = 1) -> int:
    """Undo the last N turns by resetting to HEAD~N.

    Returns the turn number we've rewound to.
    Raises ValueError if steps exceeds available history (excluding initial commit).
    """
    ...


def get_history(save_path: Path) -> list[TurnInfo]:
    """Get the full turn history from git log.

    Returns most recent first. Excludes the initial 'New game' commit.
    Parses turn number from commit message.
    """
    ...


def get_turn_count(save_path: Path) -> int:
    """Get the number of completed turns (number of turn commits)."""
    ...
7.3 Commit Strategy
Commit 0: "New game: The Lost Island" ← initial state, all game files
Commit 1: "Turn 1: Player wakes on beach" ← conversation + state changes
Commit 2: "Turn 2: Found Maya" ← conversation + state + memory
Commit 3: "Turn 3: Built camp together" ← conversation + state + memory
...
Each turn commit includes all files that changed during that turn:
- conversation.yaml (always: new messages appended)
- state.yaml (always: turn counter incremented, possibly beats_hit/flags updated)
- memory/<name>.yaml (if a character's memory was updated)
- summaries.yaml (if a chapter was completed this turn)
7.4 Undo Semantics
Undo is destructive — it discards commits. This is intentional. The player is rewinding time; the future that was discarded should not persist. If we ever want a "redo" feature, we can stash the branch ref before resetting, but that is not in scope for Phase 01.
For example, with turns A, B, and C committed, undoing one turn discards C and puts the player back at the state after turn B. Their next action creates a new turn C' that diverges from the original timeline.
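The bounds check that undo needs before running git reset --hard HEAD~N can be isolated as a pure function, which makes the over-rewinding rule easy to test without a repository. undo_target is a hypothetical helper; rejecting steps below 1 is an assumption added for safety.

```python
def undo_target(turn_commits: int, steps: int = 1) -> int:
    """Return the turn we land on after undoing `steps` turns.

    turn_commits counts turn commits only; the initial 'New game' commit
    is excluded, so it can never be undone.
    """
    if steps < 1:
        # Assumption: undoing zero or a negative number of turns is a caller error.
        raise ValueError("steps must be at least 1")
    if steps > turn_commits:
        raise ValueError(
            f"cannot undo {steps} turns: only {turn_commits} in history"
        )
    return turn_commits - steps


# The multi-turn scenario from the test plan: 10 turns, undo 3.
landed_on = undo_target(turn_commits=10, steps=3)
# Undoing every turn returns to the 'New game' commit (turn 0).
back_to_start = undo_target(turn_commits=3, steps=3)
```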
8. Implementation Steps
Build in this order. Each step should produce working, tested code before moving to the next.
Step 1: Project scaffolding
- Create the src/theact/ package structure with __init__.py files
- Create models/, io/, versioning/ sub-packages
- Create tests/ directory with conftest.py
- Add saves/ and playtests/ to .gitignore
- Add dependencies to pyproject.toml
- Verify uv sync works
Step 2: Pydantic models
- Implement all models from Section 3 in their respective files
- Write tests/test_models.py:
  - Test that each model validates correct data
  - Test that each model rejects extra fields (extra="forbid")
  - Test optional fields default correctly
  - Test that example YAML data from Section 4 round-trips through models
Step 3: YAML I/O
- Implement yaml_io.py with all functions from Section 5
- Write tests/test_yaml_io.py:
  - Test load_yaml / dump_yaml round-trip for each model
  - Test load_yaml_list / dump_yaml_list for list-based files
  - Test append_yaml_entry creates file if missing
  - Test append_yaml_entry appends to existing file
  - Test error handling: missing file, invalid YAML, validation failure
Step 4: Example game definition
- Create games/lost-island/ with all YAML files from Section 4
- Manually verify they load through the YAML I/O layer
- This serves as both test fixture and the Phase 05 starting point
Step 5: Save manager
- Implement save_manager.py with all functions from Section 6
- Write tests/test_save_manager.py:
  - Test create_save produces correct directory structure
  - Test create_save copies all game files
  - Test create_save creates valid initial state
  - Test load_save returns a fully populated LoadedGame
  - Test list_saves finds existing saves
  - Test save_state, append_conversation, save_memory write correctly
  - Test round-trip: create_save -> modify -> load_save sees modifications
Step 6: Git versioning
- Implement git_save.py with all functions from Section 7
- Write tests/test_git_save.py:
  - Test init_repo creates a valid git repo with initial commit
  - Test commit_turn creates commits with correct messages
  - Test undo rewinds state correctly (verify file contents revert)
  - Test undo raises on over-rewinding
  - Test get_history returns correct turn list
  - Test multi-turn sequence: 10 turns, undo 3, verify state at turn 7
- All tests use tmp_path fixture to avoid polluting the real filesystem
Step 7: Integration
- Wire save_manager.create_save to call git_save.init_repo
- Ensure save_manager.load_save works on git-versioned saves
- Write integration test: create game -> make 5 turns of fake data -> undo 2 -> verify state
- Verify .gitignore in the project root excludes saves/ from the project's own git repo
9. Verification
Phase 01 is complete when all of the following pass:
- uv run pytest tests/: all tests green
- Manual check: games/lost-island/ contains valid YAML that loads without errors
- Manual check: Creating a save from lost-island produces correct directory structure
- Manual check: After 5 simulated turns, git log in the save dir shows 6 commits (1 initial + 5 turns)
- Manual check: Undo 2 turns, verify conversation.yaml has only 3 turns of content
- Manual check: File sizes: world.yaml < 500 bytes, each character YAML < 400 bytes, each chapter YAML < 500 bytes
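The file-size check is easy to automate if we would rather not eyeball it. The sketch below encodes the three budgets from the list above as glob patterns; BUDGETS and oversized_files are hypothetical names, and the demo uses filler bytes rather than real game files.

```python
import tempfile
from pathlib import Path

# Budgets from the checklist: a file must be strictly smaller than its limit.
BUDGETS = {
    "world.yaml": 500,
    "characters/*.yaml": 400,
    "chapters/*.yaml": 500,
}


def oversized_files(game_dir: Path) -> list[str]:
    """Return one message per file that meets or exceeds its byte budget."""
    problems: list[str] = []
    for pattern, limit in BUDGETS.items():
        for path in sorted(game_dir.glob(pattern)):
            size = path.stat().st_size
            if size >= limit:
                rel = path.relative_to(game_dir).as_posix()
                problems.append(f"{rel}: {size} bytes (limit {limit})")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    game_dir = Path(tmp)
    (game_dir / "characters").mkdir()
    (game_dir / "world.yaml").write_bytes(b"x" * 120)  # within budget
    (game_dir / "characters" / "maya.yaml").write_bytes(b"x" * 450)  # too large
    problems = oversized_files(game_dir)
```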
9.1 Live Testing & Regression Capture
After unit and manual tests pass, perform the following live validation cycle. The goal is to discover edge cases in real usage and lock them down with automated tests so they never regress.
Step 1: Exploratory testing
- Create a save from Lost Island, manually edit game files (introduce typos, missing fields, extra fields), and verify Pydantic rejects them with clear errors.
- Create a save, simulate 10 turns of fake conversation data (use a script), then undo 5, undo 3 more, verify state is correct after each undo.
- Test with unusually long player input (500+ words) in a conversation entry. Verify YAML round-trips correctly (no truncation, no encoding issues).
- Test with special characters in player input (quotes, colons, newlines, unicode); these can break YAML serialization.
Step 2: Fix and capture
- For every issue discovered, fix the code, then write a regression test in tests/ that reproduces the exact failure scenario.
- Example: if YAML serialization breaks on a colon in player input, write test_conversation_entry_with_colon() that appends an entry containing "Where are we: lost?" and verifies it round-trips.
Step 3: Verify regression suite
- Run uv run pytest tests/ -v and confirm all new regression tests pass alongside existing tests.
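A regression test of the kind described in Step 2 might look like the sketch below. It exercises PyYAML directly with the same dump options as yaml_io.py rather than going through append_yaml_entry, so it stands alone; round_trip and the second test are illustrative additions, and the sample strings are made up.

```python
import yaml


def round_trip(entries: list[dict]) -> list[dict]:
    """Dump with the options yaml_io.py uses, then parse the text back."""
    text = yaml.dump(
        entries,
        default_flow_style=False,
        allow_unicode=True,
        sort_keys=False,
    )
    return yaml.safe_load(text)


def test_conversation_entry_with_colon() -> None:
    entry = {"turn": 1, "role": "player", "content": "Where are we: lost?"}
    assert round_trip([entry]) == [entry]


def test_conversation_entry_with_awkward_characters() -> None:
    # Quotes, a newline, a colon, a '#', and non-ASCII text in one message.
    content = 'She said "run"\nthen: nothing. # not a comment, café'
    entry = {"turn": 2, "role": "player", "content": content}
    assert round_trip([entry]) == [entry]
```

Under pytest these functions are collected automatically by their test_ prefix; calling them directly works too.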
10. Dependencies
Add to pyproject.toml:
dependencies = [
    "openai>=2.29.0",
    "python-dotenv>=1.2.2",
    "pydantic>=2.10.0",
    "pyyaml>=6.0",
    "gitpython>=3.1.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
]
Summary of new packages:
- pydantic: Data validation and model definitions
- pyyaml: YAML parsing and serialization
- gitpython: Git operations for save versioning
- pytest: Test runner (dev dependency)