Trust model
Plain-English summary. reposix mixes private data, attacker-influenced text, and a way to send things over the network — exactly the recipe for the kind of agent attack that exfiltrates your secrets through a poisoned issue body. This page lays out the threat (yes, it's real), the cuts the design uses to defang it (the type system, an outbound allowlist, an append-only audit table, a frontmatter sanitiser), and the specific things that are intentionally not mitigated so you don't expect a sandbox you don't have.
reposix is, by construction, a textbook lethal trifecta machine — Simon Willison's name for the three legs an exfiltration attack against an LLM agent needs at the same time. From agentic-engineering-reference.md, the legs are:
- Private data — the agent sees issue bodies, custom fields, attachments, internal comments. Anything an authenticated REST call can return.
- Untrusted input — every issue body, comment, title, and label is attacker-influenced text. So is anything seeded into the simulator.
- Exfiltration: `git push` is a side-effecting verb that can target arbitrary remotes; the helper makes outbound HTTP calls to backends.
You cannot build reposix without all three legs being present. So instead of pretending one of them isn't there, the design cuts the path between them at every boundary. The three keys from Mental model in 60 seconds are grounded on this page: the cache layer is where taint enters; the helper is where egress and audit happen; the frontmatter allowlist is where untrusted input meets server-controlled fields.
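To make the frontmatter cut concrete, here is a minimal sketch of stripping server-controlled keys from an inbound document before it is written. The four key names come from the mitigations table on this page; the function name `strip_server_fields`, the `---` delimiter handling, and the line-based parsing are assumptions for illustration and are not taken from the reposix source.

```rust
/// Server-controlled keys named on this page.
const SERVER_CONTROLLED: [&str; 4] = ["id", "created_at", "version", "updated_at"];

/// Hypothetical helper: returns the sanitised document and the keys that were
/// stripped. Assumes simple `key: value` frontmatter between `---` lines.
pub fn strip_server_fields(doc: &str) -> (String, Vec<String>) {
    let mut out: Vec<&str> = Vec::new();
    let mut stripped: Vec<String> = Vec::new();
    let mut lines = doc.lines();

    // Frontmatter must open on the first line; otherwise pass the text through.
    match lines.next() {
        Some("---") => out.push("---"),
        _ => return (doc.to_string(), stripped),
    }

    let mut in_frontmatter = true;
    for line in lines {
        if in_frontmatter {
            if line == "---" {
                // Closing delimiter: everything after this is body text.
                in_frontmatter = false;
            } else if let Some((key, _value)) = line.split_once(':') {
                let key = key.trim();
                if SERVER_CONTROLLED.iter().any(|k| *k == key) {
                    // Drop the line and remember the key for auditing.
                    stripped.push(key.to_string());
                    continue;
                }
            }
        }
        out.push(line);
    }
    // Note: a trailing newline in the input is not preserved by this sketch.
    (out.join("\n"), stripped)
}
```

The returned key list is the kind of information a `helper_push_sanitized_field` audit row could carry. Body lines that merely look like `version: ...` are left alone because they sit outside the frontmatter block.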
Concentric rings: taint in, audited bytes out
Three rings, three cuts. A byte from the network does not reach a side-effecting call without crossing a boundary that the type system or the runtime can audit.
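The type-system boundary can be illustrated with a small newtype pair. The names `Tainted`, `Untainted`, and `sanitize` mirror the mitigations table on this page; everything else (the closure-based signature of `sanitize`, `egress_sink`, `no_nul_bytes`) is a hypothetical sketch and may differ from what `crates/reposix-core/src/tainted.rs` actually does.

```rust
/// Bytes that came from a remote and have not been vetted.
#[derive(Debug)]
pub struct Tainted<T>(T);

/// Bytes that passed an explicit sanitisation step.
#[derive(Debug)]
pub struct Untainted<T>(T);

impl<T> Tainted<T> {
    pub fn new(value: T) -> Self {
        Tainted(value)
    }

    /// The only way out of `Tainted` in this sketch: run a caller-supplied
    /// check and, if it passes, rewrap the value as `Untainted`.
    pub fn sanitize<E>(
        self,
        check: impl FnOnce(&T) -> Result<(), E>,
    ) -> Result<Untainted<T>, E> {
        check(&self.0)?;
        Ok(Untainted(self.0))
    }
}

impl<T> Untainted<T> {
    pub fn into_inner(self) -> T {
        self.0
    }
}

/// Stand-in for an egress sink. It accepts only `Untainted`, so passing a
/// `Tainted<Vec<u8>>` is a compile error rather than a runtime bug.
pub fn egress_sink(body: Untainted<Vec<u8>>) -> usize {
    body.into_inner().len()
}

/// Illustrative check only: reject payloads containing a NUL byte.
pub fn no_nul_bytes(bytes: &Vec<u8>) -> Result<(), &'static str> {
    if bytes.contains(&0) {
        Err("NUL byte in payload")
    } else {
        Ok(())
    }
}
```

Because the inner field is private and `Tainted` exposes no accessor, the compiler rejects `egress_sink(Tainted::new(...))`. That rejection is the property a trybuild compile-fail test would pin down.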
Mitigations table
| Trifecta leg | Cut | Where it lives |
|---|---|---|
| Private data | Egress allowlist `REPOSIX_ALLOWED_ORIGINS`: the single choke-point that decides which outbound origins are allowed. Every HTTP client is built through `reposix_core::http::client()`; direct `reqwest::Client::new()` calls are disallowed (enforced by clippy `disallowed_methods`). Default: `http://127.0.0.1:*`. | `crates/reposix-core/src/http.rs`, runtime check in `crates/reposix-cache` |
| Private data | Blob limit `REPOSIX_BLOB_LIMIT` (default 200): caps unbounded `git fetch` runs that would page in an entire backend. | helper, see git layer §blob limit |
| Untrusted input | Frontmatter field allowlist: `id`, `created_at`, `version`, `updated_at` are stripped from inbound writes before the REST call. An attacker-authored body with `version: 999999` cannot poison the server version. | helper push handler, audited as `helper_push_sanitized_field` |
| Untrusted input | `Tainted<T>` ↔ `Untainted<T>` newtype pair (a Rust pattern that uses the type system to track which bytes came from a remote and refuses to let them reach an egress sink without an explicit conversion): the cache returns `Tainted<Vec<u8>>`; `sanitize()` is the only safe conversion. A trybuild compile-fail test asserts you cannot send a `Tainted<T>` to an egress sink without `sanitize`. | `crates/reposix-core/src/tainted.rs` |
| Exfiltration | Push-time conflict detection: rejects stale-base pushes with `error refs/heads/main fetch first`. Side effect: prevents a stale agent from blindly overwriting a backend write that landed between its clone and its push. | helper, see git layer §push-time conflict detection |
| Exfiltration | Append-only audit log: `BEFORE UPDATE/DELETE RAISE` triggers on `audit_events_cache`, so an attacker who reaches `sqlite3` cannot tamper with history without the alarm row showing up. | `crates/reposix-cache/src/cache_schema.sql` |
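As an illustration of the egress cut in the first row, the following sketch checks a URL against an origin allowlist. The variable name `REPOSIX_ALLOWED_ORIGINS` and the default `http://127.0.0.1:*` come from the table above. The comma-separated list format, the `*` port wildcard semantics, and the function names are assumptions made for this example, not a description of `reposix_core::http`.

```rust
/// Default taken from this page.
const DEFAULT_ALLOWED: &str = "http://127.0.0.1:*";

/// Read the allowlist, falling back to the default when the variable is unset.
pub fn allowlist_from_env() -> String {
    std::env::var("REPOSIX_ALLOWED_ORIGINS").unwrap_or_else(|_| DEFAULT_ALLOWED.to_string())
}

/// Split "scheme://host:port/path" into (scheme, host, port).
/// Hypothetical helper: rejects userinfo and IPv6 literals instead of parsing them.
fn split_origin(url: &str) -> Option<(&str, &str, Option<&str>)> {
    let (scheme, rest) = url.split_once("://")?;
    let authority = rest.split(&['/', '?', '#'][..]).next()?;
    if authority.is_empty() || authority.contains('@') || authority.contains('[') {
        return None; // fail closed on anything this sketch cannot parse
    }
    match authority.rsplit_once(':') {
        Some((host, port)) => Some((scheme, host, Some(port))),
        None => Some((scheme, authority, None)),
    }
}

/// True only if `url` matches an entry of the comma-separated `allowlist`.
/// Scheme and host must match exactly (case-insensitive); the port must match
/// exactly unless the entry uses `*`.
pub fn origin_allowed(url: &str, allowlist: &str) -> bool {
    let (scheme, host, port) = match split_origin(url) {
        Some(parts) => parts,
        None => return false,
    };
    allowlist
        .split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .any(|entry| match split_origin(entry) {
            Some((e_scheme, e_host, e_port)) => {
                e_scheme.eq_ignore_ascii_case(scheme)
                    && e_host.eq_ignore_ascii_case(host)
                    && (e_port == Some("*") || e_port == port)
            }
            None => false,
        })
}
```

Comparing the parsed host rather than a string prefix matters: a prefix check would accept `http://127.0.0.1.evil.example/`, whereas the host comparison here rejects it. A caller would combine the two functions as `origin_allowed(url, &allowlist_from_env())` and record a denial when it returns `false`.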
Audit log
Every network-touching action writes one row to one of two append-only audit tables — by design, not by accident.
- `audit_events_cache` lives in `cache.db` and is owned by `reposix-cache::audit`. It records cache-internal events: blob materialization, helper RPC turns, gc eviction, sync-tag writes, push accept/reject decisions.
- `audit_events` lives in the per-backend audit DB and is owned by `reposix-core::audit` (written by the sim/confluence/jira adapters). It records backend-side mutating REST calls: every `create_record`, `update_record`, and `delete_or_close`.
A complete forensic query (e.g., "which JIRA write came from which git push?") joins both: the helper-level helper_push_accepted row lives in audit_events_cache; the per-issue update_record rows live in audit_events. Both tables use SQLite WAL (write-ahead logging, where writes go to a separate file and merge into the main DB on checkpoint, so readers don't block writers) and both are append-only at the SQL level: BEFORE UPDATE and BEFORE DELETE triggers raise rather than allow row mutations. The split keeps reposix-cache strictly cache-side and lets backend adapters fail closed without coupling to the cache's SQLite connection lifecycle (POLISH2-22 friction row 12; physical unification deferred to v0.12.0+).
The audit_events_cache ops vocabulary is fixed:
| op | Written when |
|---|---|
| `materialize` | Cache lazy-fetched a blob from the backend |
| `egress_denied` | Outbound call refused by `REPOSIX_ALLOWED_ORIGINS` |
| `delta_sync` | Helper ran `list_changed_since(last_fetched_at)` |
| `helper_connect`, `helper_advertise`, `helper_fetch`, `helper_fetch_error` | Helper protocol events on the read side |
| `helper_push_started`, `helper_push_accepted`, `helper_push_rejected_conflict`, `helper_push_sanitized_field` | Helper protocol events on the write side |
| `blob_limit_exceeded` | `command=fetch` carried more `want` lines than `REPOSIX_BLOB_LIMIT` |
git log is the agent's intent; audit_events_cache is the system's outcome. Together they answer "what was attempted, what was allowed, and what hit the network" without any side-channel logging.
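Since the op vocabulary is fixed, one way to keep writers from inventing new strings is a closed enum. The twelve string values below are copied from the table above; the enum name `CacheAuditOp` and its methods are hypothetical and only sketch the idea.

```rust
/// Closed set of `audit_events_cache` op values (strings taken from this page).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheAuditOp {
    Materialize,
    EgressDenied,
    DeltaSync,
    HelperConnect,
    HelperAdvertise,
    HelperFetch,
    HelperFetchError,
    HelperPushStarted,
    HelperPushAccepted,
    HelperPushRejectedConflict,
    HelperPushSanitizedField,
    BlobLimitExceeded,
}

impl CacheAuditOp {
    /// Every variant, used for parsing and exhaustiveness checks.
    pub const ALL: [CacheAuditOp; 12] = [
        CacheAuditOp::Materialize,
        CacheAuditOp::EgressDenied,
        CacheAuditOp::DeltaSync,
        CacheAuditOp::HelperConnect,
        CacheAuditOp::HelperAdvertise,
        CacheAuditOp::HelperFetch,
        CacheAuditOp::HelperFetchError,
        CacheAuditOp::HelperPushStarted,
        CacheAuditOp::HelperPushAccepted,
        CacheAuditOp::HelperPushRejectedConflict,
        CacheAuditOp::HelperPushSanitizedField,
        CacheAuditOp::BlobLimitExceeded,
    ];

    /// The string stored in the `op` column.
    pub fn as_str(self) -> &'static str {
        match self {
            CacheAuditOp::Materialize => "materialize",
            CacheAuditOp::EgressDenied => "egress_denied",
            CacheAuditOp::DeltaSync => "delta_sync",
            CacheAuditOp::HelperConnect => "helper_connect",
            CacheAuditOp::HelperAdvertise => "helper_advertise",
            CacheAuditOp::HelperFetch => "helper_fetch",
            CacheAuditOp::HelperFetchError => "helper_fetch_error",
            CacheAuditOp::HelperPushStarted => "helper_push_started",
            CacheAuditOp::HelperPushAccepted => "helper_push_accepted",
            CacheAuditOp::HelperPushRejectedConflict => "helper_push_rejected_conflict",
            CacheAuditOp::HelperPushSanitizedField => "helper_push_sanitized_field",
            CacheAuditOp::BlobLimitExceeded => "blob_limit_exceeded",
        }
    }

    /// Parse a stored op string; unknown strings yield `None`.
    pub fn parse(s: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|op| op.as_str() == s)
    }
}
```

With this shape, adding a new op requires touching the `match` in `as_str`, so the compiler flags any writer that falls out of sync with the vocabulary.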
What's NOT mitigated
Honesty about the threat model is a feature, not a footnote.
- Shell access bypasses every cut. An attacker on the dev host can `curl` the backend directly with the same token. reposix is a substrate for safer agent loops; it is not a sandbox. The egress allowlist guards the helper and the cache; it does not guard the rest of the host.
- The simulator is itself attacker-influenced. Seed data is authored by an agent (or by a fixture written by an agent), so simulator runs are also tainted. The lethal-trifecta mitigations apply against the simulator just as hard as against a real backend.
- Token leakage via crash logs. A panicking helper that includes auth headers in its `RUST_BACKTRACE` output can leak credentials. The codebase scrubs known credential headers before logging, but a third-party crate panicking with a header in scope is out of reposix's hands.
- Confused-deputy across backends. A user with credentials for two backends and one allowlist entry can be tricked by a tainted issue body into directing writes at the wrong backend. The allowlist constrains origin; it does not constrain intent. Multi-backend egress is high-friction by design: the agent must run a separate `reposix init` per backend.
- Cache compromise. An attacker with write access to `cache.db` can replay or hide audit rows from older WAL segments. Append-only triggers prevent in-place tampering on the live segment but cannot defend against the file being swapped wholesale.
Further reading
- Filesystem layer ← — where tainted bytes enter the system.
- Git layer ← — where the conflict and blob-limit cuts are wired.
- `.planning/research/v0.1-fuse-era/threat-model-and-critique.md`: full historical threat model, kept in the planning tree (not user-facing nav).
- `docs/research/agentic-engineering-reference.md`: the lethal-trifecta framing.