# open traces

> Open-source CLI for repo-local agent trace capture, review, and upload to Hugging Face Hub. React inbox, terminal inbox, and structured JSONL schema.

## Links

- Documentation: https://opentraces.ai/docs
- GitHub: https://github.com/jayfarei/opentraces
- Explorer: https://opentraces.ai/explorer
- Schema: https://opentraces.ai/schema

## Full Documentation

---

# opentraces

Open schema + CLI for capturing coding-agent traces, reviewing them locally, and publishing structured datasets to Hugging Face Hub.

opentraces is built around a simple rule: capture locally, review locally, push explicitly. The tool parses agent sessions into a stable schema, runs layered security scanning and redaction, enriches each trace with git and attribution signals, and uploads sharded JSONL to a dataset you control.

## Workflow

```bash
opentraces setup     # wire opentraces into your system
opentraces init      # initialize the project marker
opentraces web       # or: opentraces tui - review the inbox
opentraces blame     # or: opentraces graph - inspect attribution
opentraces add       # stage a trace for the next push
opentraces push      # upload staged traces
opentraces reject    # say no, keep local only
opentraces redact    # find-and-replace before re-pushing
opentraces push      # upload
opentraces pull      # import traces from a remote dataset
```

`init` writes the committable project marker at `.opentraces.json`. Captured traces, runtime state, and upload bookkeeping stay machine-local under `~/.opentraces/projects//`.

## What You Get

**For individual developers.** A local inbox for reviewing traces before upload, plus a standard dataset format you can publish privately or publicly.

**For teams.** Shared remotes on Hugging Face, explicit review policy per repo, and deterministic upload shards that never append in place.

**For dataset consumers.** A schema designed for training, evaluation, analytics, and attribution rather than a raw dump of vendor-specific logs.
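To make the "shards that never append in place" idea concrete, here is a minimal Python sketch of writing each batch of traces as a brand-new JSONL file. It is an illustration only: the `write_shard` helper, the `shard-<digest>.jsonl` naming, and the content-hash scheme are assumptions made for this example, not the shard layout opentraces actually uses.

```python
import hashlib
import json
from pathlib import Path


def write_shard(traces, out_dir):
    """Write traces as one new JSONL shard and return its path.

    Hypothetical helper: the file name is derived from a hash of the content,
    so the same batch always maps to the same shard name and an existing
    shard is never reopened for appending.
    """
    # One compact JSON object per line, with sorted keys for stable bytes.
    lines = [json.dumps(t, sort_keys=True, separators=(",", ":")) for t in traces]
    payload = ("\n".join(lines) + "\n").encode("utf-8")

    digest = hashlib.sha256(payload).hexdigest()[:12]
    path = Path(out_dir) / f"shard-{digest}.jsonl"

    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        # Exclusive-create mode fails rather than overwriting or appending.
        with open(path, "xb") as handle:
            handle.write(payload)
    return path
```

Calling `write_shard` twice with the same batch returns the same path and leaves the file untouched; a different batch produces a different file. That is the property that makes shard uploads safe to retry.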
## Schema Design

The [schema](/docs/schema/overview) is a standalone package and the contract between capture, review, export, and downstream consumers.

- Training: normalized steps, tool calls, observations, reasoning, outcomes
- Analytics: token counts, cost estimates, timing, cache behavior
- Attribution: git links and file or line provenance when available
- Interop: export paths for ATIF and Agent Trace style consumers

## Start Here

| Section | What's inside |
|---------|---------------|
| **[Installation](/docs/getting-started/installation)** | Install, verify, upgrade, uninstall |
| **[Authentication](/docs/getting-started/authentication)** | OAuth, PATs, `HF_TOKEN`, auth precedence |
| **[Quick Start](/docs/getting-started/quickstart)** | Initialize a repo, review traces, upload your first shard |
| **[Commands](/docs/cli/commands)** | Current 0.3 command reference |
| **[Inbox & Review](/docs/workflow/review)** | Web viewer, TUI, and CLI review loop |
| **[Push](/docs/workflow/pushing)** | Upload behavior, remotes, visibility, migration, quality badges |
| **[Security Tiers](/docs/security/tiers)** | Regex, entropy, TruffleHog, Tier 2 review, human approval |
| **[Security Configuration](/docs/security/configuration)** | Global config, project marker, exclusions, custom redaction |
| **[Schema](/docs/schema/overview)** | Trace structure and field semantics |
| **[Consume](/docs/workflow/consume)** | Loading datasets back out of Hugging Face |

---

# Installation

## pipx

```bash
pipx install opentraces
```

## brew

```bash
brew install JayFarei/opentraces/opentraces
```

## skills.sh

```bash
npx skills add jayfarei/opentraces
```

Installs the opentraces skill via [skills.sh](https://skills.sh) so your coding agent can drive the init, review, and push workflow conversationally. `opentraces init` also installs the bundled skill into the current project.
## Copy to your agent

Paste this into your coding agent (Claude Code, Cursor, Codex, etc.):

```
{{AGENT_PROMPT}}
```

The agent installs the CLI, authenticates, and initializes. `init` handles the skill installation automatically. After that the agent uses the skill file for everything else.

## From Source

```bash
git clone https://github.com/JayFarei/opentraces
cd opentraces
python3 -m venv .venv
source .venv/bin/activate
pip install -e packages/opentraces-schema
pip install -e ".[dev]"
```

## Verify Installation

```bash
opentraces --version
opentraces --help
```

## System Requirements

| Platform | Status |
|----------|--------|
| macOS (ARM64, x86_64) | Supported |
| Linux (x86_64, ARM64) | Supported |
| Windows (WSL) | Supported via Linux binary |

Python 3.10 or later is required.

## Upgrading

From inside an initialized project, the preferred path is:

```bash
opentraces setup upgrade
```

This detects whether you installed via `pipx`, Homebrew, pip, or source, upgrades the CLI, and refreshes the project skill or capture hook files.

Outside a project context, upgrade with the package manager you originally used:

```bash
pipx upgrade opentraces
# or
brew upgrade JayFarei/opentraces/opentraces
# or
pip install --upgrade opentraces
```

## Uninstalling

```bash
pipx uninstall opentraces
# or
brew uninstall opentraces
# or
pip uninstall opentraces
```

To also remove local data and credentials:

```bash
rm -rf ~/.opentraces
```

---

# Authentication

opentraces publishes to HuggingFace Hub. You need an HF account.

## Browser Login

```bash
opentraces auth login
```

By default, `opentraces auth login` starts Hugging Face's OAuth device flow in your browser. This is the simplest path on a developer machine and supports the normal dataset workflow.

The CLI requests the scopes it needs to read, create, push, delete, and change visibility on datasets in namespaces you belong to.
## Token Login

```bash
opentraces auth login --token
```

Use a personal access token when you are headless, on CI, or cannot complete the browser flow. Generate the token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens).

## Environment Variable

```bash
export HF_TOKEN=hf_...
```

The CLI checks for `HF_TOKEN` automatically. Useful in CI pipelines where interactive login isn't available.

## Auth Precedence

1. `HF_TOKEN` environment variable
2. Stored credentials from `opentraces auth login`

## Verify

```bash
opentraces auth whoami
```

Shows your authenticated HuggingFace username.

## Logout

```bash
opentraces auth logout
```

Clears stored credentials from `~/.opentraces/credentials`.

---

# Quick Start

From local capture to a published Hugging Face shard.

## 1. Install

```bash
pipx install opentraces
```

## 2. Set Up

```bash
opentraces setup
```

`setup` is the machine-wide wizard. It walks each integration with one prompt, defaults in brackets:

- **claude-code, git, skill** capture hooks [yes]: Stop/PostCompact hooks, post-commit correlator, and the Claude Code skill.
- **watcher** [yes]: background incremental backfill after each commit, powers `opentraces blame`.
- **entity-parser (sem)** [yes]: entity-level diffs for richer commit attribution.
- **HuggingFace login** [yes]: device-code flow, needed before you can push. You can defer and run `opentraces auth login` later.
- **trufflehog** (Tier 1.5) [no]: global secret-scanner toggle; findings redact in place and force review.
- **llm-review** (Tier 2) [no]: global toggle for third-party LLM review; configure provider via `opentraces setup llm-review`.

Per-project review policy and remote are not set here, they live in `opentraces init`.

## 3. Initialize the Project

```bash
opentraces init
```

`init` wires the current repo into opentraces and prompts you for:

- **Agents** to connect (e.g. `claude-code`).
- **Review policy**: `review` every trace in the inbox, or `auto` (capture, sanitize, stage, push without review).
- **HuggingFace login**, if you skipped it during setup.
- **Remote dataset**: pick an existing `owner/repo` from your HuggingFace datasets, **create a new one** (init actually calls `create_repo` on the spot so you catch namespace errors now, not at first push), or skip for later. When creating, you also choose **visibility** (private by default).
- **Import existing traces**: if this repo already has Claude Code sessions, `init` asks whether to import them now or start fresh.

It also writes the committable marker at `.opentraces.json`, registers machine-local storage under `~/.opentraces/projects//`, and installs the per-repo capture hook unless you pass `--no-hook`.

## 4. Inspect the Inbox

### Web inbox

```bash
opentraces web
```

The browser inbox shows each trace with timeline, review, and push flows. It is the richest surface for manual review and redaction.

![Web inbox - review view](/docs/assets/web-review.png)

![Web inbox - graph view](/docs/assets/web-graph.png)

### Terminal inbox

```bash
opentraces tui
```

The TUI is faster for shell-first review. It loads the same local inbox and exposes staging, rejection, discard, security details, and push.

![Terminal inbox](/docs/assets/tui.png)

CLI review is available too:

```bash
opentraces status
opentraces list --stage inbox
opentraces show
opentraces redact
```

## 5. Stage Traces For Upload

```bash
opentraces add --all
```

`add` moves Inbox traces into the visible `staged` set. `blocked` and `rejected` traces are refused until you fix or explicitly reject them.

## 6. Push

```bash
opentraces push
```

`push` uploads staged traces to the active remote as a new JSONL shard and refreshes the dataset card. By default it also runs quality scoring unless you pass `--no-assess`.
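The stage and push steps above amount to a small state machine, which the following Python sketch models. It is a toy model for reasoning about the flow, not opentraces code: the `Stage` enum and `apply` function are hypothetical names, and only the two transitions the quick start describes (`add` and `push`) are included.

```python
from enum import Enum


class Stage(str, Enum):
    INBOX = "inbox"
    STAGED = "staged"
    PUSHED = "pushed"
    REJECTED = "rejected"
    BLOCKED = "blocked"


# Only the transitions described in the quick start are modeled here.
TRANSITIONS = {
    ("add", Stage.INBOX): Stage.STAGED,
    ("push", Stage.STAGED): Stage.PUSHED,
}


def apply(command, stage):
    """Return the stage a trace lands in after `command`, or raise ValueError."""
    stage = Stage(stage)
    # `add` refuses blocked and rejected traces outright.
    if command == "add" and stage in (Stage.BLOCKED, Stage.REJECTED):
        raise ValueError(f"add refuses {stage.value} traces")
    try:
        return TRANSITIONS[(command, stage)]
    except KeyError:
        raise ValueError(f"{command!r} from {stage.value!r} is not modeled") from None
```

Nothing reaches `pushed` without first passing through `staged`, which is the "push explicitly" rule in miniature.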
## What Happens Next

Your traces are available as a Hugging Face dataset:

```python
from datasets import load_dataset

ds = load_dataset("your-name/opentraces")
```

## Next Steps

- [Inbox & Review](/docs/workflow/review) - Web, TUI, and CLI review flows
- [Push](/docs/workflow/pushing) - Remotes, visibility, migrations, and gates
- [Security Tiers](/docs/security/tiers) - Review policy and layered scanning
- [CLI Reference](/docs/cli/commands) - Full 0.3 command surface

---

# Commands

Reference for the current 0.3 `opentraces` CLI.

## Root Command

```bash
opentraces [--json] ...
```

Use `--json` on any command when you want machine-readable output instead of the TTY view.

The current public root commands are:

| Command | What it does |
|---------|---------------|
| `auth` | Log in to Hugging Face, log out, or inspect the active identity |
| `init` | Initialize opentraces in the current repo |
| `remove` | Remove opentraces from the current repo |
| `status` | Show a project snapshot and recent traces |
| `list` | List traces, or list initialized projects with `--projects` |
| `show` | Show one trace in detail |
| `add` | Stage Inbox traces for the next push |
| `reject` | Mark a trace local-only |
| `reset` | Move a trace back to Inbox |
| `redact` | Rewrite sensitive text in a trace |
| `discard` | Permanently delete a local trace |
| `push` | Upload staged traces to Hugging Face Hub |
| `pull` | Import traces from a Hugging Face dataset |
| `llm-review` | Run Tier 2 semantic review over traces |
| `assess` | Score trace quality locally or on a dataset |
| `web` | Open the browser inbox UI |
| `tui` | Open the terminal inbox UI |
| `blame` | Show per-commit attribution for a SHA (optionally one file) |
| `graph` | Render commit + trace history (commit-primary or trace-primary) |
| `resume` | Resume the upstream agent session behind a trace |
| `export` | Export staged traces to another format |
| `log` | List uploaded traces grouped by date |
| `stats` | Show aggregate inbox statistics |
| `remote` | Manage dataset remotes |
| `config` | Show or set config values |
| `setup` | Install integrations like hooks, TruffleHog, and llm-review |
| `doctor` | Check security pipeline and integration health |
| `completions` | Print or install shell completions |

Three more commands are available but intentionally omitted from the default `--help` listing because they are advanced or typically driven by other surfaces: `backfill` (called by `watcher` and `init --import-existing`), `git-backfill` (retroactively correlates inbox traces to past commits after first install of the post-commit hook), and `watcher` (managed by `setup watcher`). All three are documented under [Advanced Commands](#advanced-commands) below.

## Authentication

### `opentraces auth`

```bash
opentraces auth whoami
opentraces auth login
opentraces auth logout
```

Subcommands:

- `login` starts the browser device flow by default
- `login --token` switches to the CI / headless flow and prompts for a PAT (the flag is a boolean, not a value)
- `whoami` reports the active HuggingFace identity
- `logout` clears the stored Hugging Face credential

### `opentraces auth login`

```bash
opentraces auth login
opentraces auth login --token
```

| Flag | Description |
|------|-------------|
| `--token` | Boolean. Switches to the headless flow and prompts for a PAT instead of opening the browser device flow |

## Project Commands

### `opentraces init`

```bash
opentraces init
opentraces init --review-policy review
opentraces init --review-policy auto
opentraces init --remote owner/my-traces --public
opentraces init --import-existing
```

Initializes the current repo, writes `.opentraces.json`, registers machine-local state under `~/.opentraces/projects//`, and installs the capture hook unless you pass `--no-hook`.

| Flag | Description |
|------|-------------|
| `--agent [claude-code]` | Agent runtime to connect |
| `--no-hook` | Skip Claude Code hook installation |
| `--import-existing / --start-fresh` | Backfill existing Claude Code traces for this repo, or start from the next run |
| `--review-policy [review|auto]` | Whether safe traces require manual review |
| `--remote TEXT` | Hugging Face dataset repo in `owner/name` form |
| `--private / --public` | Default visibility when creating the remote |

### `opentraces remove`

```bash
opentraces remove
opentraces remove --all
```

Uninstalls the capture hook, deletes the `.opentraces.json` marker, unregisters the repo from the global registry, and removes the machine-local `~/.opentraces/projects//` directory. Pushed datasets on Hugging Face are left untouched.

| Flag | Description |
|------|-------------|
| `--all` | Also delete the audit ref (`refs/opentraces/audit/*`) and the trace-to-commit notes (`refs/notes/opentraces`) from this repository |

### `opentraces status`

```bash
opentraces status
opentraces status --limit 0
```

Shows stage counts, the active remote, and recent traces (default limit `10`).

| Flag | Description |
|------|-------------|
| `--limit INTEGER` | How many recent traces to show, `0` for all. Default `10`. |

## Review And Inbox Commands

### `opentraces list`

```bash
opentraces list
opentraces list --stage inbox
opentraces list --remote origin
opentraces list --projects
opentraces list --by-commit
```

| Flag | Description |
|------|-------------|
| `--projects` | List initialized projects instead of traces |
| `--remote TEXT` | Filter to traces missing on the named remote |
| `--stage TEXT` | Filter by visible stage |
| `--model TEXT` | Filter by model |
| `--agent TEXT` | Filter by agent |
| `--limit INTEGER` | Max rows to show |
| `--by-commit` | Group results by commit |

Visible stages are `inbox`, `staged`, `pushed`, `rejected`, and `blocked`.
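For scripting against commands like `list` and `status`, a thin Python wrapper around the root `--json` flag can be sketched as follows. The helper names `build_command` and `run_json` are hypothetical, and the sketch assumes that stdout carries a single JSON document when `--json` is set; the shape of that document is not specified here, so the result is returned as-is.

```python
import json
import subprocess


def build_command(*args):
    """Build an argv list with the root-level --json flag placed before the subcommand."""
    return ["opentraces", "--json", *args]


def run_json(*args):
    """Run an opentraces command and parse its stdout as JSON.

    Assumption: with --json, stdout is one JSON document. Raises
    subprocess.CalledProcessError on a non-zero exit code.
    """
    proc = subprocess.run(
        build_command(*args),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```

For example, `run_json("list", "--stage", "inbox")` would invoke `opentraces --json list --stage inbox` and hand back whatever structure the CLI emits.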
### `opentraces show`

```bash
opentraces show
opentraces show --verbose
opentraces show --markdown
```

`show` prints the trace prompt, steps, tool calls, observations, and outcome. Human output truncates long step content unless you pass `--verbose`.

| Flag | Description |
|------|-------------|
| `--verbose` | Show full step content |
| `--markdown` | Emit the trace wrapped for safe LLM handoff |

### `opentraces add`

```bash
opentraces add
opentraces add abc12 def34
opentraces add --all
```

Stages Inbox traces for the next push.

| Flag | Description |
|------|-------------|
| `--all` | Stage every Inbox trace |

`add` refuses `blocked` and `rejected` traces.

### `opentraces reject`

```bash
opentraces reject
```

Marks a trace local-only so it will not be pushed.

### `opentraces reset`

```bash
opentraces reset
```

Moves a trace back to Inbox.

### `opentraces redact`

```bash
opentraces redact
opentraces redact "ACME_INTERNAL_TOKEN"
opentraces redact "sk-[A-Za-z0-9]+" --regex
opentraces redact "secret" --field observations --step 3
```

Find and replace text in a stored trace. `PATTERN` is a required positional argument; without `--regex` it is treated as a literal string.

| Flag | Description |
|------|-------------|
| `--regex` | Treat `PATTERN` as a regular expression instead of a literal string |
| `--field TEXT` | Limit the rewrite to one field (e.g. `prompt`, `observations`, `outcome`) |
| `--step INTEGER` | Limit the rewrite to a specific step index |

### `opentraces discard`

```bash
opentraces discard
opentraces discard --yes
```

Permanently deletes the local trace.

| Flag | Description |
|------|-------------|
| `--yes` | Skip the interactive confirmation prompt |

### `opentraces web`

```bash
opentraces web
opentraces web --port 6060 --no-open
```

| Flag | Description |
|------|-------------|
| `--port INTEGER` | Port for the local web inbox |
| `--no-open` | Do not open the browser automatically |

### `opentraces tui`

```bash
opentraces tui
opentraces tui --fullscreen
opentraces tui --limit 0
```

| Flag | Description |
|------|-------------|
| `--fullscreen` | Open directly into fullscreen inspect mode |
| `--limit INTEGER` | Maximum traces to load, `0` for all |

## Push And Import

### `opentraces push`

```bash
opentraces push
opentraces push --private
opentraces push --llm-review
opentraces push --repo owner/team-traces
opentraces push --no-assess
```

Uploads staged traces to Hugging Face Hub as a new shard.

| Flag | Description |
|------|-------------|
| `--private` | Force private visibility |
| `--public` | Force public visibility |
| `--publish` | Change an existing private dataset to public without uploading |
| `--gated` | Enable gated access on the dataset |
| `--repo TEXT` | Destination repo, defaulting to `username/opentraces` |
| `--assess / --no-assess` | Run quality scoring and include dataset-card badges |
| `--llm-review` | Require a clean Tier 2 verdict on every staged trace |
| `--no-trufflehog` | Skip Tier 1.5 TruffleHog for this push only |
| `--migrate-remote / --no-migrate-remote` | Auto-migrate older-schema remote shards |
| `-y, --yes` | Skip interactive prompts |

### `opentraces pull`

```bash
opentraces pull owner/dataset --parser hermes
opentraces pull owner/dataset --parser hermes --limit 10 --dry-run
opentraces pull owner/dataset --parser hermes --auto
```

Imports traces from a Hugging Face dataset.

| Flag | Description |
|------|-------------|
| `--parser TEXT` | **Required.** Import format parser, currently `hermes` |
| `--subset TEXT` | Dataset subset or config |
| `--split TEXT` | Dataset split, default `train` |
| `--limit INTEGER` | Max rows to import, `0` for all |
| `--auto` | Auto-commit imported traces |
| `--dry-run` | Parse and report without writing |

### `opentraces export`

```bash
opentraces export --format agent-trace
opentraces export --format atif
opentraces export --format atif --output /tmp/traces.jsonl
```

Exports staged traces to another format. If no traces are staged the command exits 0 with a notice and does not create the output file.

| Flag | Description |
|------|-------------|
| `--format [atif|agent-trace]` | **Required.** Target format |
| `--output PATH` | Output file path. Default `./opentraces-export.jsonl` |

## Quality And Security

### `opentraces assess`

```bash
opentraces assess
opentraces assess --judge --judge-model sonnet
opentraces assess --dataset owner/team-traces
opentraces assess --explain
```

Local mode assesses staged traces, falling back to all local traces if nothing is staged yet.

| Flag | Description |
|------|-------------|
| `--limit INTEGER` | Max traces to assess |
| `--dataset TEXT` | Assess a remote Hugging Face dataset |
| `--judge / --no-judge` | Enable the LLM judge |
| `--judge-model [haiku|sonnet|opus]` | Judge model |
| `--dry-run` | Print the assessment only |
| `--explain` | Show the rubric glossary and exit |

### `opentraces llm-review`

```bash
opentraces llm-review
opentraces llm-review --scope staged
opentraces llm-review --trace 8a3f1c
opentraces llm-review --dry-run
```

Runs Tier 2 semantic review using the provider configured by `opentraces setup llm-review`, unless you override it on the command line.

| Flag | Description |
|------|-------------|
| `--api-format [openai-compat|ollama|anthropic|fake]` | Override the wire protocol |
| `--model TEXT` | Override the model |
| `--base-url TEXT` | Override the OpenAI-compatible base URL |
| `--api-key-env TEXT` | Override the env var containing the API key |
| `--scope [all|inbox|staged]` | Choose which traces to review |
| `--trace TEXT` | Review specific trace IDs, repeatable |
| `--limit INTEGER` | Cap the batch size |
| `--dry-run` | Estimate token usage only |
| `--force` | Re-review traces that already have a cached verdict |
| `--context-file FILE` | Pass project context such as `README.md` or `AGENTS.md` |

### `opentraces doctor`

```bash
opentraces doctor
opentraces doctor --security
```

Checks configured integrations, versions, and security tiers. It exits non-zero when a required integration is broken.

| Flag | Description |
|------|-------------|
| `--security` | Show only the security pipeline view |

For the LLM trace review tier, `doctor` also surfaces the active setup: backend and model, endpoint URL, API format, whether the configured `api_key_env` variable is set, and the probe result (e.g. model count, unreachable reason, or `not found` when the configured model is missing from the endpoint's catalog). Use this to confirm that `opentraces setup llm-review` wrote the expected values before running `opentraces llm-review`.
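Because `doctor` exits non-zero when a required integration is broken, it can act as a gate in automation. The Python sketch below shows one way to wrap it; `doctor_ok` is a hypothetical helper, and the injectable `run` parameter exists only so the function can be exercised without the CLI installed.

```python
import subprocess


def doctor_ok(security_only=False, run=subprocess.run):
    """Return True when `opentraces doctor` exits with status 0.

    Hypothetical helper: `run` defaults to subprocess.run and only needs to
    return an object with a `returncode` attribute.
    """
    command = ["opentraces", "doctor"]
    if security_only:
        # Restrict the check to the security pipeline view.
        command.append("--security")
    return run(command).returncode == 0
```

A CI step could call `doctor_ok()` and fail the job when it returns `False`, before attempting `opentraces llm-review` or `opentraces push`.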
## Remote Management

### `opentraces remote`

```bash
opentraces remote list
opentraces remote list -v
opentraces remote add owner/dataset
opentraces remote create owner/team-traces --private
opentraces remote visibility owner/dataset --public
opentraces remote remove owner/dataset
opentraces remote remove owner/dataset --delete-remote --yes
opentraces remote delete owner/dataset --yes
```

Subcommands:

- `add` connects an existing dataset
- `create` creates a new dataset and connects it
- `list` shows connected remotes
- `remove` disconnects a remote locally
- `delete` deletes the remote dataset and disconnects it
- `visibility` flips a remote between private and public

Positional `REPO` is optional on `remove` and `delete` when exactly one remote is connected.

#### `opentraces remote list`

| Flag | Description |
|------|-------------|
| `-v` | Show full dataset URLs instead of the short `owner/name` form |

#### `opentraces remote create`

| Flag | Description |
|------|-------------|
| `--private / --public` | Visibility of the new dataset (default `--private`) |
| `--gated` | Enable gated access on the new dataset |

#### `opentraces remote visibility`

| Flag | Description |
|------|-------------|
| `--private / --public` | Target visibility for the remote |

#### `opentraces remote remove`

| Flag | Description |
|------|-------------|
| `--delete-remote` | Also delete the upstream Hugging Face dataset, not just the local connection |
| `--yes` | Skip the interactive confirmation prompt |

#### `opentraces remote delete`

| Flag | Description |
|------|-------------|
| `--yes` | Skip the interactive confirmation prompt |

## Configuration And Setup

### `opentraces config show`

```bash
opentraces config show
opentraces --json config show
```

Shows the effective config with secrets masked.
### `opentraces config set`

```bash
opentraces config set classifier_sensitivity high
opentraces config set custom_redact_strings ACME_INTERNAL_TOKEN --append
opentraces config set excluded_projects /path/to/repo --append
opentraces config set review_policy auto --project
```

| Flag | Description |
|------|-------------|
| `--project` | Write to `/.opentraces.json` |
| `--global` | Write to `~/.opentraces/config.json` |
| `--append` | Append to a list-typed key |

Default scope is global.

### `opentraces setup`

```bash
opentraces setup
opentraces setup claude-code
opentraces setup git
opentraces setup trufflehog
opentraces setup llm-review
opentraces setup review-policy --auto
opentraces setup upgrade
```

Current setup subcommands:

- `claude-code` installs the capture hooks
- `entity-parser` downloads and verifies the `ot-entities` binary
- `git` installs the post-commit correlator hook
- `llm-review` configures the Tier 2 reviewer
- `review-policy` changes the repo's review policy
- `skill` installs the opentraces skill globally
- `trufflehog` enables Tier 1.5 TruffleHog
- `upgrade` upgrades the CLI and refreshes project files
- `watcher` installs or removes the background attribution watcher

### `opentraces setup trufflehog`

```bash
opentraces setup trufflehog
opentraces setup trufflehog --enable
opentraces setup trufflehog --disable
```

| Flag | Description |
|------|-------------|
| `--enable` | Turn Tier 1.5 on, failing if the binary is not present |
| `--disable` | Turn Tier 1.5 off |
| `--project` | Scope the setting to the project marker instead of global config |

Tier 1.5 findings are redacted in place and force human review before push.
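Since `--enable` fails when the TruffleHog binary is missing, a provisioning script may want to check for it first. The Python sketch below does that under two assumptions that are not stated in this reference: that the binary is named `trufflehog` and that opentraces finds it on `PATH`. The helper name `plan_trufflehog_setup` is hypothetical.

```python
import shutil


def plan_trufflehog_setup(which=shutil.which):
    """Return the argv that enables Tier 1.5, or None if the binary is absent.

    Assumptions: the scanner binary is called `trufflehog` and is resolved via
    PATH. `which` is injectable so the check can be tested without the binary.
    """
    if which("trufflehog") is None:
        # Enabling would fail, so report that there is nothing to run.
        return None
    return ["opentraces", "setup", "trufflehog", "--enable"]
```

A caller can pass the returned list to `subprocess.run`, or print an install hint when the result is `None`.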
### `opentraces setup llm-review`

```bash
opentraces setup llm-review
opentraces setup llm-review --api-format openai-compat --base-url http://localhost:11434/v1 --model gemma3n:e4b
opentraces setup llm-review --disable
opentraces setup llm-review --print
```

| Flag | Description |
|------|-------------|
| `--api-format [openai-compat|ollama|anthropic|fake]` | Reviewer transport |
| `--base-url TEXT` | Base URL for OpenAI-compatible backends |
| `--model TEXT` | Model name |
| `--api-key-env TEXT` | Env var holding the API key |
| `--timeout FLOAT` | Request timeout |
| `--disable` | Turn llm-review off |
| `--enable` | Turn llm-review on using the current config |
| `--test` | Ping the endpoint without writing config |
| `--print` | Print the effective config |
| `--no-interactive` | Skip the preset picker |
| `--project` | Scope the change to the project marker |

### `opentraces setup review-policy`

```bash
opentraces setup review-policy --review
opentraces setup review-policy --auto
opentraces setup review-policy --print
```

`--auto` auto-approves safe traces into `staged`. Push remains explicit.

| Flag | Description |
|------|-------------|
| `--review` | Set policy to `review` (manual approval required before push) |
| `--auto` | Set policy to `auto` (safe traces are auto-staged; push remains explicit) |
| `--print` | Print the current policy and exit without writing |
| `--project` | Write to the project marker. Default for this command |

### `opentraces setup claude-code`

```bash
opentraces setup claude-code
opentraces setup claude-code --dry-run
opentraces setup claude-code --remove
```

Installs (or removes) the Claude Code capture hooks into your Claude Code settings file.

| Flag | Description |
|------|-------------|
| `--hooks-dir TEXT` | Directory to drop hook scripts into. Default `~/.claude/hooks/` |
| `--settings-file TEXT` | Path to the Claude Code settings file. Default `~/.claude/settings.json` |
| `--dry-run` | Print the planned hook changes without writing |
| `--remove` | Uninstall previously-installed hooks |

### `opentraces setup entity-parser`

```bash
opentraces setup entity-parser
opentraces setup entity-parser --force
```

Downloads and verifies the `ot-entities` binary used to expand function/class changes in `blame` and `graph`.

| Flag | Description |
|------|-------------|
| `--force` | Re-download even if the binary is already installed |

### `opentraces setup git`

```bash
opentraces setup git
opentraces setup git --remove
```

Installs the post-commit correlator hook that attributes commits to traces.

| Flag | Description |
|------|-------------|
| `--remove` | Uninstall the hook |

### `opentraces setup skill`

```bash
opentraces setup skill
opentraces setup skill --harness claude-code
opentraces setup skill --remove
```

Installs the `opentraces` skill so Claude Code (and compatible harnesses) can drive the CLI.

| Flag | Description |
|------|-------------|
| `--harness TEXT` | Target harness, repeatable (e.g. `claude-code`). Defaults to every supported harness |
| `--remove` | Uninstall the skill |

### `opentraces setup watcher`

```bash
opentraces setup watcher
opentraces setup watcher --interval 600
opentraces setup watcher --no-install
opentraces setup watcher --uninstall
```

Installs (or removes) the background attribution watcher service. The watcher polls enlisted projects and runs `backfill` when new commits or Claude Code sessions appear.

| Flag | Description |
|------|-------------|
| `--interval INTEGER` | Poll interval in seconds. Default `300` |
| `--no-install` | Update config only; don't install the system service |
| `--uninstall` | Remove the installed service |

### `opentraces setup upgrade`

```bash
opentraces setup upgrade
opentraces setup upgrade --skill-only
```

Upgrades the opentraces CLI and refreshes project-side files (skill, hooks) where relevant.

| Flag | Description |
|------|-------------|
| `--skill-only` | Refresh only the installed skill, skip the CLI upgrade step |

## Advanced Commands

The first commands in this section (`blame`, `graph`, `resume`, `stats`, `log`, `completions`) are part of the default `--help` listing. `backfill`, `git-backfill`, and `watcher` are not: they are real commands but are intentionally hidden from the default listing because they are usually driven by `watcher`, `setup git`, and `setup watcher` respectively.

### `opentraces blame`

```bash
opentraces blame abc1234                               # Commit-mode (bare SHA)
opentraces blame c:abc1234 src/main.py                 # Commit-mode, single file
opentraces blame abc1234 --lines                       # Per-line view
opentraces blame t:4dccb032                            # Trace-mode (canonical)
opentraces blame s:92437382 --include-overlapping      # Trace-mode (upstream session)
opentraces blame abc1234 --json                        # Structured output
```

Two modes, one argument:

- **Commit-mode** (`c:` or bare SHA): which traces contributed to this commit. Uses the attribution cache for per-line detail and merges `refs/notes/opentraces` so hook-linked traces surface even when the attribution cache has no per-line data for that commit.
- **Trace-mode** (`t:`, `s:`, or a bare hyphenated UUID): which commits carry this trace's output. Merges attribution-cache rows (fine-grained) with the trace's `git_links` (hook-linked). Hook-linked rows carry only a tier badge, not per-line counts.

Commit-mode requires a populated attribution cache; run `opentraces backfill` if empty. Trace-mode works from `git_links` alone, so it surfaces hook-linked commits even when per-line attribution hasn't been computed yet.

| Flag | Description |
|------|-------------|
| `--lines` | Per-line output (git-blame-style). Commit-mode only. |
| `--entities` | Expand entity changes (functions, classes) under each trace. Commit-mode only. |
| `--include-overlapping` | Trace-mode: include commits where files and timestamps overlap without direct tool-emit evidence. Off by default. |
| `--project DIRECTORY` | Project directory, default CWD |
| `--json` | Emit structured JSON instead of text |
| `--no-color` | Disable ANSI colors |

### `opentraces graph`

```bash
opentraces graph
opentraces graph --limit 50
opentraces graph --trace abc12
opentraces graph --since HEAD~20 --until HEAD
```

Renders commit + trace history. Commit-primary by default: the git log is the spine and each commit shows the traces that touched it. Pass `--trace ` to pivot to trace-primary mode. Requires a populated attribution cache.

| Flag | Description |
|------|-------------|
| `--limit INTEGER` | Commits per page. Default `20`. |
| `--page INTEGER` | Page number (1-indexed). |
| `--all` | Disable pagination (large `--limit`). |
| `--trace TEXT` | Pivot to trace-primary mode for the given trace id. |
| `--since TEXT` | Show commits after this ref. |
| `--until TEXT` | Show commits up to this ref. |
| `--project DIRECTORY` | Project directory, default CWD. |
| `--entities` | Include entity-change suffixes (requires entity cache). |
| `--no-color` | Disable ANSI colors. |

### `opentraces backfill`

```bash
opentraces backfill
opentraces backfill --rebuild
opentraces backfill --dry-run
```

Backfills per-commit attribution into the local cache. Walks new commits since the last bookmark, correlates them to traces, and populates entity data when the `ot-entities` binary is available.

| Flag | Description |
|------|-------------|
| `--dry-run` | Compute coverage without writing cache files. |
| `--rebuild` | Clear the cache and re-attribute from HEAD. |
| `--since TEXT` | Start from this ref instead of the bookmark (currently forces `--rebuild`). |
| `--project DIRECTORY` | Project directory, default CWD. |
| `--max-commits INTEGER` | Cap on commits to walk when rebuilding. Default `500`. |
| `--json` | Emit a JSON payload instead of the human summary. |
| `-v, --verbose` | Forward verbose logging to the audit builder. |
| `--no-entities` | Skip the entity-parser pass (attribution only). |

### `opentraces git-backfill`

```bash
opentraces git-backfill
opentraces git-backfill --max-commits 2000 --window-hours 48
opentraces git-backfill --json
```

Retroactively correlates inbox traces to past commits. Useful after a first-time install of the post-commit hook (the hook only sees commits after install) or after a period where the hook failed silently. Walks first-parent history, re-runs the live correlator, writes `refs/notes/opentraces`, and persists `git_links` onto each trace's JSONL file. Safe to re-run: notes dedupe on append and `git_links` dedupe before rewrite.

| Flag | Description |
|------|-------------|
| `--project DIRECTORY` | Project directory, default CWD |
| `--max-commits INTEGER` | Cap on first-parent commits to walk. Default `500`. |
| `--window-hours FLOAT` | Match a trace to a commit if `timestamp_end` is within this many hours of the commit's date (either side). Default `24.0`. |
| `--json` | Emit a JSON payload instead of the human summary |

### `opentraces watcher`

```bash
opentraces watcher start
opentraces watcher start --interval 600 --no-install
opentraces watcher status
opentraces watcher status --json
opentraces watcher tick
opentraces watcher tick --project /path/to/repo --json
opentraces watcher stop
opentraces watcher restart
opentraces watcher uninstall
```

Manages the background attribution watcher service (installed by `opentraces setup watcher`). The watcher polls enlisted projects and runs incremental `backfill` when new commits or Claude Code sessions appear.

Subcommands: `start`, `stop`, `restart`, `status`, `tick` (one diagnostic pass), `uninstall`.

#### `opentraces watcher start`

| Flag | Description |
|------|-------------|
| `--interval INTEGER` | Poll interval in seconds. Default `300` |
| `--no-install` | Start in foreground only; don't register the system service |

#### `opentraces watcher status`

| Flag | Description |
|------|-------------|
| `--json` | Emit structured JSON instead of the human summary |

#### `opentraces watcher tick`

| Flag | Description |
|------|-------------|
| `--project DIRECTORY` | Tick only the given project. Default: every enlisted project |
| `--json` | Emit structured JSON instead of the human summary |

`stop`, `restart`, and `uninstall` take no flags beyond `--help`.

### `opentraces resume`

```bash
opentraces resume
opentraces resume --dry-run
opentraces resume --at-step s42
```

Resumes the upstream agent session behind a trace. Accepts a full `trace_id` or a `t:XX` / `XX` prefix (2+ chars). For claude-code the command execs `claude --resume `; other agents print the native resume command instead.

| Flag | Description |
|------|-------------|
| `--at-step TEXT` | Fork a new Claude Code session from a specific step id (e.g. `s42`). |
| `--dry-run` | Print the resume command instead of exec'ing it. |

### `opentraces stats`

```bash
opentraces stats
```

Rolls up every local trace into counts, token totals, cost estimates, and a model breakdown.

### `opentraces log`

```bash
opentraces log
opentraces log --verbose
opentraces log --limit 0
```

Lists the recent traces that have been pushed, grouped by date. Only the `pushed` stage is walked, so in-progress Inbox or staged work is ignored.

Default output is one row per day with the push count, the destination remote(s), and the local time range of pushes:

```
2026-04-16 6 pushed → origin 10:36–17:44
2026-04-15 1 pushed → origin 11:42
```

`--verbose` / `-v` expands each day into per-trace rows with a short trace id, push time, model, and the first line of the task description, with total tokens and an estimated cost per day.
The verbose view reads each trace file so it is slower on large inboxes: ``` 2026-04-16 6 pushed → origin 10:36–17:44 (430.7k tokens, ~$225.90) 785ddc93 10:36 opus-4-6 fix(cli): restore opentraces log… [62.6k tokens] 2cfe7e14 10:36 opus-4-6 docs: audit commands reference [203.0k tokens] … ``` | Flag | Description | |------|-------------| | `--limit INTEGER` | Max days of history to show. `0` for no limit. Default `30`. | | `-v, --verbose` | Expand each day into per-trace rows with model, token totals, and task description | ### `opentraces completions` ```bash opentraces completions install opentraces completions install zsh --alias otd opentraces completions install zsh --alias otd --alias ot --quiet opentraces completions uninstall opentraces completions uninstall zsh --quiet ``` Prints or installs shell completion scripts. Both `install` and `uninstall` take an optional positional shell name (`bash`, `zsh`, or `fish`); if omitted, the current shell is detected automatically. #### `opentraces completions install` | Flag | Description | |------|-------------| | `--alias NAME` | Also bind completion to `NAME`. Repeatable (e.g. `--alias otd --alias ot`) | | `-q, --quiet` | Suppress the confirmation output | #### `opentraces completions uninstall` | Flag | Description | |------|-------------| | `-q, --quiet` | Suppress the confirmation output | --- # Supported Agents 0.3 separates live capture from import support. ## Current Support | Mode | Identifier | Status | Notes | |------|------------|--------|-------| | Live capture | `claude-code` | Supported | Installed via `opentraces init` or `opentraces setup claude-code` | | Dataset import | `hermes` | Supported | Used with `opentraces pull --parser hermes` | Planned adapters can follow the same contracts without changing the inbox, push, or schema layers. ## Live Capture vs Import Live capture adapters discover and parse session files on disk. 
Import adapters read external datasets or files and map them into `TraceRecord`. That distinction matters in the public CLI:

```bash
opentraces init --agent claude-code
opentraces pull owner/dataset --parser hermes
```

## Adapter Contracts

The capture layer exposes small protocols:

- `SessionParser` for live agent session parsing
- `FormatImporter` for file or dataset imports
- `HookInstaller` for external integrations like Claude Code and git

This is why review, security, and push stay consistent even as new sources are added.

## What Gets Normalized

All supported sources are normalized into the same schema with:

- trace-level metadata
- steps and reasoning content
- tool calls and observations when the source provides them
- outcomes and metrics
- attribution and git links when available

---

# Troubleshooting

## First Checks

Start with:

```bash
opentraces status
opentraces doctor
opentraces doctor --security
```

`status` tells you what the current repo thinks is staged or waiting. `doctor` tells you whether required integrations are misconfigured.

## Common Problems

### Not Initialized

If the CLI says the repo is not initialized, run:

```bash
opentraces init
```

The current repo marker is `.opentraces.json`, not `.opentraces/config.json`.

### No Traces Showing Up

Check:

```bash
opentraces status
opentraces list --stage inbox
opentraces setup claude-code
```

If you are using Claude Code, make sure the capture hooks are installed and that the repo has actual Claude Code session files under `~/.claude/projects/`.
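As a quick sanity check for that last point, the sketch below counts candidate session files under `~/.claude/projects/`. It is a minimal illustration and not part of opentraces: the helper name `count_session_files` is hypothetical, and the `.jsonl` suffix is an assumption about how Claude Code stores sessions on disk.

```python
from pathlib import Path


def count_session_files(root: Path, suffix: str = ".jsonl") -> int:
    """Count files ending in `suffix` anywhere under `root`.

    Returns 0 when `root` does not exist. The `.jsonl` default is an
    assumption about the session file format, not a documented contract.
    """
    if not root.is_dir():
        return 0
    return sum(1 for p in root.rglob(f"*{suffix}") if p.is_file())


if __name__ == "__main__":
    root = Path.home() / ".claude" / "projects"
    print(f"{count_session_files(root)} session file(s) under {root}")
```

A count of zero suggests the agent has not written any sessions yet, so there is nothing for the capture hook to parse.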
### Blocked Traces Inspect them with: ```bash opentraces list --stage blocked opentraces show ``` Then either redact, reset, or reject as needed: ```bash opentraces redact opentraces reset opentraces reject ``` ### Push Fails Common causes: - no Hugging Face auth - no remote configured - `--llm-review` requested but staged traces do not have clean verdicts - a configured integration is broken Useful commands: ```bash opentraces auth whoami opentraces remote list opentraces llm-review --scope staged opentraces doctor ``` ### TruffleHog Enabled But Missing If `doctor` reports that TruffleHog is enabled but unavailable: ```bash opentraces setup trufflehog # or opentraces setup trufflehog --disable ``` ### LLM Review Unreachable First check what `doctor` sees: ```bash opentraces doctor --security ``` The LLM trace review row shows the configured backend and model, the endpoint URL, whether the `api_key_env` variable is set, and the probe result against that endpoint. Common signals: - `probe: ... not found` — the configured model is not in the endpoint's catalog (pull it or update the model name) - `probe: UNREACHABLE ...` — the endpoint did not respond (start the local server, check the URL) - `api key env: $VAR (unset)` — remote backend needs an API key that is not exported in this shell Re-test or reconfigure it: ```bash opentraces setup llm-review --test opentraces setup llm-review opentraces setup llm-review --disable ``` ### Resetting A Repo To remove opentraces from the current repo cleanly: ```bash opentraces remove opentraces remove --all ``` To clear the stored Hugging Face credential: ```bash opentraces auth logout ``` --- # Parsing Parsing is the ingestion step that turns raw agent logs or imported datasets into local `TraceRecord` JSONL files under `~/.opentraces/projects//traces/`. ## What Runs Automatically When `opentraces init` installs the Claude Code hook, capture runs automatically after each session ends. The capture path: 1. 
Finds new Claude Code session files under `~/.claude/projects/` 2. Parses the raw session into a `TraceRecord` 3. Filters out trivial traces with fewer than 2 steps or no tool calls 4. Runs the enrichment and security pipeline 5. Writes the result into the project's machine-local trace store 6. Updates local state so the trace surfaces as `inbox`, `staged`, `rejected`, `pushed`, or `blocked` ## Enrichment Pipeline Every parsed trace is enriched before staging: | Step | What it does | Example output | |------|-------------|----------------| | Git signals | Detects repo state and later correlates commits back to traces | active branch, git links, lifecycle | | Attribution | Maps Edit and Write tool calls to file and line ranges when possible | `auth.py L42-67` attributed to step 4 | | Dependencies | Extracts from manifests and install commands | `["flask", "pydantic"]` from `pyproject.toml` | | Metrics | Aggregates token counts, cost, cache rates | `cache_hit_rate: 0.91`, `estimated_cost_usd: 3.21` | | Security scan | Regex + entropy scan, optional TruffleHog, redaction | sensitive strings rewritten before review | | Anonymization | Normalizes usernames and local paths | `/Users/alice/project/` becomes a sanitized path | ## Attribution: the three-layer pipeline Attribution is built by three resolvers tried in priority order. The strongest available signal wins per range. 1. **PostToolUse hook** (`src/opentraces/capture/claude_code/hooks/on_tool_use.py`). Fires after every Edit/Write, reads the file from disk, and emits a transcript event with the exact post-edit lines plus a `murmur3:<32-hex>` content hash. This is the authoritative signal — `experimental` stays `false`. 2. **Unified diff.** When no hook event covers a range, the trace's unified diff is parsed to recover line numbers and content. Medium confidence. 3. **`str.find` fallback.** Last-resort textual match of tool output back to the current file content. 
Low confidence; the resulting `attribution.experimental` is `true`. The PostToolUse hook is installed alongside the trace-end capture hook by `opentraces init` (and can be reinstalled with `opentraces setup claude-code`). Its events are consumed at parse time, so the post-edit hashes travel with the trace even if the file is later reformatted. This lets the post-commit correlator match ranges across formatter churn and classify the resulting `GitLink` tier. ## Review Policy Interaction `review_policy` controls where a parsed trace lands: | Policy | Result | |--------|--------| | `review` | Trace lands in `Inbox` for manual review | | `auto` | Clean traces are auto-approved into `staged` | The review surface still exists either way. `blocked` traces and traces with findings still need human attention. ## Parsing Existing Traces To import traces that were recorded before you ran `opentraces init`, pass `--import-existing` at init time: ```bash opentraces init --import-existing ``` This runs a one-off batch parse of all existing Claude Code traces for the current project directory, applying the same enrichment and security pipeline as the hook. For dataset imports instead of live capture, use: ```bash opentraces pull owner/dataset --parser hermes opentraces pull owner/dataset --parser hermes --auto opentraces pull owner/dataset --parser hermes --limit 10 --dry-run ``` `pull` routes imported records through the same staging and security flow before they appear locally. ## What Gets Filtered - Traces with fewer than 2 steps - Traces with zero tool calls - Duplicate traces by `content_hash` - Parse outcomes with errors are marked `blocked` ## Next Step ```bash opentraces web ``` Use the browser inbox or `opentraces tui` to review traces before staging them for push. --- # Inbox The inbox is where traces are reviewed before upload. 
In 0.3 the public review surface is `web`, `tui`, and the flat CLI commands like `list`, `show`, `add`, `reject`, `reset`, `redact`, and `discard`. ## Web Inbox ```bash opentraces web opentraces web --port 6060 --no-open ``` `web` starts the local Flask server for the current project's inbox and opens the React viewer at `http://127.0.0.1:6000` (override with `--port`, pass `--no-open` to skip the browser launch). It is the richest review surface, with side-by-side trace inspection and a built-in push flow. ### Review tab ![Web inbox - review view](/docs/assets/web-review.png) The **review** tab shows the Inbox / Staged / Pushed columns on the left and the selected trace on the right. Switch between the `conversation` and `blame` tabs at the top of the preview to flip between the flattened chat stream and the commit-blame view for that trace. - `j` / `k` — move the inbox selection up / down - `space` — add the selected inbox trace to Staged, or remove it from Staged - `r` — refresh the inbox from the session files on disk - `?` — toggle the review help overlay (also shows the row-legend) - `q` — quit the local server (browser tab closes automatically) Per-row actions are visible on hover: - `+` — stage an inbox trace - `✕` — reject an inbox trace (kept local only) - `−` — unstage a staged trace back to the inbox - `i` — open the security-pipeline modal for that trace The **Push** button at the top of the Staged column opens the push modal. You can push directly, or run an optional Tier-2 LLM review first (requires `opentraces setup llm-review`). The header also exposes a global `i` (project-wide security info) and `?` (help). ### Graph tab ![Web inbox - graph view](/docs/assets/web-graph.png) The **graph** tab is the blame surface. It lists recent commits on the left; selecting one shows every trace that contributed lines to that commit, plus a per-file breakdown with attributed line counts. This is how you answer "which trace produced this code?" 
at commit-granularity. - `j` / `k` — move the commit selection - `enter` — jump to the blamed trace in the review tab - `q` — quit ## Terminal Inbox ```bash opentraces tui opentraces tui --fullscreen opentraces tui --limit 0 ``` The TUI is the shell-native inbox. It loads the same trace set and the same stage model (Inbox / Staged / Pushed) as the web viewer, and exposes trace detail, security status, staging, rejection, discard, and push without leaving the terminal. ![Terminal inbox](/docs/assets/tui.png) The layout is two columns — Info / Inbox / Staged / Pushed on the left, the selected trace's preview on the right. Numeric keys focus a pane directly. ### Navigation - `1` / `2` / `3` / `4` — focus Info / Inbox / Staged / Pushed - `5` — focus the trace preview - `tab` — cycle focus across the panes - `j` / `k` (or `↑` / `↓`) — move selection - `enter` — inspect the selected trace (focus the preview) - `g` / `G` — jump to top / bottom of the preview - `[` / `]` — page the preview up / down from any pane - `a` — toggle conversation view vs. full view ### Actions - `space` — add inbox→staged, or remove staged→inbox - `p` — open the push modal (LLM review or push now) - `r` — refresh (re-capture and reload) - `d` — discard the selected trace (deferred; actually deleted on quit) - `u` — undo the last reject / discard / stage move - `i` — open the security-pipeline modal for the selected trace - `?` — toggle the full help overlay - `q` — quit (flushes pending discards) ### Trace row legend - `·` — normal trace - `◐` (dim cyan) — recently touched in roughly the last 2 hours - `●` (yellow) — security findings still need review - `●` (red) — blocked trace - `↑N` (dim cyan) — session generation; `↑1` is the first captured trace for that session, `↑2+` means the same session kept going and this newer trace replaces an older one. Refresh pulls the latest; review and push the latest generation. 
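When scripting outside the TUI, the generation rule above reduces to "keep the newest trace per session". The sketch below shows one way to do that over already-loaded records. It rests on assumptions: each record is a dict with top-level `session_id` and `timestamp_end` keys, `timestamp_end` is an ISO 8601 string with a consistent offset, and the helper name `latest_generation` is hypothetical rather than an opentraces API.

```python
from datetime import datetime


def latest_generation(records: list[dict]) -> dict[str, dict]:
    """Return, for each session_id, the record with the greatest timestamp_end."""
    latest: dict[str, dict] = {}
    for rec in records:
        sid = rec["session_id"]
        ended = datetime.fromisoformat(rec["timestamp_end"])
        kept = latest.get(sid)
        # Replace the kept record only when this one ended later.
        if kept is None or ended > datetime.fromisoformat(kept["timestamp_end"]):
            latest[sid] = rec
    return latest


# Illustrative records; ids and timestamps are made up.
records = [
    {"trace_id": "a1", "session_id": "s1", "timestamp_end": "2026-04-16T10:36:00+00:00"},
    {"trace_id": "a2", "session_id": "s1", "timestamp_end": "2026-04-16T17:44:00+00:00"},
    {"trace_id": "b1", "session_id": "s2", "timestamp_end": "2026-04-15T11:42:00+00:00"},
]
picked = latest_generation(records)
print(sorted(r["trace_id"] for r in picked.values()))  # ['a2', 'b1']
```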
## CLI

```bash
opentraces list
opentraces list --stage inbox
opentraces list --by-commit
opentraces show
opentraces show --verbose
opentraces show --markdown
opentraces add
opentraces add --all
opentraces reject
opentraces reset
opentraces redact --step 3
opentraces discard --yes
```

Use the CLI when you want scriptable review or a precise edit loop:

- `list` filters the local inbox by stage, model, agent, remote, or commit grouping
- `show` prints the full trace detail, with `--verbose` to remove the default 500-character truncation
- `show --markdown` wraps the trace for safe handoff to another LLM
- `add` stages upload-eligible traces
- `reject` keeps a trace local only
- `reset` moves a trace back to Inbox
- `redact` rewrites the stored trace JSON in place
- `discard` permanently deletes the local trace

## Stage Vocabulary

| Stage | Meaning |
|-------|---------|
| `inbox` | Needs review |
| `staged` | Ready for the next push |
| `pushed` | Published upstream |
| `rejected` | Kept local only |
| `blocked` | Needs action before it can be staged |

Internally the state machine tracks additional states. The public CLI and UIs collapse those down to the visible stages above.

## What To Look For

- Secrets that escaped redaction
- Internal hostnames and collaboration URLs
- Customer names, paths, or identifiers
- Traces that are too short or too trivial
- Tool outputs that should be redacted before sharing

## Inbox Flow

```bash
opentraces add
opentraces add --all
opentraces push
```

If you refreshed and a session produced a newer generation, stage and push the latest generation for that session.

If you want a faster automatic path, set the project to auto-approve clean traces:

```bash
opentraces setup review-policy --auto
```

That still does not push automatically. Upload remains explicit.

---

# Assess

`opentraces assess` scores trace quality against the current downstream-facing rubrics.
```bash
opentraces assess
opentraces assess --judge --judge-model sonnet
opentraces assess --dataset owner/team-traces
opentraces assess --explain
```

Local mode assesses staged traces first. If nothing is staged yet, it falls back to the local trace store so you can still inspect quality before deciding what to upload.

## Push Integration

`opentraces push` runs assessment by default and embeds the resulting scorecard into the dataset card. Use `--no-assess` when you want to skip that pass for a particular push.

## Scoring Model

Assessment is deterministic by default. The core score is computed from Python checks over the `TraceRecord` structure, without external calls or randomness.

An optional LLM judge can add qualitative scoring:

```bash
opentraces assess --judge
opentraces assess --judge --judge-model haiku
opentraces assess --judge --judge-model sonnet
opentraces assess --judge --judge-model opus
```

## Personas

Every trace is scored across five consumer-facing personas:

| Persona | What it checks |
|---------|----------------|
| Conformance | Schema correctness and structural completeness |
| Training | SFT-readiness: dialogue quality, tool-call structure, usable reasoning |
| RL | Outcome and reward-signal usefulness |
| Analytics | Metrics, timing, cost, and observability coverage |
| Domain | Metadata that makes the trace discoverable and reusable |

Run `opentraces assess --explain` for the full glossary and threshold details exposed by the CLI.

## Remote Datasets

To assess a dataset already on Hugging Face:

```bash
opentraces assess --dataset owner/team-traces
```

This is independent of the current local inbox.

## Typical Flows

```bash
opentraces add --all
opentraces assess
opentraces push
```

Or, when you want a stricter push gate:

```bash
opentraces llm-review --scope staged
opentraces push --llm-review
```

---

# Blame

When agents write the code, `git blame` tells you the commit, not the prompt.
`opentraces blame` closes that gap: given a commit it returns the sessions that produced the committed bytes; given a trace id it returns the commits that carry that session's output. See [How It Works](#how-it-works) for the mechanism. > **Experimental — not 100% accurate yet.** Attribution rests on three moving parts (capture hook, post-commit correlator, and watcher-driven audit refs) and any one of them falling behind produces incomplete or misleading results. Treat coverage numbers as best-effort until the pipeline stabilises, and cross-check against `opentraces doctor`, `opentraces graph`, and the raw `refs/notes/opentraces` notes when a number looks wrong. ## Prerequisites Blame needs three things to produce trustworthy output: 1. **Capture hook installed.** Claude Code sessions are captured automatically after `opentraces init` (or `opentraces setup claude-code`). Without this there is no audit ref to blame against. 2. **Post-commit hook installed.** This attaches a note under `refs/notes/opentraces` linking each commit to the contributing traces at the moment the commit lands. ```bash opentraces setup git ``` 3. **Watcher installed.** The background watcher polls the repo, rebuilds the attribution cache from the audit ref, and keeps the entity graph in sync. Running blame without the watcher means the cache can lag HEAD — percentages will read low, traces will read as orphans, and hook-linked commits will not pick up per-line attribution. Install it once per machine: ```bash opentraces setup watcher ``` `opentraces doctor` surfaces the watcher status alongside capture and git-hook state; if any of the three is red, blame output should be treated as preliminary. Old commits cannot be backfilled by the hook — the hook only sees commits after install. Two escape hatches exist: - `opentraces backfill --rebuild` clears the per-line attribution cache and re-attributes everything reachable from `HEAD` using the stored tool-call data. 
Run this after a rebase, squash, or any time the on-disk cache drifts from the audit ref. - `opentraces git-backfill` walks first-parent history, re-runs the live correlator, and writes `refs/notes/opentraces` + per-trace `git_links` for any old commits the hook missed. Useful after a first-time install of the post-commit hook, or after a period where the hook was silently failing (for example, the pre-`0.3.0` PATH-silent-failure bug). ## Graph View `opentraces graph` renders the git log as a spine. Each commit shows the sessions that contributed to it, with inline entity summaries and a coverage percentage. ```bash opentraces graph --limit 8 ``` ![opentraces graph --limit 8](/docs/assets/blame/graph-limit-8.png) Reading the spine: | Glyph | Meaning | |---|---| | `●` | Commit node | | `╭┄` / `├┄` | Session contributing to the next commit | | `├╯` | End of a commit's session group | | `c:` | Commit id (prefix-resolvable by `opentraces show`, `opentraces blame`) | | `s:` | Session id (trace prefix) | | `+N ~M -K fns` | Added / modified / deleted functions or entities | | `100%` | Fraction of the commit's diff covered by bytes recorded in the session's audit ref (Edit/Write tool calls plus reconstructed Bash effects) | Commits with no attached sessions (`c:7c3b1927 marketing skill`) appear as bare nodes — either pre-hook commits, or commits whose hunks came from mutations the reconstructor could not prove (see `pre-audit` under [How It Works](#how-it-works)). ### Graph flags ```bash opentraces graph --trace # Pivot to trace-primary view opentraces graph --since HEAD~20 --until HEAD # Scope by ref range opentraces graph --entities # Expand entity subline per session opentraces graph --all # Disable pagination ``` ## Blame for a Commit `opentraces blame ` resolves one commit to its contributing traces, with per-trace diff coverage, entity-level deltas, and per-file attribution counts. 
```bash opentraces blame ac019172 ``` ![opentraces blame ac019172](/docs/assets/blame/blame-commit.png) The output is four sections: 1. **Commit header.** Overall coverage: how many diff lines map to any traced tool call, how many traces contributed, how many files were touched. 2. **Per-trace rows.** Each `◆ s:` row shows the session's short slug, the model, and its slice of the diff (` of diff lines . %`). Added/modified entities are listed inline. 3. **File list.** Every file in the commit with its attributed-vs-pre-audit line counts. `pre-audit` lines exist in the file but predate the attribution cache — they'll be fully attributed once `opentraces backfill --rebuild` runs. 4. **Attribution cache reference** (when `--json` is passed): the audit ref and revision so consumers can round-trip back to raw evidence. Traces that the hook linked but whose per-line attribution isn't in the cache yet appear in a separate **Hook-linked traces** block below the per-trace rows; run `opentraces backfill` to promote them into the main breakdown. ### Blame flags ```bash opentraces blame # Commit-scoped summary opentraces blame c: # Single-file slice opentraces blame --lines # Per-line (git-blame-style) opentraces blame --entities # Expand per-trace entity lists opentraces blame --json # Structured output for consumers ``` ## Blame for a Trace (inverse blame) Given a trace id instead of a commit, `opentraces blame` walks the relationship in the other direction: which commits carry this session's output. ```bash opentraces blame t:2cfe7e14 # Canonical (ingested) trace opentraces blame s:6606fc1f # Attribution-only session (upstream, pre-init, or forked) opentraces blame 2cfe7e14-…-full-uuid # Bare hyphenated UUID auto-detects opentraces blame t:2cfe7e14 --include-overlapping # Include weak file+time links opentraces blame t:2cfe7e14 --json # Structured output for consumers ``` The argument accepts either prefix form. 
`t:` resolves against canonical traces in the local inbox; `s:` resolves against the staging session ids or attribution-cache entries (useful for forks, or for sessions that never landed in the inbox). A bare hyphenated UUID auto-detects as a trace id; a bare hex string is treated as a commit first and falls back to trace resolution if the commit does not exist. Output is a trace header and a list of commits this trace contributed to: ![opentraces blame t:2cfe7e14](/docs/assets/blame/blame-trace.png) Rows with line-level attribution show real line counts and a coverage percentage; hook-linked rows (where the post-commit hook recorded a link but the attribution cache doesn't yet have per-line data) show a tier badge instead. `--include-overlapping` additionally shows commits with only a weak file+timestamp overlap — off by default because that's coincidence rather than contribution. `--lines`, `--entities`, and a `PATH` argument are commit-mode only; trace-mode output is always summary-level. ## Web Viewer `opentraces web` exposes the same blame data in the browser. Switch to the `graph` tab to browse the commit spine on the left and the per-commit blame on the right. ![opentraces web — graph / blame view](/docs/assets/blame/web-blame-view.png) The viewer is keyboard-first: `j`/`k` navigates commits, `enter` loads the blame panel, `q` quits. The trace-side panel mirrors the CLI, with hook-linked commits collapsed under a `▸ N hook-linked commits (no line counts)` disclosure so the primary list stays dense with line-attributed rows. ## Evidence Tiers Every `GitLink` from trace to commit is evidence-graded. Consumers can filter datasets to a tier floor and drop orphan traces. | Tier | Meaning | |---|---| | `tool_emitted` | Bytes recorded in the session's audit ref (from Edit/Write tool calls or reconstructed Bash effects) appear verbatim in the commit's staged hunks. Gold-standard signal. 
| | `tool_emitted_with_divergence` | File set lines up, but the committed bytes don't hash-match — a formatter, pre-commit hook, or human rewrote the output. Combine with `AttributionRange.original` for recovery. | | `overlapping` | File-set and time-window overlap only, no hash match. Treat as weakly linked. | | `orphan` | No viable commit link. Trace is kept, but don't claim authorship. | The tier appears in `git_links[].tier` on every trace and in the `--json` output of `blame` and `graph`. See [Outcome & Attribution](/docs/schema/outcome-attribution) for the full evidence model and RFC references. ## How It Works `opentraces blame` isn't a wrapper around `git blame`. It builds a parallel Git history — an *audit ref* — that records exactly what each session wrote, then blames against that. You don't need this section to use blame, but it helps when reading the raw refs, debugging coverage, or thinking about where semantic attribution is headed. ### Git in four primitives Git is four stacked concepts. Knowing them makes everything else obvious. | Primitive | What it is | |---|---| | **Blob** | File content plus a hash. No name, no metadata. Content-addressable, so identical bytes dedupe automatically. | | **Tree** | A directory snapshot — a list of `(name, mode, blob-or-tree-hash)` entries. | | **Commit** | A pointer to a root tree plus metadata (author, message, parent(s)). Commits form a DAG through their parents. | | **Reference** | A named pointer to a commit. `main`, `HEAD`, `refs/notes/*` — all just names; updating a branch means moving the pointer. | Git stores **snapshots, not diffs**. A diff is two trees compared on demand. That matters for attribution: we don't need a parallel database to track who wrote what — we can build one out of the same primitives and run existing Git tools against it. ### Why `git blame` alone isn't enough `git blame src/auth.py` tells you which commit last touched each line and who authored that commit. 
When an agent writes the code and a human commits it, blame still points at the human. The reasoning, the prompt, and the session context are all discarded at commit time. We need a second authorship layer: one where the author is the *session*, not the committer.

### A parallel audit history

opentraces builds that second layer out of the same primitives:

```
main branch (refs/heads/main)
  c:abc123  "feat: auth flow"      by alice
  c:def456  "fix: token refresh"   by bob
  c:ghi789  "docs: update"         by alice
      │
      │  correlated via refs/notes/opentraces
      ▼
audit history (refs/opentraces/audit/)
  t:s1abc   "Edit src/auth.py"     by @opentraces.local
  t:s2def   "Write src/token.py"   by @opentraces.local
  t:s3ghi   "Edit README.md"       by @opentraces.local
```

Each time a session mutates a tracked file, whether through an `Edit`/`Write` tool call or through a Bash command whose effect the reconstructor can prove (redirects, heredocs, `mv`/`cp`/`rm`, `sed -i`, `echo`/`printf`/`cat` redirects), the capture hook performs the first three steps below, and the post-commit hook performs the fourth:

1. **Snapshot → blob.** Captures the file's post-mutation bytes. Content-addressed, so identical content never stores twice.
2. **Assemble → tree.** Combines touched files into a tree matching the project layout at that moment.
3. **Seal → commit.** Writes a synthetic commit authored by `@opentraces.local` to `refs/opentraces/audit/`. One commit per snapshot.
4. **Correlate → notes.** When a real commit lands on `main`, the post-commit hook from `opentraces setup git` writes a note to `refs/notes/opentraces` linking the real commit to the audit commits whose bytes appear in its staged hunks. This holds whether the commit lands during the session or much later, as long as the session's stored bytes still show up.

Bash mutations whose effect the reconstructor cannot prove deterministically (arbitrary scripts, binary producers, commands with external state) fall through to `pre-audit`: the file is tracked but the line's authorship is left unclaimed rather than fabricated.

All four steps use native Git.
Nothing lives in a parallel database, there is no custom file format, and no server roundtrip is required. `git log refs/opentraces/audit/` just works, and `git notes --ref=refs/notes/opentraces show ` shows the correlation directly. ### Blame derives from the audit ref With the audit graph in place, per-line attribution reduces to a familiar primitive: ```bash git blame --line-porcelain ``` ...run against the audit ref instead of `main`. Every line comes back attributed to the session that wrote it, because the author email is `@opentraces.local`. `opentraces blame` wraps this with the correlation from `refs/notes/opentraces` so you can start from either side — a commit SHA or a trace ID — and land on the other. The [evidence tiers](#evidence-tiers) above aren't subjective labels either: they're hash comparisons between the audit ref's tree and the real commit's tree. ### Where this is going: semantic attribution Line-level blame is the baseline. The next question — "did this *function* come from that session, even after it moved, got rebased, or was partially rewritten?" — is a three-way tree merge: - **base** = tree before the session ran - **ours** = base plus just that session's Edit ranges applied - **theirs** = the real committed tree The merge result tells you whether the committed code still carries the session's change, partially carries it (touched by a formatter, rebased, cherry-picked, or refactored), or diverged entirely. `AttributionRange.content_hash` is the hook we're preparing for this direction. ## Common Flows ### "Why did this line change?" ```bash git blame src/auth.py | head -5 # Find the commit opentraces blame src/auth.py # Find the session(s) opentraces show s: # Read the prompt + reasoning ``` ### "Which commits carry this session's output?" 
```bash opentraces blame t: # Canonical inbox trace opentraces blame s: # Upstream / fork / pre-init opentraces blame t: --include-overlapping # Include weak file+time links ``` ### "Rebuild attribution after a rebase or squash" ```bash opentraces backfill --rebuild ``` This clears the cache and re-attributes every commit reachable from `HEAD` using the stored tool-call data. The underlying trace JSONL files are not modified: generations with the same `session_id` are replacement snapshots, not appends. ### "I just installed the post-commit hook, link my older commits" ```bash opentraces git-backfill opentraces git-backfill --max-commits 2000 --window-hours 48 ``` Walks first-parent history and retro-correlates inbox traces against each commit. Writes `refs/notes/opentraces` and persists `git_links` onto the staged trace JSONLs so old commits start showing up in `ot graph`, `ot blame c:`, and `ot blame t:`. Safe to re-run: notes dedupe on append and `git_links` dedupe before rewrite. ### "Filter a pushed dataset to tool-emitted traces" ```python from datasets import load_dataset ds = load_dataset("owner/my-traces", split="train") clean = ds.filter( lambda r: any(link["tier"] == "tool_emitted" for link in (r.get("git_links") or [])) ) ``` ## See Also - [Schema: Outcome & Attribution](/docs/schema/outcome-attribution): `GitLink`, `Attribution.revision`, `AttributionRange` - [Schema: Versioning](/docs/schema/versioning): schema 0.3.0 additive changes - [CLI Reference: `blame`, `graph`, `backfill`](/docs/cli/commands) - [Carol Nichols, "Taming Git complexity with Rust and Gitoxide" (FOSDEM 2026)](https://www.youtube.com/watch?v=iSAMvE3yzfc): the four-primitive framing this page's "How It Works" section is built on. --- # Push `opentraces push` uploads staged traces to Hugging Face Hub as a new JSONL shard. It never appends to an existing shard in place.
If nothing is staged yet, review first and run: ```bash opentraces add --all ``` ## Options ```bash opentraces push opentraces push --private opentraces push --public opentraces push --publish opentraces push --gated opentraces push --no-assess opentraces push --repo user/custom-dataset opentraces push --llm-review opentraces push --no-trufflehog ``` | Flag | Default | Description | |------|---------|-------------| | `--private` | off | Force private visibility | | `--public` | off | Force public visibility | | `--publish` | off | Change an existing private dataset to public | | `--gated` | off | Enable gated access on the dataset | | `--assess / --no-assess` | on | Run quality scoring and include badges in the dataset card | | `--llm-review` | off | Require a clean Tier 2 LLM verdict on every staged trace before upload | | `--no-trufflehog` | off | One-shot override: skip Tier 1.5 TruffleHog for this push only | | `--repo` | `{username}/opentraces` | Target HF dataset repo | | `--migrate-remote / --no-migrate-remote` | prompt | Auto-migrate older schema shards on the remote | | `-y, --yes` | off | Skip interactive prompts, including migration confirmation | `--approved-only` is not part of the 0.3 CLI. ## Security Gates Two optional gates can run at push time: - `--llm-review` blocks the upload unless every staged trace carries a clean completed Tier 2 verdict in `metadata.llm_review`. - `--no-trufflehog` is a one-shot escape hatch for projects where Tier 1.5 TruffleHog is enabled in config but you want to skip it just for this push. It does not change the persisted config. When `--llm-review` aborts, the CLI exits `3` and prints `opentraces llm-review` as the hint. ## How Upload Works Each push creates a new JSONL shard. Existing data is never overwritten or appended to. 
```text data/ traces_20260329T142300Z_a1b2c3d4.jsonl traces_20260401T091500Z_e5f6a7b8.jsonl <- new shard from this push ``` That means: - Each push is atomic - No merge conflicts between contributors - Dataset history grows by shard ## Dataset Card `push` generates or updates a `README.md` dataset card on every successful upload. The card aggregates statistics across **all** shards in the repo, not just the current batch, so counts are always accurate. The card records: - schema version - trace counts, steps, and tokens - model and agent distribution - date range - average cost and success rate (when available) A machine-readable JSON block is embedded for programmatic consumers: ```html ``` ### Quality scorecard (`--assess`) Quality scoring is enabled by default during push. The resulting scorecard is embedded into the dataset card with badges, a persona breakdown, and a `quality.json` sidecar. Use `--no-assess` if you want to skip that pass for a particular upload. Here's what the scorecard looks like on a live dataset: [![Overall Quality 78.1%](https://img.shields.io/badge/Overall_Quality-78.1%25-ffc107)](https://opentraces.ai) [![Gate FAILING](https://img.shields.io/badge/Gate-FAILING-dc3545)](https://opentraces.ai) ![Conformance 88.4%](https://img.shields.io/badge/Conformance-88.4%25-28a745) ![Training 89.0%](https://img.shields.io/badge/Training-89.0%25-28a745) ![RL 73.4%](https://img.shields.io/badge/RL-73.4%25-ffc107) ![Analytics 55.7%](https://img.shields.io/badge/Analytics-55.7%25-fd7e14) ![Domain 84.1%](https://img.shields.io/badge/Domain-84.1%25-28a745) The scorecard embeds per-persona scores as shields.io badges, a breakdown table with PASS / WARN / FAIL per rubric, and a `quality.json` sidecar for machine consumers. See [Assess](/docs/workflow/quality) for scoring details. 
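To make the card's cross-shard aggregation concrete, here is a minimal sketch of the kind of roll-up the Dataset Card section describes. It is not the CLI's implementation: `summarize_records` is a hypothetical helper, and it assumes every record from every shard has already been parsed into a dict carrying the `agent` and `metrics` fields documented in the schema pages.

```python
from collections import Counter


def summarize_records(records):
    """Roll up card-style statistics over an iterable of parsed trace dicts.

    Hypothetical helper for illustration only. Field names follow the schema
    docs; missing fields count as zero or "unknown".
    """
    summary = {"traces": 0, "steps": 0, "tokens": 0, "models": Counter()}
    for record in records:
        metrics = record.get("metrics") or {}
        agent = record.get("agent") or {}
        summary["traces"] += 1
        summary["steps"] += metrics.get("total_steps") or 0
        summary["tokens"] += (metrics.get("total_input_tokens") or 0) + (
            metrics.get("total_output_tokens") or 0
        )
        summary["models"][agent.get("model") or "unknown"] += 1
    return summary
```

Feeding the helper records from all shards, rather than only the batch being pushed, is what keeps the totals consistent with the whole repo.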
## Visibility | Setting | Who Can See | Use Case | |---------|-------------|----------| | Private | Only you | Sensitive code or private experiments | | Public | Anyone | Open-source contributions | | Gated | Anyone who requests access | Controlled sharing | ## Push Behavior by Mode In `review` mode, every trace waits in Inbox until a human stages it. In `auto` mode, clean traces are auto-approved into the `staged` set. Push is still explicit. ## Remotes Use `opentraces remote` to manage which Hugging Face dataset this repo pushes to: ```bash opentraces remote list opentraces remote add owner/dataset opentraces remote create owner/team-traces --private opentraces remote visibility owner/dataset --public opentraces remote remove owner/dataset ``` `push --repo owner/dataset` is a one-shot override for the destination. The project's active remote remains unchanged unless you update it through `opentraces remote`. ## Export Export is now part of the public CLI for staged traces: ```bash opentraces export --format agent-trace opentraces export --format atif ``` `agent-trace` emits Agent Trace JSONL. `atif` is present but still a lighter path. Start from the schema docs if you need a custom converter. --- # Consume Once traces are on Hugging Face Hub, you can read them back as files or through the `datasets` library. ## File-Oriented Access [hf-mount](https://github.com/huggingface/hf-mount) exposes a dataset as a virtual filesystem. That works well for agents that prefer normal file operations. 
Install: ```bash curl -fsSL https://raw.githubusercontent.com/huggingface/hf-mount/main/install.sh | sh ``` Mount and inspect: ```bash hf-mount start repo datasets/your-org/agent-traces /mnt/traces ls /mnt/traces/data/ head -n 1 /mnt/traces/data/traces_*.jsonl ``` For private or gated datasets, authenticate first: ```bash hf auth login ``` Unmount when done: ```bash hf-mount stop /mnt/traces ``` ## Structured Access Use Hugging Face `datasets` for notebooks, analysis, or training pipelines. ```python from datasets import load_dataset ds = load_dataset("your-org/agent-traces") print(ds["train"][0]["trace_id"]) ``` For streaming: ```python from datasets import load_dataset ds = load_dataset("your-org/agent-traces", streaming=True) for trace in ds["train"]: print(trace["trace_id"]) ``` ## Record Shape Each JSONL line is a `TraceRecord`. A representative subset looks like: ```json { "schema_version": "0.3.0", "trace_id": "tr_01abc...", "agent": { "name": "claude-code", "model": "..." }, "task": { "description": "Fix failing tests in auth module" }, "metrics": { "total_steps": 14, "estimated_cost_usd": 0.031 }, "steps": ["..."] } ``` See the [schema overview](/docs/schema/overview) for the full contract. ## Local Lookup: Traces, Commits, And Lines Once you install the git correlator with `opentraces setup git`, local commands can resolve code history back to traces. ### Group traces by commit ```bash opentraces list --by-commit opentraces --json list --by-commit ``` ### Resolve a commit back to traces ```bash opentraces blame abc1234 opentraces blame abc1234 src/auth.py opentraces blame abc1234 src/auth.py --lines opentraces --json blame abc1234 ``` `blame` takes a commit SHA (bare or `c:`) and an optional path to scope output to one file. Use `--lines` for git-blame-style per-line output. This is useful for provenance, code archaeology, and dataset filtering by evidence quality. 
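For filtering by evidence quality on the consumer side, the schema reference recommends checking `lifecycle == "final"` first and then `git_links[].tier`. The following is a minimal sketch of that filter over already-parsed records. The helper names `strongest_tier` and `is_revision_anchored`, and the default threshold, are assumptions made for illustration; the tier names and their strongest-to-weakest order are taken from the schema docs.

```python
# Tier names in strongest-to-weakest order, as listed in the schema docs.
TIER_ORDER = ["tool_emitted", "tool_emitted_with_divergence", "overlapping", "orphan"]


def strongest_tier(record):
    """Return the strongest known tier in record["git_links"], or None if there is none."""
    tiers = [link.get("tier") for link in (record.get("git_links") or [])]
    known = [tier for tier in tiers if tier in TIER_ORDER]
    return min(known, key=TIER_ORDER.index) if known else None


def is_revision_anchored(record, weakest_allowed="tool_emitted_with_divergence"):
    """True for final traces whose best link is at least as strong as weakest_allowed.

    Hypothetical helper; the default threshold is an illustrative choice.
    """
    if record.get("lifecycle") != "final":
        return False
    tier = strongest_tier(record)
    return tier is not None and TIER_ORDER.index(tier) <= TIER_ORDER.index(weakest_allowed)
```

With the `datasets` library, the same predicate could be passed to `ds.filter(is_revision_anchored)`; records without `git_links` (for example pre-0.3.0 traces) are simply dropped.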
## Choosing An Access Pattern - Use `hf-mount` when the consumer wants to browse files or let an agent inspect shards directly - Use `datasets` for notebooks, analysis jobs, and training pipelines - Use local `list --by-commit` and `blame` for repo-specific provenance work --- # Schema Overview opentraces uses a training-first JSONL schema where each line is one complete agent trace. The schema is a superset of ATIF v1.6, informed by ADP and field patterns from existing HF datasets. ## Design Principles 1. **Training / SFT** - Clean message sequences with role labels, tool-use as tool_call/tool_result pairs, outcome signals. 2. **RL / RLHF** - Trajectory-level reward signals, step-level annotations, decision point identification. 3. **Telemetry** - Token counts, latency, model identifiers, cache hit rates, cost estimates. 4. **Cross-agent** - Represents traces from Claude Code, Cursor, Cline, Codex, and future agents without agent-specific fields. ## Top-Level Structure ```json { "schema_version": "0.3.0", "trace_id": "uuid", "session_id": "uuid", "content_hash": "", "timestamp_start": "ISO8601", "timestamp_end": "ISO8601", "execution_context": "devtime", "task": { }, "agent": { }, "environment": { }, "system_prompts": { }, "tool_definitions": [ ], "steps": [ ], "outcome": { }, "dependencies": [ ], "metrics": { }, "security": { }, "attribution": { }, "lifecycle": "provisional", "git_links": [ ], "generation_index": 0, "metadata": { } } ``` ## Key Design Decisions | Decision | Rationale | |----------|-----------| | `steps` not `turns` | Each step is an LLM API call, not a conversational turn. Aligns with ATIF's TAO loop. | | `role: "agent"` not `"assistant"` | Follows ATIF convention (`system`, `user`, `agent`). | | Tool calls separated from observations | Preserves call/result separation training pipelines depend on. | | System prompt dedup | Hash-based lookup table. A 20K-token prompt repeated across steps would be wasteful. 
| | `parent_step` per step | Precise parent-child tree for sub-agents, not a flat trace-level array. | | `content_hash` | Two scopes, two algorithms by design. Top-level `TraceRecord.content_hash` is SHA-256 of the serialized record — cryptographic collision resistance for cross-contributor dedup at upload time. `AttributionRange.content_hash` is `murmur3:<32-hex>` — fast cross-tool matching of specific line ranges, per Agent Trace v0.1.0. The murmur3 prefix (added 0.3.0) replaces the prior md5-truncated form and only applies to attribution-range hashes. | | `reasoning_content` | Explicit chain-of-thought field. Improved SWE-Bench by ~3 pts (Cognition data). | | `outcome.committed` | Did the trace's changes get committed? Cheap, deterministic quality signal. | | `attribution` | Embedded Agent Trace block. Bridges trajectory (process) with code attribution (output). | ## Schema Package The schema is a standalone Python package: ```bash pip install opentraces-schema ``` ```python from opentraces_schema import TraceRecord, SCHEMA_VERSION record = TraceRecord( trace_id="abc-123", session_id="sess-456", agent={"name": "claude-code", "version": "1.0.32"}, ) line = record.to_jsonl_line() ``` See [TraceRecord](/docs/schema/trace-record), [Steps](/docs/schema/steps), and [Outcome & Attribution](/docs/schema/outcome-attribution) for field-level detail. --- # TraceRecord The top-level record. One per JSONL line, one per agent trace. ## Identification | Field | Type | Required | Description | |-------|------|----------|-------------| | `schema_version` | string | yes | Schema version, e.g. `"0.3.0"` | | `trace_id` | string (UUID) | yes | Unique identifier for this trace | | `session_id` | string | yes | Agent session reference | | `content_hash` | string | no | SHA-256 of the serialized record, populated when written | | `execution_context` | string | no | `"devtime"` (code-editing agent) or `"runtime"` (action-trajectory / RL agent). Null for pre-0.2 traces. 
| | `lifecycle` | string | no | `"provisional"` (session ended, not yet tied to a revision) or `"final"` (post-commit hook correlated this trace to a commit). Defaults to `"provisional"`. Added 0.3.0 (RFC #25). | | `git_links` | array | no | Evidence-graded links to commits/revisions this trace contributed to. A trace may link to many commits (rebase, squash, long session); a commit may link to many traces (cherry-pick, composition). Added 0.3.0. See [Outcome & Attribution](/docs/schema/outcome-attribution) for the evidence-tier taxonomy and `GitLink` fields. | | `generation_index` | integer | no | Monotonic per-`session_id` generation counter. Generations are replacement snapshots, not stitchable supersets: later generations may carry different redactions, enrichments, or security-pipeline output. Consumers resolving "latest" should group by `session_id` and take `max(generation_index)`. Added 0.3.0. | ## Timestamps | Field | Type | Required | Description | |-------|------|----------|-------------| | `timestamp_start` | string (ISO8601) | no | Session start time | | `timestamp_end` | string (ISO8601) | no | Session end time | ## Task ```json { "task": { "description": "Fix the failing test in src/parser.ts", "source": "user_prompt", "repository": "owner/repo", "repository_url": "https://github.com/owner/repo", "base_commit": "abc123def456..." } } ``` ## Agent ```json { "agent": { "name": "claude-code", "version": "1.0.83", "model": "anthropic/claude-sonnet-4-20250514" } } ``` Model identifiers follow the `provider/model-name` convention. ## Environment ```json { "environment": { "os": "darwin", "shell": "zsh", "vcs": { "type": "git", "base_commit": "abc123...", "branch": "main", "diff": "unified diff string or null" }, "language_ecosystem": ["typescript", "python"] } } ``` ## System Prompts Deduplicated into a top-level lookup table. Steps reference prompts by hash. ```json { "system_prompts": { "sp_a1b2c3": "You are Claude Code..."
} } ``` ## Tool Definitions The trace-level tool schema list. ## Dependencies Package names referenced during the trace. Extracted from manifest files or tool calls. ```json { "dependencies": ["stripe", "prisma", "next"] } ``` ## Metrics ```json { "metrics": { "total_steps": 42, "total_input_tokens": 1800000, "total_output_tokens": 34000, "total_cache_read_tokens": 1650000, "total_cache_creation_tokens": 82000, "total_duration_s": 780, "cache_hit_rate": 0.92, "estimated_cost_usd": 2.4 } } ``` `total_cache_read_tokens` and `total_cache_creation_tokens` are session-level cache aggregates added in 0.3.0 (prompt-cache hits + writes across steps). ## Security ```json { "security": { "scanned": true, "flags_reviewed": 3, "redactions_applied": 1, "classifier_version": "0.1.0" } } ``` ## Metadata Open-ended object for future extensions. ## Notes - `content_hash` is filled in when the record is serialized with `to_jsonl_line()` - `task`, `environment`, `steps`, and the nested blocks all have defaults in the Python model - `security.scanned` confirms the security pipeline (scan, redact, classify) was applied - `task.repository_url` is the canonical remote URL (added 0.3.0, RFC #22). Prefer it over `repository` when normalizing across hosts. --- # Steps The `steps` array contains the conversation as a sequence of LLM API calls. Each step follows the TAO (Thought-Action-Observation) pattern from ATIF. 
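Because system prompts are deduplicated into the top-level `system_prompts` table and each step carries only a hash, a consumer needs a small join to recover the prompt text for a given step. A minimal sketch, assuming the record is already parsed into a dict and that a step's `system_prompt_hash` is used directly as the key into that table, as the examples on this page suggest; `resolve_system_prompts` is a hypothetical helper, not part of the schema package.

```python
def resolve_system_prompts(record):
    """Yield (step_index, prompt_text) for every step that references a system prompt.

    Hypothetical helper for illustration. prompt_text is None when the hash
    is not present in the record's lookup table.
    """
    table = record.get("system_prompts") or {}
    for step in record.get("steps") or []:
        prompt_hash = step.get("system_prompt_hash")
        if prompt_hash is not None:
            yield step.get("step_index"), table.get(prompt_hash)
```

Steps with no `system_prompt_hash` are skipped rather than reported as missing, which keeps user and tool-only steps out of the result.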
## Step Structure ```json { "step_index": 2, "role": "agent", "content": "I'll investigate the failing test...", "reasoning_content": "The user wants me to...", "model": "anthropic/claude-sonnet-4-20250514", "system_prompt_hash": "sp_a1b2c3", "agent_role": "main", "parent_step": null, "call_type": "main", "tools_available": ["bash", "read", "edit", "glob", "grep", "write", "agent"], "tool_calls": [], "observations": [], "snippets": [], "token_usage": {}, "timestamp": "ISO8601" } ``` ## Fields | Field | Type | Required | Description | |-------|------|----------|-------------| | `step_index` | integer | yes | Sequential step number | | `role` | string | yes | `"system"`, `"user"`, or `"agent"` | | `content` | string | no | Message content; may be empty for pure tool or warmup steps | | `reasoning_content` | string | no | Thinking content | | `model` | string | no | Model used (`provider/model-name`) | | `system_prompt_hash` | string | no | Reference to `system_prompts` lookup table | | `agent_role` | string | no | `"main"`, `"explore"`, `"plan"`, etc. 
| | `parent_step` | integer | no | Step index of parent (for sub-agents) | | `call_type` | string | no | `"main"`, `"subagent"`, or `"warmup"` | | `tools_available` | string[] | no | Tools available at this step | | `tool_calls` | ToolCall[] | no | Tool invocations made in the step | | `observations` | Observation[] | no | Tool results linked back by `source_call_id` | | `snippets` | Snippet[] | no | Extracted code blocks | | `token_usage` | TokenUsage | no | Per-step token usage breakdown | | `timestamp` | string | no | ISO8601 timestamp | ### `call_type` Values | Value | Description | |-------|-------------| | `main` | Primary agent step | | `subagent` | Sub-agent invocation | | `warmup` | Cache priming call with no useful output | ## Tool Calls ```json { "tool_calls": [ { "tool_call_id": "tc_001", "tool_name": "bash", "input": { "command": "npm test -- --grep parser" }, "duration_ms": 3400 } ] } ``` Tool calls carry a `tool_call_id`. Observations link back via `source_call_id`. ## Observations ```json { "observations": [ { "source_call_id": "tc_001", "content": "FAIL src/parser.test.ts...", "output_summary": "1 test failed: parser.test.ts line 42 assertion error", "error": null } ] } ``` `output_summary` is a lightweight preview so consumers can assess relevance without downloading full multi-KB outputs. ## Snippets Code blocks extracted from tool results and agent responses: ```json { "snippets": [ { "file_path": "src/parser.ts", "start_line": 42, "end_line": 55, "language": "typescript", "text": "function parseToken(input: string)..." 
} ] } ``` ## Token Usage Per-step token breakdown: ```json { "token_usage": { "input_tokens": 12400, "output_tokens": 890, "cache_read_tokens": 11200, "cache_write_tokens": 1200, "prefix_reuse_tokens": 11200 } } ``` ## Sub-Agent Hierarchy Sub-agent steps use `parent_step` to link back to the invoking step: ```json { "step_index": 5, "role": "agent", "agent_role": "explore", "parent_step": 3, "call_type": "subagent", "content": "Searching for related parser implementations..." } ``` Sub-agent transcripts are linked by `session_id` reference to separate trajectory records, not embedded. --- # Outcome & Attribution ## Outcome The `outcome` object captures the trace-level result and the confidence of the signal that set it. Outcome fields are split by `execution_context`. Devtime agents (code-editing) use `committed` as the primary reward proxy. Runtime agents (action-trajectory / RL) use `terminal_state` and `reward`. **Devtime example:** ```json { "outcome": { "success": true, "signal_source": "deterministic", "signal_confidence": "derived", "description": "Test passes after fix", "patch": "unified diff string", "committed": true, "commit_sha": "def789abc..." } } ``` **Runtime example:** ```json { "outcome": { "terminal_state": "goal_reached", "reward": 1.0, "reward_source": "rl_environment", "signal_confidence": "derived" } } ``` ### Fields | Field | Type | Required | Description | |-------|------|----------|-------------| | `success` | boolean | no | Did the task succeed?
| | `signal_source` | string | no | Current implementation uses `deterministic` | | `signal_confidence` | string | no | `derived`, `inferred`, or `annotated` | | `description` | string | no | Human-readable outcome description | | `patch` | string | no | Unified diff produced by the session | | `committed` | boolean | no | Whether changes were committed to git (devtime) | | `commit_sha` | string | no | The specific commit, if committed (devtime) | | `terminal_state` | string | no | `goal_reached`, `interrupted`, `error`, or `abandoned` (runtime, added 0.2.0) | | `reward` | float | no | Numeric reward signal from an RL environment or evaluator (runtime, added 0.2.0) | | `reward_source` | string | no | Canonical: `rl_environment`, `judge`, `human_annotation`, `orchestrator` (added 0.2.0) | ### Committed as a Quality Signal For devtime agents, a trace that results in a commit is higher-signal than one abandoned or reverted. The commit hash gives a deterministic anchor for replaying the patch and comparing later revisions. For runtime agents, `terminal_state` and `reward` serve the equivalent role — ground truth from the environment. ## Attribution The `attribution` block records which files and line ranges were produced by the agent trace. ```json { "attribution": { "experimental": false, "revision": { "vcs_type": "git", "revision": "def789abc..." }, "unaccounted_files": ["build/generated.ts"], "files": [ { "path": "src/parser.ts", "conversations": [ { "contributor": { "type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514" }, "url": "opentraces://trace/step_2", "ids": { "anthropic": "msg_01xyz" }, "related": [ {"type": "plan", "url": "opentraces://t/plan_3"} ], "ranges": [ { "start_line": 42, "end_line": 55, "content_hash": "murmur3:9f2e8a1b...", "change_type": "modification", "original": { "start_line": 42, "end_line": 54, "content_hash": "murmur3:abc123..." 
}, "contributor": { "type": "human", "id": "alice" } } ] } ] } ] } } ``` ### Attribution fields | Field | Type | Description | |-------|------|-------------| | `experimental` | boolean | `true` when any range is low-confidence or a fallback resolution was used; `false` when every range was produced by the PostToolUse hook or unified diff. | | `revision` | object | Pins this attribution block to a specific commit/revision. `{vcs_type: "git"\|"jj", revision: }`. Added 0.3.0 (RFC #5/#25). | | `unaccounted_files` | array | Files changed at commit time whose hunks were not produced by any tracked Edit/Write tool call. Typically Bash-applied edits (`sed`, codemods). Surfaced at low confidence. Added 0.3.0 (RFC #26). | | `files[]` | array | Per-file attribution, each with a list of `conversations`. | ### AttributionConversation fields | Field | Type | Description | |-------|------|-------------| | `contributor` | object | Default contributor for all ranges under this conversation, e.g. `{type: "ai", model_id: "anthropic/claude-sonnet-4-20250514"}`. | | `url` | string | `opentraces://trace_id/step_N` link back to the producing step. | | `ids` | object | Provider-native conversation identifiers. E.g. `{anthropic: "msg_01xyz", openai: ["resp_1", "resp_2"]}`. Added 0.3.0 (RFC #9). | | `related` | array | Links to broader resources using the RFC #16 baseline vocabulary. Each entry: `{type, url}`. E.g. `{type: "plan", url: "opentraces://t/plan_3"}`. Added 0.3.0. | | `ranges[]` | array | Attributed line ranges. | ### AttributionRange fields | Field | Type | Description | |-------|------|-------------| | `start_line`, `end_line` | int | Inclusive range in the final file. | | `content_hash` | string | `murmur3:<32-hex>` hash for cross-refactor tracking. | | `confidence` | `"high"` \| `"medium"` \| `"low"` | Resolver confidence. | | `change_type` | `"addition"` \| `"modification"` \| `"deletion"` | Nature of the change. Added 0.3.0 (RFC #11).
| | `original` | object | Pre-processing state for divergent ranges, set when a formatter or human rewrote the agent's output after the fact. Keys: `start_line`, `end_line`, `content_hash`. Added 0.3.0 (RFC #5). | | `contributor` | object | Per-range override of the enclosing conversation's contributor. Added 0.3.0. | ### How Attribution Is Constructed Attribution is built deterministically by a three-layer pipeline (plan 041): 1. **PostToolUse hook**: fires after each Edit/Write, reads the file from disk, and records the exact post-edit lines plus a `murmur3:` hash. Highest confidence. 2. **Unified diff**: when no hook event is present, the trace's diff is parsed to recover ranges. Medium confidence. 3. **`str.find` fallback**: last-resort textual match of tool output back to the file. Low confidence, always marked `experimental: true`. These feed a common resolver that emits Agent Trace-compatible `attribution` records and, where possible, pins them to a specific commit via `attribution.revision` and the trace's `git_links`. ## GitLink and the Evidence Tiers A `GitLink` (one entry in `TraceRecord.git_links`) records one commit this trace contributed to, annotated with how strong the evidence is. ```json { "git_links": [ { "vcs_type": "git", "revision": "def789abc...", "repo_url": "https://github.com/org/repo", "branch": "main", "tier": "tool_emitted", "commit_reachable": true, "content_alive": true } ] } ``` | Field | Type | Description | |-------|------|-------------| | `vcs_type` | `"git"` \| `"jj"` | Version control system. Defaults to `"git"`. | | `revision` | string | Commit SHA or jj change id. | | `repo_url` | string | Canonical remote URL. | | `branch` | string | Branch name if known. | | `tier` | enum | Evidence tier (see below). | | `commit_reachable` | bool | Computed lazily on read; `false` if the commit was force-pushed away. | | `content_alive` | bool | Computed lazily on read; `false` if the agent's hashed bytes no longer appear at `HEAD`.
| ### Evidence tiers Consumers filter by tier to build training subsets of the desired signal quality. The four tiers, strongest to weakest: | Tier | Meaning | |------|---------| | `tool_emitted` | Hashes emitted by Edit/Write tool calls appear verbatim in the commit's staged hunks. Gold-standard signal; use for SFT and RL. | | `tool_emitted_with_divergence` | The files line up, but the committed bytes don't hash-match (a formatter, pre-commit hook, or human rewrote the output). Still high value when paired with `AttributionRange.original`. | | `overlapping` | Only file-set and time-window overlap, with no hash match. Safer to treat as weakly linked. | | `orphan` | No viable commit link. Keep the trace, don't claim authorship. | ## Lifecycle: Provisional vs Final `TraceRecord.lifecycle` gates when a trace is safe to treat as revision-anchored: - `"provisional"`: captured at trace end. `git_links` may be empty or speculative. - `"final"`: the `opentraces setup git` post-commit hook has correlated this trace to at least one commit and pinned `attribution.revision`. Promoted exactly once; never downgraded. Dataset consumers that want only revision-anchored traces should filter on `lifecycle == "final"` and then on `git_links[].tier`. ### The Bridge The `attribution` block bridges trajectory (process) and attribution (output): - `conversation.url` links each attributed range back to the step that produced it - `AttributionRange.content_hash` is a short stable hash for tracking attribution across refactors - Traces that produce no code changes have `attribution: null` ### Why Embed, Not Link Embedding keeps the record self-contained. An opentraces record can say "here is the full conversation that produced these lines, including the reasoning, tool calls, and final diff." ## Reserved RL Fields The schema leaves room for: - token ID sequences for RL training - token log probabilities - step-level reward annotations --- # Standards Alignment opentraces sits at the intersection of four public standards.
It adopts what works from each, and bridges the gap between trajectory (process) and attribution (output). ## ATIF / Harbor (v1.6) [github.com/laude-institute/harbor](https://github.com/laude-institute/harbor/blob/main/docs/rfcs/0001-trajectory-format.md) A training trajectory serialization format for agent research. Defines the step-based TAO (Thought-Action-Observation) loop, with fields for token IDs, logprobs, and reward signals designed for RL and SFT pipelines. **Relationship:** opentraces is a superset of ATIF. We adopt the step-based model, role conventions (`system | user | agent`), and field patterns. We add attribution blocks, per-step token breakdowns, environment metadata, dependency tracking, and security metadata. The downstream field mappings live in `packages/opentraces-schema/FIELD-MAPPINGS.md`; the public export workflow is still experimental. ## ADP (Agent Data Protocol) [arxiv.org/abs/2410.10762](https://arxiv.org/abs/2410.10762) An interlingua for normalizing diverse agent trace formats into a common structure for training. Proposes a universal adapter layer so each dataset and each agent only needs one converter, O(D+A), instead of pairwise mappings, O(D*A). **Relationship:** opentraces' adapter-based normalization follows the same pattern. Per-agent parsers are ADP-style adapters outputting the enriched schema. ## Agent Trace (Cursor/community, v0.1.0 RFC) [github.com/cursor/agent-trace](https://github.com/cursor/agent-trace) A code attribution spec (CC BY 4.0) that records which lines of code came from which agent conversation, at file/line granularity. Backed by 10+ sponsors (Cloudflare, Vercel, Google Jules, Cognition). **Relationship:** opentraces embeds Agent Trace attribution blocks directly in the trace record. Agent Trace focuses on _output_ (code attribution), opentraces bridges that with _process_ (trajectory). 
### Agent Trace RFCs adopted (schema 0.3.0) | RFC | Topic | Where it lands | |-----|-------|----------------| | #5 | `original` pre-processing snapshot on divergent ranges | `AttributionRange.original` | | #9 | Provider-native conversation IDs | `AttributionConversation.ids` | | #11 | `change_type` on ranges | `AttributionRange.change_type` | | #16 | Baseline `related` resource vocabulary | `AttributionConversation.related` | | #22 | Canonical `repository_url` | `Task.repository_url` | | #25 | Lifecycle / revision-pinning | `TraceRecord.lifecycle`, `Attribution.revision` | | #26 | `unaccounted_files` for non-tool edits | `Attribution.unaccounted_files` | | #27 | Evidence-graded commit linking | `TraceRecord.git_links[]`, `GitLink.tier` | Adoption is additive — pre-0.3.0 traces validate unchanged. `opentraces export --format agent-trace` emits Agent Trace v0.1.0 JSONL based on these fields. ## OTel GenAI Semantic Conventions [opentelemetry.io/docs/specs/semconv/gen-ai](https://opentelemetry.io/docs/specs/semconv/gen-ai/) OpenTelemetry's GenAI semantic conventions define standardized span attributes for LLM calls in observability pipelines, covering model names, token counts, and request metadata. **Relationship:** opentraces' per-step token usage and model fields align with OTel GenAI conventions, enabling cross-referencing between observability spans and training trajectories. ## The Core Insight Agent Trace preserves _which_ lines came from AI. ATIF/ADP preserve _how_ the agent reasoned. Neither alone tells the complete story. opentraces connects the full conversation trajectory to the specific code output at line granularity. 
## Message Taxonomy

opentraces adopts a training-oriented message taxonomy:

| Role | Description |
|------|-------------|
| `system` | System prompt (deduplicated by hash) |
| `user` | User message / prompt |
| `agent` | Agent response, tool calls, or thinking |

Agent steps are further classified by `call_type` (`main`, `subagent`, `warmup`) and `agent_role` (`main`, `explore`, `plan`).

---

# Schema Versioning

The opentraces schema follows semantic versioning. The version lives in `packages/opentraces-schema/src/opentraces_schema/version.py` as the single source of truth.

## Version Policy

| Change Type | Version Bump | Example |
|-------------|--------------|---------|
| New optional field | Minor | Adding `metrics.p95_latency_ms` |
| New optional model | Minor | Adding a `debugging` block |
| Field rename | Major | Renaming `steps` to `turns` |
| Field removal | Major | Removing `metadata` |
| Type change | Major | Changing `success` from boolean to string |
| Bug fix / docs | Patch | Fixing a validation regex |

## Current Version

```text
0.3.0
```

The `0.x` series means breaking changes may still land between minor versions until `1.0.0`.

### 0.3.0: additive changes

All new fields are optional; pre-0.3.0 traces deserialize unchanged.

- `TraceRecord.lifecycle`: `"provisional"` | `"final"` (RFC #25).
- `TraceRecord.git_links[]`: new `GitLink` model with `vcs_type`, `revision`, `repo_url`, `branch`, `tier`, `commit_reachable`, `content_alive`. Four evidence tiers: `tool_emitted`, `tool_emitted_with_divergence`, `overlapping`, `orphan`.
- `Attribution.revision`: `{vcs_type, revision}` pin (RFC #5/#25).
- `Attribution.unaccounted_files`: files changed at commit time without a tracked Edit/Write source (RFC #26).
- `AttributionRange.change_type`: `"addition"` | `"modification"` | `"deletion"` (RFC #11).
- `AttributionRange.original`: pre-processing snapshot for divergent ranges (RFC #5).
- `AttributionRange.contributor`: per-range contributor override.
- `AttributionConversation.ids`: provider-native conversation IDs (RFC #9).
- `AttributionConversation.related`: baseline related-resource vocabulary (RFC #16).
- `Task.repository_url`: canonical remote URL (RFC #22).
- `TraceRecord.generation_index`: monotonic per-`session_id` generation counter for replacement snapshots (used by `opentraces pull` and supersedes detection).
- `Metrics.total_cache_read_tokens`, `Metrics.total_cache_creation_tokens`: session-level prompt-cache aggregates.
- `AttributionRange.content_hash` format migrated to `murmur3:<32-hex>` (replaces the prior md5-truncated form) for cross-tool line-range matching. The top-level `TraceRecord.content_hash` is unchanged (still SHA-256 hex of the serialized record, used for cross-contributor dedup).

## Version Checks

There is no public migration workflow today. Version checks happen when configs are normalized and when `TraceRecord` JSONL is loaded. A hidden `opentraces migrate` command still exists for diagnostics, but it only reports the current config and schema versions.

## Rationale Documents

Each schema version ships with a rationale document and a changelog entry in the schema package. See [`VERSION-POLICY.md`](https://github.com/JayFarei/opentraces/blob/main/packages/opentraces-schema/VERSION-POLICY.md) for the full versioning policy and [`CHANGELOG.md`](https://github.com/JayFarei/opentraces/blob/main/packages/opentraces-schema/CHANGELOG.md) for the release history.

## Security Pipeline Version

The security pipeline is versioned independently of the schema, under `SECURITY_VERSION` in `src/opentraces/security/version.py`. It is bumped whenever detection logic changes (regex patterns, entropy thresholds, classifier heuristics, anonymization rules).

```text
SECURITY_VERSION = 0.3.0
```

`opentraces doctor` reports the active value alongside the schema version.

## Field Mappings

The repository keeps downstream mapping tables in `packages/opentraces-schema/FIELD-MAPPINGS.md`.
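
As a rough illustration of the Version Policy table earlier in this section, the sketch below maps a set of change types to the largest bump they require. The change-type keys, `required_bump`, and `next_version` are hypothetical names invented for this example and are not part of the schema package. The arithmetic is plain semantic versioning and does not model the looser `0.x` rule noted under Current Version.

```python
from enum import IntEnum


class Bump(IntEnum):
    """Bump levels ordered so that max() picks the most severe one."""
    PATCH = 0
    MINOR = 1
    MAJOR = 2


# Hypothetical keys mirroring the rows of the Version Policy table.
POLICY = {
    "new_optional_field": Bump.MINOR,
    "new_optional_model": Bump.MINOR,
    "field_rename": Bump.MAJOR,
    "field_removal": Bump.MAJOR,
    "type_change": Bump.MAJOR,
    "bug_fix_or_docs": Bump.PATCH,
}


def required_bump(changes):
    """Return the largest bump demanded by any change in the list."""
    return max(POLICY[change] for change in changes)


def next_version(current, changes):
    """Apply standard semver arithmetic to a 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in current.split("."))
    bump = required_bump(changes)
    if bump is Bump.MAJOR:
        return f"{major + 1}.0.0"
    if bump is Bump.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, a release that only adds optional fields would be classified as a minor bump, while mixing in a field rename would make it major.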
---

# Security Tiers

opentraces applies layered security scanning before traces are staged or pushed. The current pipeline version is `SECURITY_VERSION = 0.3.0`.

Tip: run `opentraces doctor --security` to see the exact tiers, versions, and commands active in your current install.

## Current User-Facing Tiers

The current 0.3 CLI surfaces these layers:

| Tier | Name | Status | What it does |
|------|------|--------|--------------|
| 1a | Regex patterns | always on | Built-in secret detectors for known token and key formats |
| 1b | Shannon entropy | always on | Flags high-entropy strings that look like secrets |
| 1.5 | TruffleHog | optional | Runs TruffleHog locally for broader secret detection |
| 2 | LLM trace review | optional, on demand | Semantic review over the whole trace transcript |
| 3 | Human review | always available | Web inbox, TUI, and CLI review before upload |

## Tier 1a And 1b

Regex and entropy scanning are always on. They run locally during processing and rewrite sensitive content before traces surface in the inbox.

## Tier 1.5: TruffleHog

Enable Tier 1.5 with:

```bash
opentraces setup trufflehog
opentraces setup trufflehog --enable
opentraces setup trufflehog --disable
```

Current behavior:

- TruffleHog is opt-in
- it runs locally with `verify_secrets = false`
- findings are redacted in place
- findings force human review before upload
- `opentraces push --no-trufflehog` skips it for one push only

Use `opentraces doctor --security` to confirm whether the binary is installed and enabled.

## Tier 2: LLM Trace Review

Tier 2 sends each staged trace to a third-party LLM for a semantic review on top of the regex, entropy, and TruffleHog tiers. It is opt-in, runs out-of-band, and is stored in the global config under `security.llm_review` (one config per machine, projects inherit it).
### Configure The Reviewer

Run the interactive wizard once:

```bash
opentraces setup llm-review
```

The picker offers these presets out of the box:

| Preset | API format | Default endpoint | API key env |
|--------|------------|------------------|-------------|
| `ollama` | openai-compat | `http://localhost:11434/v1` | (none, local) |
| `lm-studio` | openai-compat | `http://localhost:1234/v1` | (none, local) |
| `llama-cpp` | openai-compat | `http://localhost:8080/v1` | (none, local) |
| `vllm` | openai-compat | `http://localhost:8000/v1` | (none, local) |
| `openai` | openai-compat | `https://api.openai.com/v1` | `OPENAI_API_KEY` |
| `groq` | openai-compat | `https://api.groq.com/openai/v1` | `GROQ_API_KEY` |
| `openrouter` | openai-compat | `https://openrouter.ai/api/v1` | `OPENROUTER_API_KEY` |
| `together` | openai-compat | `https://api.together.xyz/v1` | `TOGETHER_API_KEY` |
| `anthropic-direct` | anthropic | (native SDK) | `ANTHROPIC_API_KEY` |
| `custom` | any | your URL | your env var |

Local Ollama models can be pulled inline from the wizard when the `ollama` binary is on PATH.
### Non-Interactive Setup

Skip the picker by passing flags directly, useful in agent setups and CI:

```bash
opentraces setup llm-review \
  --api-format openai-compat \
  --base-url http://localhost:11434/v1 \
  --model gemma3n:e4b

opentraces setup llm-review \
  --api-format openai-compat \
  --base-url https://api.groq.com/openai/v1 \
  --model llama-3.3-70b-versatile \
  --api-key-env GROQ_API_KEY

opentraces setup llm-review \
  --api-format anthropic \
  --model claude-haiku-4-5-20251001 \
  --api-key-env ANTHROPIC_API_KEY
```

Full flag set:

| Flag | Purpose |
|------|---------|
| `--api-format {openai-compat,ollama,anthropic,fake}` | Wire protocol the client speaks |
| `--base-url ` | Base URL (include `/v1` for openai-compat servers; ignored for `anthropic`) |
| `--model ` | Model name or tag |
| `--api-key-env ` | Env var holding the API key; empty for local servers |
| `--timeout ` | Request timeout, defaults to `120` |
| `--enable` | Turn llm-review on using current config |
| `--disable` | Turn llm-review off without changing other fields |
| `--test` | Ping the endpoint without writing config, exits non-zero on failure |
| `--print` | Print the effective config as JSON and exit |
| `--no-interactive` | Skip the preset picker when no flags are given |
| `--project` | Scope this change to the project marker instead of global config |

### Run Tier 2 On Demand

```bash
opentraces llm-review                           # every staged trace
opentraces llm-review --scope staged            # second line of defence before push
opentraces llm-review --scope inbox             # pre-add only
opentraces llm-review --trace 8a3f1c            # one trace (short id ok; repeatable)
opentraces llm-review --limit 5                 # cap the batch
opentraces llm-review --force                   # re-review cached verdicts
opentraces llm-review --dry-run                 # estimate sessions, chars, tokens, cost
opentraces llm-review --context-file AGENTS.md  # project context (README/AGENTS.md up to 10KB)
```

Per-run overrides match the setup flags so you can try a different backend without touching config:
```bash
opentraces llm-review \
  --api-format openai-compat \
  --base-url https://api.groq.com/openai/v1 \
  --api-key-env GROQ_API_KEY \
  --model llama-3.3-70b-versatile
```

Verdicts are cached into `metadata.llm_review` on each trace so downstream gates and the TUI can see them. A bad verdict also blocks the trace in state so later push flows skip it.

### Gate The Push

```bash
opentraces push --llm-review
```

This fails upload unless every staged trace has a clean Tier 2 verdict. Typical flow before a public dataset push: `opentraces llm-review --scope staged` followed by `opentraces push --llm-review`.

### Doctor Output

`opentraces doctor` (and the focused `opentraces doctor --security` subview) surfaces everything configured for Tier 2:

- state: `disabled`, `on-demand`, or `unreachable`
- backend and model, inferred from the endpoint (for example `ollama / gemma3n:e4b`, `groq / llama-3.3-70b-versatile`)
- `endpoint` URL and `api` format
- `api key env` var name and whether it is currently `set` or `unset`
- probe status, including model count at the endpoint and whether your configured model is in the list; flagged as `not found` when the endpoint answers but does not expose the model, `not installed` when the binary is missing, or `not set` when a required API key env var is empty
- toggle hints for `run`, `gate push`, `reconfigure`, and `disable`

Use `doctor` to confirm the tier is healthy before relying on `push --llm-review` as an upload gate.

## Tier 3: Human Review

Human review is always available through:

```bash
opentraces web
opentraces tui
opentraces list --stage inbox
opentraces show
opentraces redact
```

This is the final check for project-specific context, sensitive business details, and traces that are technically safe but not worth publishing.
## Review Policy

Each repo carries a review policy in `.opentraces.json`:

```bash
opentraces setup review-policy --review
opentraces setup review-policy --auto
```

| Policy | Effect |
|--------|--------|
| `review` | Every trace lands in Inbox for manual review |
| `auto` | Safe traces are auto-approved into `staged` |

`auto` does not push automatically. Upload remains explicit.

## What Can Still Block

The user-facing pipeline is designed to redact and route most issues into review, but some failures still stop upload:

- parse errors
- missing required integrations you explicitly enabled
- `push --llm-review` when staged traces lack a clean Tier 2 verdict

Use `opentraces doctor` for pipeline failures and `opentraces list --stage blocked` for traces that still need intervention.

---

# Security Configuration

Security settings now live in two places:

- global machine-local config: `~/.opentraces/config.json`
- per-repo portable marker: `/.opentraces.json`

Machine-local traces and runtime state live separately under `~/.opentraces/projects//`.
## Global Config

Inspect it with:

```bash
opentraces config show
opentraces --json config show
```

Common global keys include:

- `excluded_projects`
- `custom_redact_strings`
- `classifier_sensitivity`
- `dataset_visibility`
- `security.trufflehog.*`
- `security.llm_review.*`

Examples:

```bash
opentraces config set classifier_sensitivity high
opentraces config set custom_redact_strings ACME_INTERNAL_TOKEN --append
opentraces config set excluded_projects /path/to/client-repo --append
```

## Project Marker

The repo-local `.opentraces.json` carries portable policy:

```json
{
  "marker_version": "2",
  "project_id": "...",
  "review_policy": "review",
  "push_policy": "manual",
  "remotes": {
    "origin": {
      "url": "owner/opentraces",
      "visibility": "private"
    }
  },
  "active_remote": "origin",
  "default_visibility": "private",
  "agents": ["claude-code"]
}
```

Depending on the repo, it may also carry fields like `root_commit_sha` and `first_run_backfill_decision`.

Write project-scoped values with:

```bash
opentraces config set review_policy auto --project
opentraces config set default_visibility private --project
```

## Preferred Setup Commands

For the security integrations themselves, prefer the dedicated setup commands over raw `config set`:

```bash
opentraces setup trufflehog
opentraces setup llm-review
opentraces setup review-policy --review
```

These commands validate the environment and keep the config shape correct.
## Exclusions

Exclude entire repos from collection:

```bash
opentraces config set excluded_projects /path/to/private-repo --append
```

## Custom Redaction Strings

Add strings that should always be scrubbed:

```bash
opentraces config set custom_redact_strings corp-api-prefix- --append
opentraces config set custom_redact_strings INTERNAL_BILLING_TOKEN --append
```

## TruffleHog Settings

Tier 1.5 is stored under `security.trufflehog` in the global config:

```json
{
  "security": {
    "trufflehog": {
      "enabled": true,
      "verify_secrets": false
    }
  }
}
```

`verify_secrets` stays off by default so the scanner does not make outbound verification calls.

## LLM Review Settings

Tier 2 review is stored under `security.llm_review`:

```json
{
  "security": {
    "llm_review": {
      "enabled": true,
      "api_format": "openai-compat",
      "base_url": "http://localhost:11434/v1",
      "model": "gemma4:latest",
      "api_key_env": "",
      "timeout": 120.0,
      "prompt_version": "1"
    }
  }
}
```

The reviewer config is machine-local and shared across projects unless you explicitly scope setup to a project.

---

# Scanning & Redaction

The security pipeline is context-aware and runs in two passes:

1. Scan the trace record field-by-field using the field type to decide whether entropy analysis is enabled.
2. Scan the final serialized JSONL bytes to catch anything introduced during enrichment or serialization.
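
The two passes above can be sketched roughly as follows. This is a minimal illustration and not the opentraces implementation: the `sk-` regex, the token pattern, the entropy threshold, and the flat field names in `FIELD_USES_ENTROPY` are all assumptions made for the example, and the second pass is regex-only here purely to keep the sketch small.

```python
import json
import math
import re
from collections import Counter

# Assumed detectors; the real pattern set and threshold are not documented here.
SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{16,}")
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_-]{24,}")
ENTROPY_THRESHOLD = 4.0

# Hypothetical flat fields: True means regex plus entropy, False means regex only.
FIELD_USES_ENTROPY = {"content": True, "reasoning_content": False}


def shannon_entropy(text):
    """Bits per character of the empirical character distribution."""
    total = len(text)
    counts = Counter(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def find_secrets(text, use_entropy):
    """Collect regex hits, plus high-entropy tokens when the context allows it."""
    findings = [m.group(0) for m in SECRET_RE.finditer(text)]
    if use_entropy:
        for m in TOKEN_RE.finditer(text):
            token = m.group(0)
            if token not in findings and shannon_entropy(token) >= ENTROPY_THRESHOLD:
                findings.append(token)
    return findings


def redact(text, findings):
    for secret in findings:
        text = text.replace(secret, "[REDACTED]")
    return text


def scan_record(record):
    """Pass 1 scans known fields by context; pass 2 scans the serialized line."""
    cleaned = dict(record)
    for field, use_entropy in FIELD_USES_ENTROPY.items():
        if field in cleaned:
            cleaned[field] = redact(cleaned[field], find_secrets(cleaned[field], use_entropy))
    line = json.dumps(cleaned)
    return redact(line, find_secrets(line, use_entropy=False))
```

The second pass is what catches a secret sitting in a field the first pass does not know about, which is why redaction does not depend on field shape alone.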
## What Gets Scanned

| Field | Context | Notes |
|-------|---------|-------|
| `system_prompts` | General | Full scan |
| `task.description` | General | Full scan |
| `steps[].content` | General | Full scan |
| `steps[].reasoning_content` | Reasoning | Regex only, no entropy |
| `steps[].tool_calls[].input` | Tool input | Full scan for input-like tools, regex-only for result-like tools |
| `steps[].observations[].content` | Tool result | Regex only, no entropy |
| `steps[].observations[].output_summary` | Tool result | Regex only, no entropy |
| `steps[].observations[].error` | Tool result | Regex only, no entropy |
| `steps[].snippets[].text` | General | Full scan |
| `outcome.patch` | General | Full scan |
| `environment.vcs.diff` | General | Full scan, truncated before storage when the repo diff is very large |

The scanner also applies a second pass over the serialized JSONL output so redaction does not depend on field shape alone.

## What Gets Redacted

Detected secrets and path fragments are replaced with `[REDACTED]` or hashed path segments, depending on the detector:

```text
Before: export OPENAI_API_KEY=sk-abc123...
After:  export OPENAI_API_KEY=[REDACTED]
```

```text
Before: /Users/jay/src/project/...
After:  /Users/[REDACTED]/src/project/...
```

The staged JSONL is rewritten in place. Raw Claude Code session files on disk are not modified.

> **Note:** Detected secrets can also be replaced with named placeholders such as `[API_KEY_1]`, `[EMAIL_2]`, `[PERSON_3]` when the EntityMap is in use. `USER_PATH` entities normalize paths in place (for example replacing only the username segment).

See [Security Tiers](/docs/security/tiers) for the full tier model and the optional Tier 1.5 / 1.8 / 2 layers.

## Heuristic Classifier

A heuristic classifier runs on top of scanning and redaction.
It flags:

- internal hostnames
- AWS account IDs in ARNs
- database connection strings
- internal collaboration URLs
- dense UUID / hash sequences
- deep file paths that may reveal internal structure

## Custom Redaction

```bash
opentraces config set custom_redact_strings INTERNAL_API_KEY --append
opentraces config set custom_redact_strings corp-secret-prefix- --append
```

Custom redaction strings are treated as literal matches wherever they appear in trace content. `--append` adds to the existing list instead of replacing it.

---

# Agent Setup

opentraces is designed to be usable from both a shell and another coding agent.

## Project-Local Setup

Inside a repo, the normal path is:

```bash
opentraces auth login
opentraces init --agent claude-code --review-policy review
```

`init` writes `.opentraces.json`, registers the repo in the global config, installs the capture hook unless you opt out, and installs the bundled opentraces skill into the project.

## Claude Code

Claude Code is the current live-capture adapter.

For a full setup:

```bash
opentraces setup claude-code
opentraces setup git
opentraces setup skill
```

What each integration does:

- `setup claude-code` installs the `Stop` and `PostCompact` hooks in `~/.claude/settings.json`
- `setup git` installs the post-commit correlator that powers `opentraces blame`
- `setup skill` installs the vendor-neutral skill under `~/.agents/skills/opentraces/` and links it into supported harnesses

## Machine-Readable Agent Flows

Agents should prefer `--json` when they need structured output:

```bash
opentraces --json status
opentraces --json list --stage inbox
opentraces --json show
opentraces --json config show
```

That avoids scraping human-oriented terminal layouts.
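
For an agent written in Python, a thin wrapper around the `--json` commands above might look like the sketch below. The helper name `opentraces_json` and its `runner` parameter are hypothetical, and the sketch assumes each command prints a single JSON document on stdout; the shape of that document is not specified in this section, so the caller should treat it as opaque.

```python
import json
import subprocess


def opentraces_json(*args, runner=subprocess.run):
    """Run `opentraces --json <args>` and return the parsed stdout.

    Assumes stdout is one JSON document. `runner` is injectable so the
    helper can be exercised without the CLI installed.
    """
    proc = runner(
        ["opentraces", "--json", *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return json.loads(proc.stdout)


# Example calls, mirroring the shell commands above:
#   opentraces_json("status")
#   opentraces_json("list", "--stage", "inbox")
```

Keeping the subprocess call in one place means an agent gets one failure path (an exception) instead of having to inspect terminal text.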
## Review And Push By Agent

A coding agent can drive the normal human workflow:

```bash
opentraces web
opentraces add --all
opentraces push
```

Or the stricter security path:

```bash
opentraces llm-review --scope staged
opentraces push --llm-review
```

## Dataset Import

Hermes support is currently an import path, not a live-capture harness:

```bash
opentraces pull owner/dataset --parser hermes
```

---

# CI/CD & Automation

Use the same explicit workflow in automation that you use locally: initialize, inspect, stage, push.

## Authentication

`HF_TOKEN` is the preferred CI path:

```bash
export HF_TOKEN=hf_...
```

You do not need to run `opentraces auth login` when `HF_TOKEN` is already set in the environment.

## Recommended Pattern

For headless runs:

```bash
opentraces init --review-policy review --remote my-org/opentraces --no-hook
opentraces list
opentraces add --all
opentraces push --private --yes
```

If the runner needs to import traces from another dataset first:

```bash
opentraces pull owner/dataset --parser hermes --auto
opentraces push --private --yes
```

## Health Checks

Run these before a gated push:

```bash
opentraces doctor
opentraces doctor --security
```

If you rely on optional integrations, configure them explicitly in automation:

```bash
opentraces setup trufflehog --enable
opentraces setup llm-review --enable
```

Those commands assume the required binary or endpoint is already available.
## GitHub Actions Example

```yaml
- name: Install opentraces
  run: pip install opentraces

- name: Initialize project
  env:
    HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: opentraces init --review-policy review --remote my-org/opentraces --no-hook

- name: Stage traces
  env:
    HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: opentraces add --all

- name: Push traces
  env:
    HF_TOKEN: ${{ secrets.HF_TOKEN }}
  run: opentraces push --private --yes
```

## Notes

- Use `--private` for proprietary codebases
- Use `--repo owner/dataset` or `--remote ...` for shared team datasets
- Use `push --llm-review` only if llm-review is already configured and reachable on the runner

---

# Post-processor contract

opentraces can pipe any trace through an ordered chain of external commands pre-upload (during `opentraces push`). A post-processor is any executable on `PATH` (or an absolute path) that speaks this small contract.

## Protocol

Per invocation:

- **stdin**: one complete trace as JSON, matching the current `TraceRecord` schema (`packages/opentraces-schema`).
- **stdout**: one trace as JSON, same shape. This replaces the in-flight trace for the next step in the chain.
- **exit 0**: success.
- **non-zero exit**: failure. By default non-fatal (logged, chain continues with the pre-invocation trace). Under `--strict`, promoted to a hard error that halts the pipeline.
- **byte-identical stdout**: explicit no-op. Recorded as `status=noop`, never as a failure.

Environment variables and argv are passed through from the processor's config entry. `opentraces doctor` probes every configured processor and reports whether the binary resolves.
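
To make the protocol concrete, here is a minimal sketch of a driver that invokes one processor under these rules. It is an illustration written for this page, not the code in `src/opentraces/core/processors.py`: the function name `invoke_processor`, the `"ok"` and `"failed"` status labels, and the use of plain JSON parsing in place of `TraceRecord` validation are assumptions. Only `noop` and `invalid_output` are status names taken from this document.

```python
import json
import os
import subprocess


def invoke_processor(trace, command, args=(), env=None, strict=False):
    """Pipe one trace through one processor; return (trace, status)."""
    payload = json.dumps(trace)
    proc = subprocess.run(
        [command, *args],
        input=payload,
        capture_output=True,
        text=True,
        env={**os.environ, **(env or {})},  # extra vars layered on the inherited env
    )

    # Non-zero exit: non-fatal by default, hard error under strict.
    if proc.returncode != 0:
        if strict:
            raise RuntimeError(f"{command} exited with {proc.returncode}")
        return trace, "failed"

    # Byte-identical stdout is an explicit no-op, never a failure.
    if proc.stdout == payload:
        return trace, "noop"

    # Stand-in for schema validation: the output must at least be a JSON object.
    try:
        result = json.loads(proc.stdout)
        if not isinstance(result, dict):
            raise ValueError("processor output is not a JSON object")
    except ValueError:
        if strict:
            raise
        return trace, "invalid_output"

    return result, "ok"
```

A chain would call this once per configured processor, feeding each returned trace into the next call, so a failing step leaves the pre-invocation trace in flight.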
## Configuration

Declared as an ordered list under a project's `.opentraces.json`:

```json
{
  "post_processors": [
    {
      "name": "my-tagger",
      "command": "/usr/local/bin/my-tagger",
      "args": ["--some-flag"],
      "env": {"LOG_LEVEL": "debug"}
    }
  ]
}
```

Fields:

| Field | Type | Default | Meaning |
|-------|------|---------|---------|
| `name` | string | (required) | Human label; shown in `doctor` + logs. |
| `command` | string | (required) | Executable on `PATH` or absolute path. |
| `args` | `list[str]` | `[]` | argv passed through. |
| `env` | `dict[str, str]` | `{}` | Extra env vars layered on the inherited env. |

## Invariants

- **Redaction ordering**: processors always see a post-redaction trace. `opentraces.pipeline` runs the security scrubber first; the ordering is covered by a test that fails if anyone reorders the pipeline.
- **Schema validation**: stdout is parsed as `TraceRecord`. Invalid output is rejected; under `--strict` it raises, otherwise it's recorded as `status=invalid_output` and the pre-invocation trace is preserved.

## Minimal example (Python)

```python
#!/usr/bin/env python3
"""Annotate every trace with a static tag."""
import json
import sys

# Read one trace from stdin, tag it, and write it back to stdout.
trace = json.loads(sys.stdin.read())
trace.setdefault("metadata", {})["processed_by"] = "my-tool"
sys.stdout.write(json.dumps(trace))
```

Make it executable (`chmod +x`), put it on `PATH`, and list it under `post_processors` in your project config.

## Reference implementation

`opentraces.core.processors.run_processor` / `run_chain` (`src/opentraces/core/processors.py`) drive every invocation. See `tests/test_processors.py` for the complete matrix of happy-path and failure-mode tests built against stub binaries.
---

# Development

## Setup

```bash
git clone https://github.com/JayFarei/opentraces
cd opentraces
python3 -m venv .venv
source .venv/bin/activate
pip install -e packages/opentraces-schema
pip install -e ".[dev]"
```

## Optional Dependencies

```bash
pip install -e ".[web,tui]"    # Web and TUI inbox clients
pip install -e ".[release]"    # Build and publish tools (build, twine)
```

## Running Tests

```bash
pytest tests/ -v
```

Some tests require real Claude Code trace data and are skipped by default. To run them, set the env var pointing to your project's Claude Code sessions directory:

```bash
export OPENTRACES_TEST_PROJECT_DIR=~/.claude/projects/
pytest tests/ -v
```

The repository also has frontend test suites under `web/viewer/` and buildable docs under `web/site/`.

## Layout

Core directories:

- `packages/opentraces-schema/` - standalone schema package
- `packages/opentraces-ui/` - shared UI package and design system
- `src/opentraces/cli/` - Click command surface
- `src/opentraces/core/` - config, paths, workflow, state, review, inbox, publish flow
- `src/opentraces/capture/` - live parsers, importers, and hook installers
- `src/opentraces/publish/` - serializers and Hugging Face publishing
- `src/opentraces/enrichment/` - git signals, attribution, dependencies, metrics
- `src/opentraces/quality/` - scoring and upload gates
- `src/opentraces/security/` - redaction and scanning pipeline
- `src/opentraces/clients/` - TUI and web backend clients
- `web/viewer/` - React trace review UI
- `web/site/` - Next.js docs and marketing site
- `tests/` - Python test suite

The CLI version lives in `src/opentraces/__init__.py`. The schema version lives in `packages/opentraces-schema/src/opentraces_schema/version.py`.

## Adding A Parser

1. Create a capture adapter under `src/opentraces/capture/`
2. Implement `SessionParser` or `FormatImporter` from `src/opentraces/capture/_base.py`
3. Register it in `src/opentraces/capture/__init__.py`
4. Add tests under `tests/`

## Notes

- The current live-capture adapter is Claude Code
- Hermes currently ships as an import path via `opentraces pull --parser hermes`
- The public inbox workflow is `web/tui/list/show -> add/reject/redact -> push`

---

# Schema Changes

The opentraces schema is open source. Feedback, questions, and proposals are welcome via [GitHub Issues](https://github.com/JayFarei/opentraces/issues).

## How to Propose a Change

When suggesting a schema change, include:

1. **What** field or model you would add, change, or remove
2. **Why** it matters for your use case (training, analytics, attribution, etc.)
3. **How** it relates to existing standards (ATIF, Agent Trace, ADP, OTel) if applicable

## What Counts as Breaking

| Change | Version Bump |
|--------|--------------|
| New optional field | Minor |
| New optional model | Minor |
| Field rename | Major |
| Field removal | Major |
| Type change | Major |

See [Versioning](/docs/schema/versioning) for full policy.

## Adapter Contributions

To add support for a new live-capture agent, implement the `SessionParser` protocol in `src/opentraces/capture/_base.py` and register it in `src/opentraces/capture/__init__.py`. For dataset or file imports, implement `FormatImporter` instead.

## Review Process

- Schema changes are reviewed by the maintainers
- Breaking changes require a new rationale document
- All changes are documented in the [CHANGELOG](https://github.com/JayFarei/opentraces/blob/main/packages/opentraces-schema/CHANGELOG.md)