Dataset Rows

A dataset is a local HuggingFace-shaped row store produced by a workflow. It is not the raw trace bucket.

Per the Seal Family contract (ADR-0008), a dataset is one of exactly two kinds of seal: a growing, reviewed seal that is appended to under review gates, with every row carrying the full contract triple back to its source. (The other is a capsule: an immutable, URL-addressed seal of one scope.) A PR body, a standup, or a dashboard built from dataset rows is a rendering of the projection, not a seal.

Create

opentraces workflow create my-workflow --template default
opentraces dataset new my-dataset --workflow ./workflows/my-workflow/

Skill-episode datasets can start from observed skill usage without writing a custom workflow:

opentraces trace query --skill opentraces --json
opentraces dataset new opentraces-episodes --from-skill opentraces

Ad-hoc row seeding is available when you already have JSONL:

opentraces dataset new my-import --rows-file rows.jsonl --schema schema.json

Run

opentraces dataset run my-dataset --dry-run --limit 5 --json
opentraces dataset run my-dataset
opentraces dataset run opentraces-episodes --executor script --json
opentraces dataset run my-dataset --scope trace --trace <trace-id>
opentraces dataset run my-dataset --since-last-run
opentraces dataset run my-dataset --answers answers.json --json

dataset run invokes the workflow and appends rows locally. It can read from Trace Index candidates, a project scope, the current working directory, a specific trace, or the saved skill query from dataset new --from-skill. The script executor runs the workflow package's deterministic scripts/build_rows.py without a live agent.

Row Provenance: The Contract Triple

Per ADR-0008, a projection is explainable iff it is a pure function of content-addressed inputs: projection = f(scope_ref, workflow@digest, bucket_state@digest, answers). Every appended row records this as its provenance (.opentraces/row_provenance.jsonl inside the dataset directory, schema opentraces.dataset.row_provenance.v2):

{
  "ref": "<trace>:120-148",
  "workflow": {"skill": "skill-command-trajectory-eval-v1", "digest": "sha256:…"},
  "bucket": {"manifest_digest": "sha256:…", "snapshot_digest": "sha256:…"},
  "answers": {"digest": "sha256:…", "recorded": {}},
  "contract_triple": {
    "workflow_digest": "sha256:…",
    "bucket_digest": "sha256:…",
    "answers_digest": "sha256:…"
  },
  "reconstructable": true,
  "isolation": {"sandbox_tier": "none"}
}

reconstructable is an honesty label on the run that produced the row: true for the script executor (a recorded, re-runnable transform) and dry-run, false for current-agent (raw agent emission is not a recorded input, so it is never claimed reconstructable). contract_triple is the convenience roll-up of the three content digests that pin the row; answers carries the fourth input, recorded judgment answers from a --answers-driven run (see Dataset Workflows: the judgment handshake), digested even when empty so the triple is never partially populated.
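As a minimal sketch of how the roll-up relates to the individual digests, the snippet below assumes each digest is a SHA-256 over a canonical JSON encoding (sorted keys, compact separators). That canonicalization, the helper names digest_json and contract_triple, and the placeholder digest strings are illustrative assumptions, not the opentraces implementation.

```python
import hashlib
import json


def digest_json(value) -> str:
    """Hash a JSON value over an assumed canonical encoding (sorted keys, compact separators)."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def contract_triple(workflow_digest: str, bucket_digest: str, answers: dict) -> dict:
    """Roll up the three content digests that pin a row; empty answers are digested too."""
    return {
        "workflow_digest": workflow_digest,
        "bucket_digest": bucket_digest,
        "answers_digest": digest_json(answers),
    }


# Placeholder digests for illustration only. An empty answers object still
# produces a concrete digest, so the triple is never partially populated.
triple = contract_triple("sha256:aaa", "sha256:bbb", {})
```

Because the encoding sorts keys, two answer sets with the same content but different key order produce the same answers_digest under this assumption.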

Verify

opentraces dataset verify my-dataset --json

dataset verify re-executes the bound workflow against the bucket (with the recorded answers, from the same run packet the contract triple points at) in a side-effect-free mode, projects the re-run through the same sanitize/validate/dedup/canonical transform an append would apply, and byte-compares the result against the stored data/train.jsonl rows. It classifies the outcome into exactly three honest verdicts:

Verdict	Meaning	Exit code
`reproduces`	The re-run rows are byte-identical to the stored rows	0
`bucket-advanced`	Stored rows are a strict subset of the re-run and the bucket watermark moved past the recorded one (an explained delta)	0
`integrity-failure`	A stored row was hand-mutated (its bytes no longer hash to the recorded `payload_hash`), or the rows differ with no watermark explanation	7
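The three verdicts can be read as a small decision procedure. The sketch below is one possible reading of the table, not the actual verifier: it assumes payload_hash is "sha256:" plus the hex digest of the row bytes, that watermarks are comparable integers, and that "strict subset" can be checked on sets of row bytes. The function name classify and its parameters are hypothetical.

```python
from __future__ import annotations

import hashlib


def classify(stored: list[bytes], rerun: list[bytes], payload_hashes: list[str],
             recorded_watermark: int, current_watermark: int) -> str:
    """Map a stored-vs-rerun comparison onto one of the three verdicts."""
    # A stored row whose bytes no longer hash to its recorded payload_hash was hand-mutated.
    for row, expected in zip(stored, payload_hashes):
        if "sha256:" + hashlib.sha256(row).hexdigest() != expected:
            return "integrity-failure"
    # Byte-identical rows reproduce.
    if stored == rerun:
        return "reproduces"
    # Extra rows are explained only if the bucket watermark moved past the recorded one.
    if set(stored) < set(rerun) and current_watermark > recorded_watermark:
        return "bucket-advanced"
    # Any other difference has no watermark explanation.
    return "integrity-failure"
```

The mutation check runs first so that a tampered row is never reported as an explained delta.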

--json emits the frozen opentraces.dataset.verify.v1 envelope (verdict, stored_row_count, reproduced_row_count, byte_identical, delta, mutated_rows, recorded_watermark, current_watermark, detail). The command never appends rows or advances a cursor/watermark; it is a pure read that enforces the integrity invariant on the transform (digested bytes == installed bytes == executed bytes).
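For scripted checks, the envelope can be reduced to an exit code and a one-line summary. This is a hedged sketch: the field names and the verdict-to-exit-code mapping come from the description above, but the value types (integer counts) and the helper name summarize are assumptions.

```python
from __future__ import annotations

import json

# Verdict to exit code, as listed in the verdict table.
EXIT_CODES = {"reproduces": 0, "bucket-advanced": 0, "integrity-failure": 7}


def summarize(envelope_text: str) -> tuple[int, str]:
    """Parse a verify envelope and return (exit_code, human-readable summary)."""
    envelope = json.loads(envelope_text)
    verdict = envelope["verdict"]
    line = (f"{verdict}: {envelope['stored_row_count']} stored, "
            f"{envelope['reproduced_row_count']} reproduced")
    return EXIT_CODES[verdict], line
```

The input would be the stdout of opentraces dataset verify my-dataset --json; an unknown verdict raises KeyError rather than being silently treated as a pass.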

Dataset Security Policy

Each dataset carries its own resolved security policy in the manifest (DatasetManifest.security). It is seeded from the source workflow's front-matter security: contract at dataset new --workflow <path> and pinned to that workflow's digest (source_workflow_digest). The resolved enabled_tools start as the contract's required tools plus its default_enabled_tools, in canonical registry order.

The reader's security floor is non-overridable. Every dataset row is sanitized by at least regex, entropy, business_logic, and path_anonymizer (the DATASET_ROW_FLOOR), regardless of what the source workflow's contract declares. A third-party workflow contract can only add tools to this floor, never narrow below it; the floor runs even when a contract enables a smaller set and even at privacy_tier: off (which therefore no longer ships rows verbatim). Because the floor is unioned in as the last resolution step, no policy flag, including an authored allow_disable_required combined with --unsafe-override, can drop a floor tool from what actually runs (an override on a floor tool is recorded but does not weaken sanitization). Each row records its author-declared requested_tools separately from the floor-resolved effective_tools, and dataset status --json surfaces security.reader_floor and security.floor_satisfied so the guarantee is inspectable. This enforces the promise that the redaction rules that run are the consumer's, not the author's.
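The "floor is unioned in last" rule can be illustrated with a short sketch. The function names, the disabled_overrides parameter, and the output ordering are assumptions made for illustration (the real resolver uses canonical registry order, which is not reproduced here); only the four floor tool names come from the text above.

```python
DATASET_ROW_FLOOR = ("regex", "entropy", "business_logic", "path_anonymizer")


def resolve_effective_tools(requested, disabled_overrides=()):
    """Apply author choices first, then union the floor in as the final step."""
    dropped = set(disabled_overrides)
    effective = [tool for tool in requested if tool not in dropped]
    # Last step: every floor tool is restored, even one an override tried to drop.
    for tool in DATASET_ROW_FLOOR:
        if tool not in effective:
            effective.append(tool)
    return effective


def floor_satisfied(effective) -> bool:
    """True when every floor tool is present in the effective set."""
    return set(DATASET_ROW_FLOOR) <= set(effective)
```

Because the union happens after overrides are applied, floor_satisfied holds for any input in this sketch, which mirrors the guarantee that an override on a floor tool is recorded but does not change what runs.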

The policy is per-dataset, not a global config toggle. Toggling a tool on one dataset never affects another dataset or the bucket egress policy.

opentraces dataset security my-dataset
opentraces dataset security my-dataset --json

--json emits the resolved policy under a security block: source, source_workflow_digest, required_tools, optional_tools, enabled_tools, disallowed_tools, overrides, scope (always dataset), required_satisfied, and missing_required_tools.

Toggle an optional tool on a single dataset:

opentraces dataset security my-dataset --tool business_logic --enable
opentraces dataset security my-dataset --tool path_anonymizer --disable

--tool is repeatable and requires --enable xor --disable. Only optional tools can be toggled this way. A required tool can be disabled only when the workflow contract sets allow_disable_required: true and you pass --unsafe-override (optionally with --reason "<text>"); the opt-out is recorded in the manifest as an override. If the contract forbids it, the command exits 2.

opentraces dataset security my-dataset --tool regex --disable --unsafe-override --reason "rows are synthetic fixtures"
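The toggle rules above amount to a small decision table. The sketch below encodes one reading of them; the function name decide_toggle, its return labels, and the treatment of every non-required tool as optional are assumptions, and disallowed_tools are ignored. Only the contract-forbids case has a documented exit code (2); the sketch does not guess at exit codes for other refusals.

```python
def decide_toggle(tool, action, required_tools,
                  allow_disable_required=False, unsafe_override=False):
    """Decide what a single --tool toggle should do under the rules described above."""
    # --enable xor --disable: exactly one action must be given.
    if action not in ("enable", "disable"):
        raise ValueError("pass exactly one of --enable or --disable")
    # Enabling, or disabling a non-required (assumed optional) tool, is a plain toggle.
    if action == "enable" or tool not in required_tools:
        return "apply"
    # Disabling a required tool needs the contract's permission...
    if not allow_disable_required:
        return "refuse-exit-2"
    # ...and an explicit --unsafe-override from the caller.
    if not unsafe_override:
        return "refuse"
    # The opt-out is applied and recorded in the manifest as an override.
    return "apply-with-override"
```

For example, decide_toggle("regex", "disable", {"regex"}) returns "refuse-exit-2", while the same call with allow_disable_required=True and unsafe_override=True returns "apply-with-override".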

This is distinct from opentraces bucket security, which governs the machine-wide bucket egress policy over global tool flags. Dataset security governs what a dataset's rows carry before dataset publication.

Review States

State	Meaning
`inbox`	Row needs review
`approved`	Row is publishable
`published`	Row was uploaded upstream
`rejected`	Row is kept local only
`blocked`	Row needs action before approval

opentraces dataset status my-dataset --json
opentraces dataset review my-dataset --json
opentraces dataset review approve my-dataset <row-id>
opentraces dataset review reject my-dataset <row-id>
opentraces dataset review reset my-dataset <row-id>
opentraces dataset review approve my-dataset --all

Remotes

opentraces dataset remote create my-dataset owner/team-traces --private
opentraces dataset remote create my-dataset owner/existing-traces  # idempotent: binds an existing HF dataset too
opentraces dataset remote list my-dataset --verbose
opentraces dataset remote visibility my-dataset owner/team-traces --public
opentraces dataset remote remove my-dataset owner/team-traces

Dataset remotes are independent of bucket remotes. A private bucket remote can hold raw evidence while a dataset remote holds only approved projected rows.

Schedules

opentraces dataset schedule add my-dataset --every 1h --approve-new --publish-check-only
opentraces dataset schedule list
opentraces dataset schedule pause my-dataset
opentraces dataset schedule resume my-dataset
opentraces dataset schedule remove my-dataset

Schedules rerun workflows over retained evidence. They do not bypass review or publication gates unless you explicitly pass approval/publish flags.
