Scanning & Redaction

Scanning happens only when a caller opts into tools. The public entrypoints are:

  • Python: opentraces.security.sanitize_record(record, tools=[...])
  • Python config mode: sanitize_record(record, cfg=cfg)
  • CLI: opentraces security sanitize --tools ...
  • CLI config mode: opentraces security sanitize --use-config

Callers must pass either an explicit tool list or a config. Config mode runs only tools whose cfg.security.<tool>.enabled flag is true.
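
The selection rule above can be sketched in a few lines. This is a minimal illustration, not the opentraces implementation: `ToolConfig`, `resolve_tools`, and the dict-shaped config are hypothetical names, and letting an explicit tool list take precedence over a config is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToolConfig:
    """Hypothetical stand-in for one cfg.security.<tool> entry."""
    enabled: bool = False


def resolve_tools(tools=None, security_cfg=None):
    """Return the tool names that would run for one sanitize call."""
    if tools is not None:
        # Explicit mode: run exactly what the caller listed (assumed precedence).
        return list(tools)
    if security_cfg is not None:
        # Config mode: keep only tools whose enabled flag is true.
        return [name for name, tool in security_cfg.items() if tool.enabled]
    # Neither given: the caller has not opted into any scanning.
    raise ValueError("pass either an explicit tool list or a config")
```

For example, `resolve_tools(security_cfg={"regex": ToolConfig(True), "entropy": ToolConfig(False)})` selects only `regex`.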

Tool Kinds

Kind         Examples                                                              Behavior
Detector     regex, entropy, trufflehog, privacy_filter, llm_pii, business_logic   Emits redactable spans
Transformer  path_anonymizer, capsule_scope                                        Rewrites the record without span findings
Judge        classifier                                                            Emits a verdict without mutating content
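
To make the three behaviors concrete, here is a toy sketch of what each kind returns. None of these functions exist in opentraces; `Span`, `run_detector`, `run_transformer`, and `run_judge` are hypothetical, and the matching rules are deliberately trivial.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """A redactable region of text, reported by a detector."""
    start: int
    end: int
    tool: str


def run_detector(text):
    """Detector: report spans and leave the text untouched."""
    i = text.find("sk-")
    return [Span(i, len(text), "toy_detector")] if i >= 0 else []


def run_transformer(text):
    """Transformer: return rewritten content, with no span findings."""
    return text.replace("/Users/alice", "/Users/[REDACTED]")


def run_judge(text):
    """Judge: return a verdict string and never mutate content."""
    return "flag" if "password" in text.lower() else "pass"
```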

Field Context

Detector tools receive a field-type hint so they can be stricter on inputs and less noisy on tool output:

Field type   Typical use
tool_input   shell commands, file writes, API payloads
tool_result  command output and observations
reasoning    agent reasoning text
general      prompts, summaries, snippets, row text

CLI example:

printf '%s\n' '{"text":"curl -H Authorization: Bearer sk-demo"}' \
  | opentraces security sanitize --tools regex --field-type tool_input
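
One way a detector might use the field-type hint is sketched below. This is an illustration under assumptions, not how any opentraces detector works: the patterns and the `detect` function are hypothetical. The idea is that a broad "long token" pattern is worth running on inputs but would mostly flag hashes and IDs in command output.

```python
import re

# High-confidence pattern: applied to every field type.
BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")
# Broad pattern: useful on inputs, noisy on command output (commit hashes, IDs).
LONG_TOKEN = re.compile(r"\b[A-Za-z0-9]{32,}\b")


def detect(text, field_type="general"):
    """Return sorted (start, end) spans; the broad pattern runs only for tool_input."""
    patterns = [BEARER]
    if field_type == "tool_input":
        patterns.append(LONG_TOKEN)
    spans = []
    for pattern in patterns:
        spans.extend(match.span() for match in pattern.finditer(text))
    return sorted(spans)
```

With this sketch, a 40-character commit hash in a `tool_result` field produces no finding, while the same string in a `tool_input` field is reported.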

Patch And Bucket Evidence

Schema 0.6.0 removed Outcome.patch. A workflow that needs to sanitize patch content should read from TraceRecord.patches[] and the bucket Trail companion (trail.jsonl.gz) instead of expecting a single unified diff field.

Raw bucket evidence is retained by default. Sanitized dataset rows are a workflow projection over that evidence; they do not rewrite the original agent transcript or the raw capture bucket unless the workflow explicitly writes a new sanitized artifact.
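
A workflow that reads patch evidence from these two places might start from something like the following. It is a sketch under assumptions: the record is treated as a plain dict with a `patches` list, each line of `trail.jsonl.gz` is assumed to be one JSON object, and both helper names are hypothetical. The shape of individual patch entries and Trail rows is not specified here, so the helpers only iterate them.

```python
import gzip
import json


def iter_trail_rows(path):
    """Yield one decoded JSON object per non-empty line of a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def iter_patches(record):
    """Yield entries from record["patches"], tolerating records without the key."""
    yield from record.get("patches") or []
```

Both helpers only read; writing a sanitized artifact stays a separate, explicit step in the workflow.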

Dataset Required Tools And Provenance

A dataset carries a resolved security policy in its manifest, seeded from its workflow's security: contract. The policy's required tools must run for a row to be publishable: opentraces dataset publish --check-only blocks any row that does not satisfy them (block reason required_security_tools_missing).

Row provenance records the policy tools applied on each append, as a security_policy block on the row. opentraces dataset run exposes the resolved policy in the run packet (the security block of run_packet.json), so the executor knows which tools are required and enabled. To inspect or adjust a dataset's policy, use opentraces dataset security <name>.
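
The publish gate amounts to a set comparison between the required tools and the tools recorded on the row. The sketch below illustrates that logic only. The function name and its arguments are hypothetical, and it assumes the row's security_policy block can be reduced to a list of applied tool names.

```python
def publish_block_reason(required_tools, applied_tools):
    """Return (reason, missing): reason is None when every required tool was applied."""
    missing = sorted(set(required_tools) - set(applied_tools))
    if missing:
        return "required_security_tools_missing", missing
    return None, []
```

A row that ran only `regex` against a policy requiring `regex` and `trufflehog` would be blocked, with `trufflehog` reported as missing.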

Redaction Shape

Detectors replace matched spans with redaction markers. The exact marker can vary by tool and field:

Before: export OPENAI_API_KEY=sk-abc123...
After:  export OPENAI_API_KEY=[REDACTED]

Path anonymization is a transformer, so it rewrites the path in place instead of emitting a span finding:

Before: /Users/alice/src/client-project/
After:  /Users/[REDACTED]/src/client-project/
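
Both shapes can be reproduced with a few lines of Python. This is a minimal sketch, not the opentraces implementation: `redact_spans` and `anonymize_path` are hypothetical, the `[REDACTED]` marker is taken from the examples above, spans are assumed not to overlap, and the home-directory pattern covers only `/Users/<name>` and `/home/<name>`.

```python
import re


def redact_spans(text, spans, marker="[REDACTED]"):
    """Replace each (start, end) span with marker, right to left so offsets stay valid."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + marker + text[end:]
    return text


# Matches the user segment of a macOS or Linux home directory path.
USER_HOME = re.compile(r"(/(?:Users|home)/)[^/]+")


def anonymize_path(path, marker="[REDACTED]"):
    """Replace only the user name segment, keeping the rest of the path."""
    return USER_HOME.sub(lambda m: m.group(1) + marker, path)
```

Applying spans from the end of the string backwards matters because each replacement changes the length of the text, which would shift the offsets of any spans to its right.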

Custom Strings

Custom redaction strings can be configured for workflows that use config mode:

opentraces config set custom_redact_strings INTERNAL_API_KEY --append
opentraces config set custom_redact_strings corp-secret-prefix- --append

Custom strings are literal matches wherever they appear in scanned content.
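
Literal matching means a custom string is replaced wherever it occurs, including inside a longer token, with no pattern syntax involved. The sketch below shows that behavior. The function name and the `[REDACTED]` marker are assumptions, and matching is assumed to be case-sensitive.

```python
def redact_custom_strings(text, custom_strings, marker="[REDACTED]"):
    """Replace every literal occurrence of each custom string with marker."""
    # Longest first, so a string that contains a shorter one is replaced whole.
    for needle in sorted(custom_strings, key=len, reverse=True):
        if needle:  # skip empty strings, which would match everywhere
            text = text.replace(needle, marker)
    return text
```

With the two strings configured above, `token=corp-secret-prefix-42` becomes `token=[REDACTED]42`: only the configured prefix is replaced, and the characters after it are left as they were.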