Autonomous-agent comparison package

Agents of Chaos is a strong external stress map for DBaD.

The comparison is not a claim that DBaD solves every agentic safety problem. It is a public mapping from observed autonomous-agent failure families to the controls DBaD currently makes inspectable.

Primary references: arXiv:2602.20021 and the Agents of Chaos project report.

DBaD stance

Protocol evidence, not agent safety certification: DBaD records, validates, and blocks trust propagation under declared rules. It does not prove model truth, intent, or moral correctness.
Authority must be explicit: Owner identity, verifier independence, lineage, reliance, reset boundaries, and token authority must survive as machine-readable trace state.
Reports must be falsifiable: The public package should include live traces, copied safe citations, verifier requests, screenshots, and explicit negative tests.
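
As an illustration of what "authority as machine-readable trace state" could look like, here is a minimal Python sketch. It is not DBaD's schema: only operator_env_id, operator_env_scope, and operator_env_authority_level appear in this document, and every other field name, the class name, and both helper functions are assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional


# Hypothetical trace record. Only the three operator_env_* names come from
# this document; the remaining fields are illustrative assumptions.
@dataclass(frozen=True)
class TraceAuthorityState:
    owner_id: Optional[str]
    verifier_id: Optional[str]
    operator_env_id: Optional[str]
    operator_env_scope: Optional[str]
    operator_env_authority_level: Optional[str]
    lineage_parent: Optional[str]  # None is acceptable only for a declared root
    is_root: bool = False


def missing_authority_fields(state: TraceAuthorityState) -> list[str]:
    """Return the names of authority fields that are absent or empty."""
    missing = []
    for f in fields(state):
        if f.name in ("is_root", "lineage_parent"):
            continue  # lineage is checked separately below
        if not getattr(state, f.name):
            missing.append(f.name)
    if state.lineage_parent is None and not state.is_root:
        missing.append("lineage_parent")
    return missing


def verifier_is_independent(state: TraceAuthorityState) -> bool:
    """Treat a verifier as independent only if it is present and not the owner."""
    return bool(state.verifier_id) and state.verifier_id != state.owner_id
```

A consumer of such a record could refuse to propagate trust whenever missing_authority_fields returns a non-empty list, which keeps implicit authority from passing silently.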

Comparison matrix

Observed failure family to DBaD control surface

Scope boundary before reading the table: this matrix is not a proof that DBaD prevents autonomous-agent harm. DBaD is not a sandbox, not proof of truth, not real-world identity proofing, and not a substitute for runtime permission gates. The table shows what DBaD can make inspectable when an integration actually records authority, lineage, reliance, validation freshness, and non-authorization state.

| Agents of Chaos failure family | DBaD control or evidence surface | What to show publicly | Remaining gap |
|---|---|---|---|
| Unauthorized compliance with non-owners or impersonators | Verifier independence, actor continuity, operator environment ID/scope/authority, provenance class. | Trace API fields for operator_env_id, operator_env_scope, operator_env_authority_level, and reset-verifier registry fixtures. | DBaD does not authenticate real-world humans by itself; identity proof remains an integration responsibility. |
| Destructive tool execution or irreversible action | Trust-positive continuation requires a fresh trust-continuation check; copied JSON and safe citations are never authorization. | Negative verifier examples: copied trace JSON rejected as authorization, safe citation accepted_as_authorization=false, and token check failure states. | DBaD must be placed in front of the tool/action boundary; it cannot stop a client that never calls it. |
| False completion reports where system state contradicts the agent report | Expected outcome order, state-transition evidence presence, receipt freshness, validation epoch, and lineage snapshot hash. | A before/after trace pair where an old report is historical evidence but not current permission, plus stale receipt rejection. | Outcome truth still depends on evidence quality; DBaD exposes missing or stale evidence, not omniscience. |
| Cross-agent propagation of unsafe practice | Cross-trace lineage integrity, reliance refs, transitive reliance epoch checks, coverage exposure rules, and same-resource lineage checks. | Fixture suite: broken root, declared child, grandchild, same-resource orphan, rejected reliance on analysis, and transitive reliance rejection. | Undeclared off-protocol influence remains advisory unless it is represented as machine-readable lineage, reliance, or resource identity. |
| Information disclosure or unsafe citation of sensitive evidence | Safe citation, archival projection, non-authorization headers, display-safe fields, and machine-only raw/token field markers. | Safe citation and archival projection copy payloads, plus verifier responses for complete, partial, and context-mismatched artifacts. | Redaction policy for sensitive evidence refs needs a dedicated package before high-stakes public logs. |
| Resource exhaustion, denial of service, or runaway loops | Lineage depth limits, fail-closed depth exhaustion, validation TTLs, token expiration, and explicit deferred token-introspection trigger criteria. | Depth-limit fixture behavior, token valid-from/valid-until fields, and public deferred-item language. | Rate limiting and operational quotas are platform controls; DBaD records boundaries and failure states but is not a full runtime sandbox. |
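
Two of the controls in the matrix, fail-closed lineage depth limits and the rule that copied artifacts are never authorization, can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not DBaD's implementation: the function names, the "kind" field, the parent-map representation, and the depth limit of 8 are all hypothetical.

```python
from typing import Optional

MAX_LINEAGE_DEPTH = 8  # assumed value; this document does not state a limit


def lineage_reaches_root(
    trace_id: str,
    parents: dict[str, Optional[str]],
    max_depth: int = MAX_LINEAGE_DEPTH,
) -> bool:
    """Walk parent links toward a root, failing closed on any problem.

    Returns False for unknown IDs (orphans or broken links), for cycles,
    and when the depth budget is exhausted before a root is reached.
    """
    current = trace_id
    for _ in range(max_depth):
        if current not in parents:
            return False  # orphan or broken lineage
        parent = parents[current]
        if parent is None:
            return True  # reached a declared root
        current = parent
    return False  # depth exhausted (this also covers cycles)


def may_continue(artifact: dict, fresh_check_passed: bool) -> bool:
    """Copied citations and copied trace JSON never authorize on their own.

    The "kind" values are hypothetical labels chosen for this sketch.
    """
    if artifact.get("kind") in ("safe_citation", "copied_trace_json"):
        return False
    return fresh_check_passed
```

The design choice worth noting is that every uncertain case returns False. Such a gate only helps when it sits in front of the tool or action boundary, as the "Remaining gap" column states.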

Report package

What we should publish when presenting this comparison

1. Threat mapping

One table of paper failure families, DBaD control surfaces, acceptance tests, and current limitations. Do not claim DBaD prevents all agent harm.

2. Reproducible evidence bundle

Live trace URLs, API JSON, copied safe citations, archival projections, verifier POST examples, and stale-token / stale-receipt negative checks.

3. Human-readable screenshots

Desktop and mobile crops showing that the NOT AUTHORIZATION label, provenance, token requirement, and point-in-time wording remain visible on the first screen.

4. Open limitations

Separate DBaD controls from sandboxing, identity proofing, secrets redaction, rate limits, and real-world evidence quality.
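
The stale-token and stale-receipt negative checks listed in the evidence bundle could be expressed as small, falsifiable tests. The Python sketch below assumes a token carries a valid-from/valid-until window and a receipt carries a validation epoch, as the matrix suggests; the function names and the half-open window convention are assumptions, not DBaD behavior.

```python
from datetime import datetime, timedelta, timezone


def token_is_current(valid_from: datetime, valid_until: datetime, now: datetime) -> bool:
    """A token counts only inside its half-open validity window (assumed convention)."""
    return valid_from <= now < valid_until


def receipt_is_fresh(receipt_epoch: int, current_epoch: int) -> bool:
    """A receipt from an earlier validation epoch is history, not current permission."""
    return receipt_epoch == current_epoch


def run_negative_checks(now: datetime) -> dict[str, bool]:
    """Each value is True when the stale input was correctly rejected."""
    issued = now - timedelta(hours=2)
    expired = now - timedelta(hours=1)
    return {
        "stale_token_rejected": not token_is_current(issued, expired, now),
        "not_yet_valid_token_rejected": not token_is_current(
            now + timedelta(minutes=5), now + timedelta(hours=1), now
        ),
        "stale_receipt_rejected": not receipt_is_fresh(receipt_epoch=3, current_epoch=4),
    }


if __name__ == "__main__":
    for name, rejected in run_negative_checks(datetime.now(timezone.utc)).items():
        print(f"{name}: {'PASS' if rejected else 'FAIL'}")
```

Publishing the inputs alongside the pass/fail result lets an outside reader rerun each check rather than take the report on trust.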

Recommended public claim

Use this wording until the package has undergone external review.

DBaD is a trace-governance protocol designed to make authority, lineage, reliance, validation freshness, and non-authorization boundaries inspectable for autonomous-agent workflows. The Agents of Chaos study is a useful external stress map because it documents failures involving delegated authority, tool access, cross-agent propagation, and state/report mismatch. DBaD addresses the evidence and trust-propagation layer; it does not replace sandboxing, human identity proofing, evidence quality review, or runtime security controls.