Trust Discipline

"Verify me, not trust me." The Citation schema is falsifiability. Every claim can be traced. Every source type has different trust characteristics. What the agent refuses to do is its character.

Every Claim Traced to a Source

These claim-citation pairs come from real trace data. Click or hover to see the source. Different source types are flagged differently.

"Current agents are LLMs with basic function-calling, not truly autonomous systems."

source: article

Traceable to: IBM Think article — verifiable by reading the source. This is the strongest citation type because anyone can check it.

"76% of surveyed executives already view agentic AI as more like a coworker than a tool."

source: past_brief

Traceable to: Brief stored 2026-05-25T17:42:37. The original article can be retrieved from the brief's metadata. Cross-referenceable.

"The bottleneck in agentic AI workflows is the personal scaffolding layer — CLAUDE.md files and skill files encoding local knowledge."

source: past_brief

Traceable to: Brief stored 2026-05-23T09:53:37. This is a synthetic insight — the agent connected multiple sources to form this claim.

"99% of 1,000 enterprise developers surveyed are exploring or developing AI agents."

source: article

Traceable to: IBM/Morning Consult survey cited in the article. This is a specific statistic — the most falsifiable type of claim.

The Four Source Types

Not all sources are equal. The schema encodes this difference.

article

The claim comes from the input article. Anyone can open the article and verify.

Trust: Verifiable. Highest confidence.

past_brief

The claim comes from a previously stored brief. The original source is one hop away.

Trust: Cross-referenceable. Good confidence.

web_search

The claim was verified through an external search. Independent corroboration.

Trust: Corroborated. Variable confidence.

model_knowledge

The model "just knows" this. No external source. This is the honest admission of uncertainty.

Trust: Assertion only. Treat as claim, not fact.

model_knowledge is the honest source type. Many systems hide that they're pulling from training data. This agent admits it. When a claim is flagged model_knowledge, it's saying: "I believe this, but you should verify it." The flag is the honesty mechanism.

What the Agent Refuses to Do

Character is defined by refusals. These are the bright lines the agent won't cross.

Make a claim without a citation.Every assertion must trace to a source. No citation = no claim.

Fabricate a source.If the agent doesn't know where a claim came from, it says model_knowledge. It never invents an article, a study, or an expert.

Mislead with confidence.A skimmed article is flagged as skim. A novel topic is flagged as novel. The agent never pretends to know more than it does.

Search the web when the article is its own source.The agent chose stop_here for the IBM article because searching would surface the article itself. Circular verification is not verification.

When a Fake Claim Gets Caught

The post-processing filter catches claims that don't match any known source. This is what happens.

❌ Claim Rejected

"AI agents will replace 73% of knowledge workers by 2027 according to a McKinsey study."

POST-PROCESSING FILTER: No matching source found. The claim references a "McKinsey study" that does not appear in the input article, any past brief, or any web search result. Citation cannot be verified. Claim removed from output. The agent does not invent statistics.

Falsifiability is the design principle.

Every claim the agent makes can be proven wrong. That's the point. A system where claims can't be checked is a system that can't be trusted. The Citation schema makes the agent checkable. That is the trust architecture — not "trust the agent" but "verify the agent."