Why This Matters
The promise of AI cross-auditing is simple: have one AI check another's work. If GPT-4 hallucinates, let Claude catch it. If Claude has a blind spot, let Gemini find it. Adversarial oversight through redundancy.
But auditing requires independence. When the SEC audits a company, we don't let the company's subsidiary do the audit. When the FAA inspects a plane, we don't let the airline's sister company sign off. Independence is the structural foundation that makes oversight meaningful.
The problem with AI "independence"
Look at the family tree. The connections are pervasive:
- Shared personnel: The people who built GPT-3 literally founded Anthropic and built Claude. The people who built LLaMA at Meta left to found Mistral. The CEO of Mistral is a co-author on DeepMind's Chinchilla paper. A small community of ~200 researchers built most of what exists.
- Shared training data: Almost every frontier model is trained on Common Crawl, Wikipedia, books, and code scraped from the same internet. The exact overlap is unknown but estimated at >60% for major Western models.
- Shared architecture: Every major model descends from a single 2017 paper, "Attention Is All You Need," by 8 Google researchers. At least 6 of the 8 authors have since left to found or join AI startups. The Transformer is the common ancestor of everything.
- Direct distillation: Some models are literally trained on another model's outputs. Alpaca = Meta's weights + OpenAI's outputs. Vicuna = Meta's weights + ChatGPT conversations. These are genetic hybrids.
- Methodological inheritance: The RLHF technique developed at OpenAI for InstructGPT was carried directly to Anthropic by the researchers who created it, where it evolved into Constitutional AI. The training methodology DNA is shared even when the weights are not.
If two models share the same researchers, were trained on the same data, use the same architecture, and one was partially trained on the other's outputs, in what meaningful sense are they "independent" auditors of each other?
Correlated failures
Shared ancestry means shared blind spots. Models with overlapping training data will hallucinate about the same obscure topics. Models built by the same researchers will have the same implicit assumptions about what "safety" means. Models with the same architecture will fail on the same adversarial inputs.
An audit that misses the same things you miss isn't an audit. It's a mirror.
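The correlated-failures argument can be made quantitative. A minimal sketch, using hypothetical per-item benchmark results (the arrays below are illustration data, not real evaluations): if two models erred independently, the probability that both miss the same item would be the product of their error rates, so the excess of the observed joint-error rate over that baseline measures how correlated their blind spots are.

```python
# Sketch: measuring correlated failures between two models on a shared
# benchmark. The per-item results below are hypothetical illustration data.
a_wrong = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # 1 = model A answered item wrong
b_wrong = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]  # 1 = model B answered item wrong

n = len(a_wrong)
p_a = sum(a_wrong) / n                            # A's error rate
p_b = sum(b_wrong) / n                            # B's error rate
p_both = sum(a & b for a, b in zip(a_wrong, b_wrong)) / n

# Under independence we'd expect a joint-error rate of p_a * p_b.
excess = p_both - p_a * p_b
print(f"joint error rate {p_both:.2f} vs {p_a * p_b:.2f} expected if independent")
print(f"excess correlation: {excess:+.2f}")
```

A positive excess means the two models miss the same items more often than chance would predict; an auditor with a large excess against the model it audits is closer to a mirror than an audit.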
The case for cross-model auditing (despite the incest)
I believe AI cross-auditing is still worth pursuing, but only if we're honest about its limits and design around them:
- Map independence before auditing. That's what this project is. Before trusting Model A to audit Model B, quantify their genealogical distance. The family tree is a prerequisite for meaningful oversight.
- Maximize diversity in audit panels. If you're going to use cross-model auditing, use models with the least shared ancestry. A Chinese model (DeepSeek, Qwen) auditing a Western model may catch things that another Western model wouldn't: different data, different researchers, different implicit assumptions about what matters.
- Ground audits in verifiable facts. Don't just ask "is this output good?" Ask "is this output true?" and check against external evidence. One approach I've been prototyping: cryptographic verification of factual claims against authoritative sources.
- Don't let cross-auditing substitute for human oversight. AI auditing AI is a complement to human-led external oversight, not a replacement. The family tree shows why: the ecosystem is too inbred for any model to be truly independent.
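The first two points above can be sketched concretely. Assuming we represent each model's ancestry as a set of features (key personnel, data sources, architecture family, distillation parents), Jaccard distance between feature sets gives a crude genealogical distance, and the preferred audit pairing is the one with the largest distance. The model names and feature sets below are hypothetical illustrations, not real lineage data.

```python
from itertools import combinations

# Sketch: genealogical distance as Jaccard distance over ancestry features.
# All feature sets here are hypothetical, not a real lineage dataset.
ancestry = {
    "model_a": {"transformer", "common_crawl", "rlhf", "lab_x_alumni"},
    "model_b": {"transformer", "common_crawl", "rlhf", "lab_x_alumni",
                "distilled_from_a"},
    "model_c": {"transformer", "chinese_web_corpus", "rlhf_variant"},
}

def genealogical_distance(a: set, b: set) -> float:
    """1 - Jaccard similarity: 0 = identical ancestry, 1 = fully disjoint."""
    return 1 - len(a & b) / len(a | b)

# Score every pairing and pick the most independent one for auditing.
pairs = {
    (m1, m2): genealogical_distance(ancestry[m1], ancestry[m2])
    for m1, m2 in combinations(ancestry, 2)
}
best = max(pairs, key=pairs.get)
for pair, d in sorted(pairs.items(), key=lambda kv: -kv[1]):
    print(pair, f"distance={d:.2f}")
```

By this crude measure, model_b is a poor auditor for model_a (it shares nearly all ancestry features and was distilled from it), while model_c, with different data and methods, is the better pairing. The point is not the specific metric but that independence can be estimated before it is assumed.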
Historical precedent
Every industry that touches public safety went through a version of this reckoning:
- Finance: Arthur Andersen audited Enron while consulting for Enron. The conflict of interest was invisible until $74 billion in shareholder value evaporated. Sarbanes-Oxley (2002) mandated auditor independence: you can't audit a company you also advise.
- Pharmaceuticals: Before the FDA, drug companies ran their own safety trials and published only favorable results. The 1962 Kefauver-Harris Amendment required independent evidence of efficacy, not just manufacturer assurances.
- Aviation: Boeing's 737 MAX crashes killed 346 people. A key factor: the FAA had delegated much of its safety certification to Boeing itself. The fox was guarding the henhouse.
AI is in the "pre-regulation" phase of this cycle. Every major lab grades its own homework. Model cards are self-reported. Safety benchmarks are often designed by the same organizations whose models they evaluate. The family tree shows that even "cross-model" evaluation doesn't solve this if the models share the same DNA.
For more on why external oversight matters, see Who Audits the Auditor.