Unsourced
Original Research

AI cites a lot of sources you can't actually check.

We put the 100 most common everyday questions to the 7 leading AI assistants and inspected every source they cited. A snapshot of how trustworthy AI's citations really are.

622 answers · 7 assistants · 100 prompts · 2,248 source URLs inspected · 2026-06-27 · every figure measured, none simulated

18.0%
of cited sources are dead links — gone, 404/DNS-fail (405 of 2,248)
83.9%
cross-assistant divergence: ask all 7 the same question and the sources they cite overlap only ~16% of the time
27.0%
of cited sources are live but block independent verification, so an automated checker can't confirm them

What happens when you try to open what AI cites

Reachable 54.2% · Blocked but live 27.0% · Dead 18.0% · Timeout 0.8%

Only 54.2% of the source URLs AI handed us opened cleanly to a live page. We verified the dead links twice: 18.0% were still dead on an independent re-check.

What this means — in plain English

When you ask ChatGPT, Gemini or Perplexity a question, it shows you a tidy list of “sources” so the answer looks trustworthy. We actually checked those sources. Here's the honest picture:

The takeaway: don't take an AI's sources at face value — click through and check. And if you run a business, simply being mentioned by AI isn't enough; what counts is whether your presence holds up when someone actually looks. That gap — proving it, not scoring it — is exactly what Unsourced exists to close.

The web AI leans on

Most-cited domains across all 3,758 citations. The top of the list is user-generated content and aggregators — not primary sources.

1. youtube.com: 243
2. reddit.com: 178
3. google.com: 161
4. mayoclinic.org: 78
5. healthline.com: 67
6. forbes.com: 64
7. health.harvard.edu: 64
8. pcmag.com: 52
9. cnet.com: 46
10. nytimes.com: 46
How we measured this. Each of 100 high-signal consumer queries was sent to all 7 assistants (OpenAI, ChatGPT Search, Claude, Gemini+Search, Grok, Llama, Perplexity) with a neutral “answer and list your sources” instruction. Every source URL was fetched once (GET, redirects followed) and placed in one of four buckets: reachable; blocked but live (403/429: the page exists but refuses automated checks, so it is not counted as dead); dead (404, gone, DNS failure); or timeout. Google's grounding redirect tokens (1,255) are an API artifact, not user-facing links, so they were excluded from the dead rate but resolved to their real domain for the other measures. Divergence is 1 minus the mean pairwise Jaccard similarity of the cited-domain sets across assistants, computed per query. ChatGPT Search ran on only 40 of the 100 queries because OpenAI's search API enforces a strict rate limit; every other assistant ran all 100. Snapshot, 2026-06-27.
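The bucketing rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the checker used for this report: the function names (`bucket`, `fetch_outcome`, `tally`) are hypothetical, reading “gone” as HTTP 410 is our interpretation, and the `unclassified` fallback covers statuses the methodology does not mention (5xx, for example).

```python
import socket
import urllib.error
import urllib.request
from collections import Counter

BLOCKED_STATUSES = {403, 429}  # named in the methodology: live, refuses automated checks
DEAD_STATUSES = {404, 410}     # 404 is named; 410 is an assumed reading of "gone"


def bucket(status=None, error=None):
    """Map one fetch outcome to a bucket name.

    status: final HTTP status after redirects, or None if no response arrived.
    error:  "dns" or "timeout" when the request failed before any response.
    """
    if error == "timeout":
        return "timeout"
    if error == "dns" or status in DEAD_STATUSES:
        return "dead"
    if status in BLOCKED_STATUSES:
        return "blocked_but_live"
    if status is not None and 200 <= status < 300:
        return "reachable"
    # Assumption: anything else is set aside rather than guessed at.
    return "unclassified"


def fetch_outcome(url, timeout=10.0):
    """GET a URL once (urllib follows redirects) and return (status, error)."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as exc:  # 4xx/5xx responses land here
        return exc.code, None
    except urllib.error.URLError as exc:   # no HTTP response at all
        if isinstance(exc.reason, socket.gaierror):
            return None, "dns"
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return None, "timeout"
        return None, "other"
    except (socket.timeout, TimeoutError):
        return None, "timeout"


def tally(outcomes):
    """Count buckets over an iterable of (status, error) pairs."""
    return Counter(bucket(status, error) for status, error in outcomes)
```

Keeping `bucket` separate from the network call means the rule can be checked offline, for example `tally([(200, None), (403, None), (None, "dns")])`. The headline dead rate is then the dead count over all inspected URLs: 405 / 2,248 ≈ 18.0%.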
This is consistent with a growing body of work. Columbia's Tow Center found AI search engines cited broken or fabricated URLs in over half of news tests; an independent academic study measured a 17% citation “phantom rate.” Our 18.0% dead-link rate, measured fresh on everyday consumer queries, lands in the same range, and this study adds the cross-assistant divergence finding.
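For readers who want to reproduce the divergence measure, here is a minimal sketch. It reads divergence as 1 minus the mean pairwise Jaccard similarity of each assistant's cited-domain set, which is what the 83.9% divergence and ~16% overlap figures imply. The helper names, the `www.` stripping, and the treatment of two empty sets as zero similarity are assumptions for illustration, not the report's published code.

```python
from itertools import combinations
from urllib.parse import urlsplit


def domain(url):
    """Reduce a URL to its host, dropping a leading 'www.' (assumed normalisation)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    # Assumption: two assistants that both cite nothing count as zero overlap.
    return len(a & b) / len(union) if union else 0.0


def query_divergence(domains_by_assistant):
    """1 minus the mean pairwise Jaccard similarity for a single query."""
    pairs = list(combinations(domains_by_assistant.values(), 2))
    if not pairs:
        raise ValueError("need at least two assistants to compare")
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


def overall_divergence(queries):
    """Average the per-query divergence over a list of queries."""
    return sum(query_divergence(q) for q in queries) / len(queries)
```

As a worked case with made-up data: if assistant A cites {mayoclinic.org, healthline.com}, B cites {mayoclinic.org, reddit.com} and C cites {youtube.com}, the pairwise similarities are 1/3, 0 and 0, the mean is 1/9, and the divergence for that query is 8/9 ≈ 0.889.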

© Unsourced — the evidence layer for AI search. Methodology and raw data available on request.