Zyppy and Google Are Both Right About llms.txt
In early May 2026, one of the most careful research shops in SEO scored llms.txt a 2 out of 10. Days later, Google shipped an audit that treats it as part of agent-readiness. They aren't contradicting each other. They're answering two different questions — and the gap between those questions is the whole story.
The findings reconciled here come from Zyppy (the citation question) and Google's Chrome team (the agent question); the "three patterns" framing of how AI reads a site is Iurii Rogulia's. The reading that reconciles them is ours. Full sources are listed at the end.
The same week, two opposite verdicts
On 7 May 2026, Zyppy published a ranking of 23 factors that predict whether AI systems cite a page, drawn from an analysis of 54 studies and patents. It is one of the more rigorous surveys of actual AI-citation behaviour available. The file called llms.txt, the structured summary you place at your domain root to tell AI systems what your site is, scored a 2.0. The note attached to it was blunt: no credible evidence it affects citations.
Within days, Google shipped Lighthouse 13.3 — its open-source, automated web-auditing tool, built into Chrome — with a new, experimental audit category called Agentic Browsing. One of the things it checks for is an llms.txt file at your domain root, alongside registered agent tools, accessibility labelling, and layout stability. Google did not score it 2.0. Google built it into an audit.
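To make "an llms.txt file at your domain root" concrete, here is a minimal sketch of a presence check. It is not how Lighthouse implements its audit, which we have not inspected; it only shows what "at the root" means for any page URL. The `fetch_status` callable is a hypothetical helper you would supply (any HTTP client that returns a status code), and treating HTTP 200 as "present" is our assumption.

```python
from typing import Callable
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the domain-root llms.txt URL for any page on a site."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the page's path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def has_llms_txt(page_url: str, fetch_status: Callable[[str], int]) -> bool:
    """True when the root llms.txt answers with HTTP 200 (assumed criterion).

    fetch_status is injected so the check works with a real HTTP client
    or with a stub; it is a hypothetical helper, not a library function.
    """
    return fetch_status(llms_txt_url(page_url)) == 200
```

Note what this does and does not tell you: it confirms the file exists, nothing more. Whether anything reads it is a separate question.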
Google is not even consistent with itself here. Google Search's own guidance says you don't need llms.txt — or any other "special" markup — to appear in AI Overviews or AI Mode. Google's Chrome team audits it anyway. Same company, opposite advice, because the two teams are looking at two different things.
So a careful research shop says the file does nothing, and the company that runs the dominant search index both dismisses it and audits it, all inside a couple of weeks. The instinct is to decide who is wrong. That instinct is the mistake.
Two questions, asked by two different things
Before the file, sort out who is even asking. "AI systems" are doing two jobs here, and they're not the same.
Underneath both is the same kind of thing: a model — the trained system (GPT, Claude, Gemini) that predicts text. On its own, a model doesn't visit your site at all. What visits your site is a model that's been harnessed with extra tooling, packaged into a product, and branded — and two of those products matter here.
The first is the answer engine — ChatGPT, Claude, Gemini, Perplexity. This is what most people mean by "the AI": you ask a question, it returns an answer, and it may or may not quote your site as a source. The second is the agent — a model harnessed not to answer questions about your site but to act on it: open it, read its structure, click through it, complete a task. Same engine underneath, different chassis. One is trying to cite. The other is trying to operate.
Once you separate them, the two verdicts line up cleanly.
The citation question
Does having an llms.txt make an answer engine more likely to cite you — whether by being trained into the model or fetched live mid-answer? This is what Zyppy measured, and the evidence says no, or more precisely, no one has found any relationship yet. On this question, llms.txt earns its 2.0.
The agent question
Can an agent use llms.txt to understand what your site is and how to move through it? This is what Google's agentic audit checks, and here the file does real work. It is a map handed to something that is trying to operate your site, not just read it. On this question, llms.txt is worth auditing.
Different things, asking different questions, about different jobs. A file can do nothing for the first, at least on today's evidence, and still carry real weight for the second. llms.txt is exactly that file.
Why one file ends up with two jobs
It helps to remember that "AI reads your site" is three different events, not one. Iurii Rogulia's framing is the cleanest version of this. There is the training crawl that snapshots your site into a model. There is live inference, where a system retrieves a page mid-answer. And there are agents, autonomous systems that browse your site to get something done.
Citation lives in the first two. Whether an answer engine quotes you is decided by what was in the training data and what gets retrieved at query time, and the evidence says a summary file at your root doesn't move that. Operability lives in the third. A structured context file is precisely the thing an agent is built to consume: a machine-readable description of what the site is and where things are.
So of course llms.txt scores near zero on citation and shows up in an agent audit. Same file, two jobs, two audiences. The contradiction was never in the file. It was in treating "does llms.txt work?" as one question.
The verdicts reflect where the evidence stands now, not a permanent rule. If a major provider starts using llms.txt for retrieval or training, the citation verdict could change, which is exactly why we keep the two questions separate instead of collapsing them into one score.
The part that's easy to miss
Here is the load-bearing point, and it is bigger than a caveat. A citation study, however rigorous, tells you nothing about the agent question — because it never measured it. Zyppy answered one question carefully and didn't answer the other one at all. That isn't a flaw in the research. It's the nature of measurement: a number only reports on what it counted. Reach for that number to settle the agent question and you are reading a thermometer to find out the time.
There is a genuine, smaller caveat underneath it. Even on its own question, the citation finding is correlational. Zyppy measured what AI systems actually cite and found no relationship, which is strong evidence of absence, not proof that none could ever exist. Worth naming. But it's the footnote. The headline is that no citation study, perfect or not, speaks to whether an agent can operate your site.
This is why "+3 points, you have an llms.txt" isn't really analysis at all. It merges two questions into one number. It is a presence check wearing the costume of analysis, and it can't tell you the file does nothing for citations and something for agents, because it never asked which question it was answering.
You can dig the well; you can't make them drink
There's a second nuance that the "it works / it doesn't" fight skips entirely. An llms.txt makes your site legible to agents. It does not make any agent read it. You can provide AI systems with your llms.txt; you cannot make them utilize it. Writing the file is the part you control. Whether anything uses it is the part you don't.
And "agents" is not one crowd. There are the agents that ship from the big model providers (the browsing modes inside ChatGPT, Claude, Gemini), which are most of today's agent traffic and are trained and steered by their makers. Whether they lean on llms.txt is the provider's call, and right now the evidence says they don't: in a 30-day analysis of roughly a thousand domains, Iurii Rogulia logged zero requests for the file. On that evidence, no major provider reads it yet. Then there are user-controlled agents: the ones a company or developer builds and points at the web themselves. Those can be told to read your llms.txt first, in full, today.
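You can ask the same question of your own server. The sketch below counts requests for /llms.txt in an access log, grouped by user agent. It assumes your logs use the Combined Log Format common to Apache and nginx; if yours differ, the pattern will need adjusting. It is not Rogulia's method, which we have not seen in code form, only a simple way to look at the same signal.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the request and user-agent fields of a Combined Log Format line
# (assumed format: ... "METHOD /path HTTP/x" status bytes "referer" "agent").
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_requests(lines: Iterable[str]) -> Counter:
    """Count requests for /llms.txt, grouped by user-agent string."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        # Ignore any query string when comparing the path.
        path = match.group("path").split("?", 1)[0]
        if path == "/llms.txt":
            hits[match.group("agent")] += 1
    return hits
```

Pass it an open log file, for example `llms_txt_requests(open("access.log"))`, where the filename is whatever your server writes. An empty result over a month is the same finding at a smaller scale: the well is dug, and nothing has come to drink yet.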
So the payoff is real but uneven, and it moves. A custom agent can drink from your well right now. A big-provider agent might not until its makers decide it should. You are digging the well ahead of the demand, which is exactly when it's cheap to dig.
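For the user-controlled case, "read llms.txt first" can be as simple as parsing the file into a reading list before the agent touches anything else. The sketch below assumes the file follows the public llms.txt proposal (H2 headings followed by markdown link lists); the function names are ours, and a real agent would add fetching, error handling, and its own policy for which sections to open.

```python
import re
from typing import Dict, List, Tuple

# "- [Title](https://example.com/page): optional note", per the public proposal.
LINK = re.compile(r"^\s*-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")


def parse_llms_txt(text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group the (title, url) links in an llms.txt body by their H2 section."""
    sections: Dict[str, List[Tuple[str, str]]] = {}
    current = ""  # links that appear before any "## " heading land here
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
            continue
        match = LINK.match(line)
        if match:
            sections.setdefault(current, []).append(
                (match.group("title"), match.group("url"))
            )
    return sections


def reading_order(text: str) -> List[str]:
    """URLs in section order: pages an agent could open before exploring."""
    return [url for links in parse_llms_txt(text).values() for _, url in links]
```

The design choice is deliberate: the agent gets a map before it starts clicking, which is the job the file is suited to.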
What to actually do with your llms.txt
The practical answer is unglamorous: write one. It costs about fifteen minutes. The downside is zero — it doesn't touch your crawl budget, your Core Web Vitals, or any existing SEO signal. It sits at the bottom of the priority list, below clean structured data and semantic HTML, but it belongs on the list. Optimisation, not necessity.
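If fifteen minutes still sounds like a lot, the file is small enough to generate. This is a minimal sketch that renders an llms.txt body in the shape the public proposal describes: an H1 with the site name, a blockquote summary, then H2 sections of markdown links. The site name, summary, and URLs are hypothetical placeholders, and the function is ours, not part of any standard tool.

```python
from typing import Dict, List, Tuple


def render_llms_txt(
    name: str,
    summary: str,
    sections: Dict[str, List[Tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body: H1 name, blockquote summary, H2 link lists."""
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            # The note after the colon is optional; omit it when empty.
            lines.append(f"{entry}: {note}" if note else entry)
    return "\n".join(lines) + "\n"


# Hypothetical site details, for illustration only.
print(render_llms_txt(
    "Example Co",
    "Example Co sells widgets and publishes guides on choosing them.",
    {"Guides": [("Choosing a widget", "https://example.com/guides/choosing.md", "start here")]},
))
```

Save the output as llms.txt at your domain root. The hard part is not the format; it is writing a summary that accurately says what your site is.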
Just add it for the right reason. Not because it will get you cited: the evidence says it probably won't, at least not yet. If you add it expecting citations and don't see them, you'll conclude the whole discipline is a fraud, when all you did was watch the wrong number. Add it because you're digging the well for the agents that are already thirsty. The rest are coming.
The mistake was never adding llms.txt. The mistake is adding it, watching the citation number, and drawing a conclusion that number was never qualified to support.
Why we won't hand you one llms.txt score
This is the discipline our diagnostic is built around. We don't reduce llms.txt to "present: yes, +3." We read it as two separate questions — citation posture and agent-operability — and we keep them apart, because the honest answer to "is my llms.txt doing anything?" depends entirely on which one you're asking. Yes on one, not really on the other, and a single number can't say that without lying.
Concretely: in our report you'll see llms.txt twice — once under our AEO scoring as a citation-posture signal, and once under our Agentic scoring as an agent-operability signal. Same file, two scores, two readings.
Keeping the questions apart isn't a scoring detail. It's the job. The whole reason forensic infrastructure analysis exists is to stop you optimising one outcome while a different mechanism is quietly the one that decides the result.
Sources
- Zyppy — 23 AI citation ranking factors — the citation question; llms.txt scored 2.0.
- Google Search Central — Optimizing for generative AI features on Google Search — Google's own guidance that you don't need llms.txt for AI Overviews or AI Mode.
- Google Chrome for Developers — Lighthouse Agentic Browsing scoring — the agent question; llms.txt as an audited signal.
- Iurii Rogulia — llms.txt and AI discoverability — the three-patterns framing and the 30-day, ~1,000-domain zero-requests finding.
- Iurii Rogulia — the original LinkedIn post.