
Empty Titles

An entire industry selling rank it hasn't earned, optimization that has failed every time it's been tried, and complexity on top of foundations it never bothered to get right.

Listen:


Right-click your site. View Page Source. If your business name isn’t in the raw HTML — not the rendered page, not the inspector, the actual source code your server sends — you are invisible to the majority of AI crawlers.
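The same test can be run without a browser. A minimal sketch, assuming a placeholder URL and business name: fetch the server's response with no JavaScript execution, then check whether the name appears in the raw markup. This is the view a non-rendering crawler gets.

```python
import urllib.request

def name_in_html(raw_html: str, business_name: str) -> bool:
    """True if the name appears in the raw markup itself,
    before any JavaScript runs."""
    return business_name.lower() in raw_html.lower()

def fetch_raw_html(url: str) -> str:
    """Fetch the server response without executing anything --
    the same view a non-rendering AI crawler receives."""
    req = urllib.request.Request(url, headers={"User-Agent": "raw-html-check"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

# A client-side-rendered shell fails even though the rendered page
# would show the name; a server-rendered page passes.
csr_shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
ssr_page = '<html><body><main><h1>Acme Plumbing</h1></main></body></html>'
```

Run `name_in_html(fetch_raw_html("https://example.com"), "Acme Plumbing")` against your own domain and name; a `False` on the raw HTML is the empty shell the next paragraph describes.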

A searchVIU analysis of 32 major AI crawlers found 69% cannot execute JavaScript at all. Writesonic independently tested six AI systems and confirmed it: ChatGPT, Claude, and Gemini fetch raw HTML and convert it to Markdown. No JavaScript execution. No rendering. If your content lives inside a <div id="root"></div> waiting for a JavaScript bundle to populate it, three of the largest AI platforms see an empty shell.

The fix is server-side rendering. It has been the correct answer for every crawler since Googlebot. Vercel’s network data confirms GPTBot and ClaudeBot download JavaScript files in 11-24% of requests but execute none. An analysis of over 500 million GPTBot fetches found zero evidence of JavaScript execution.

We should have shipped this in 2005. That it wasn’t shipped — that the industry spent twenty years selling complexity instead of demanding foundations — is not an accident. It is the business model.


The con named

GEO. AEO. LLMO. LLM SEO. Four acronyms for the same dependency trap repackaged with new terminology.

The projected market for Generative Engine Optimization is $33.7 billion by 2034, according to Dimension Market Research — a third-party projection carrying the inherent uncertainty of any forecast for a market that barely exists. The industry selling that future is built on measurement sand. AirOps — a vendor operating in the AI visibility space, methodology undisclosed — found that only 30% of brands stay visible from one AI answer to the next for the same query. Across five consecutive runs, only 20%. AI Overview content changes roughly 70% of the time.

That is not a measurement problem. That is the measurement telling you there is nothing stable to optimize for.

The citation volatility means no technical change can be causally linked to AI citation rates. The surface is too unstable for attribution. A Search Engine Land analysis by Aimee Jurenka summarized the state of evidence in March 2026: “To date, there are no peer-reviewed studies on schema’s impact on AI search visibility, or controlled experiments on LLM citation behavior and schema markup.” None. The entire optimization industry is selling outcomes it cannot measure against standards that do not exist.

Superlines makes the “empty titles” concrete. They analyzed 34,234 AI responses across 10 platforms over 30 days in early 2026 — first-party data about their own brand, with disclosed methodology. Gemini cited their site 182 times in that period. Mentioned the brand name zero times. Seventy-three percent of their AI presence consisted of citations without brand mentions. You are being cited. You are not being seen. That is what the titles being sold are worth.


The llms.txt debacle

If you want to understand what the optimization industry sells, look at llms.txt.

The proposal — a Markdown file at /llms.txt designed to help language models navigate website content — was published in September 2024 by Jeremy Howard of Answer.AI. By late 2025, BuiltWith tracked over 844,000 websites with an llms.txt file in place. The industry promoted it. Agencies recommended it. Audit tools scored sites on whether they had one.

844,000 sites adopted it. Zero AI crawlers read it. Not one.

Google’s John Mueller compared it to the keywords meta tag — a standard so abused that Google deprecated it in 2009: “AFAIK none of the AI services have said they’re using LLMs.TXT (and you can tell when you look at your server logs that they don’t even check for it).” A 30-day audit of 1,000 Adobe Experience Manager domains found GPTBot, ClaudeBot, and PerplexityBot entirely absent from llms.txt requests. A 48-day server log analysis across 12,099 AI bot requests: zero requests for /llms.txt. SE Ranking, analyzing roughly 300,000 domains, found no statistical relationship between having an llms.txt file and being cited by LLMs. Their machine learning model performed better as a predictor when the llms.txt variable was removed entirely.
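Mueller's point is directly checkable against your own server logs. A minimal sketch, assuming common-log-format access lines: scan for any request to /llms.txt. On most sites the result is empty however far back the log goes.

```python
import re

# Matches a GET for /llms.txt in a common-log-format request line,
# with or without a query string.
LLMS_TXT = re.compile(r'"GET /llms\.txt[ ?]')

def llms_txt_hits(log_lines):
    """Return the access-log lines that requested /llms.txt."""
    return [line for line in log_lines if LLMS_TXT.search(line)]
```

Pipe your real log through it (`llms_txt_hits(open("access.log"))`); the audits cited above found exactly zero matching lines across tens of thousands of AI bot requests.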

The industry sold compliance with a standard that doesn’t function as a standard. The people who bought it cannot tell the difference because nobody is measuring anything that matters.


Why the fundamentals are the fundamentals

There is a category of work that has survived every intermediary transition in the record. It is not new. It is not exciting. It is not what anyone is selling.

Semantic HTML

A <nav> element tells every machine consumer — search engine, screen reader, AI agent — that it contains navigation. A <div class="nav"> tells them nothing. The W3C’s ARIA in HTML specification is explicit: semantic elements carry implicit ARIA roles that browsers automatically expose in the accessibility tree. Those roles are what assistive technologies and AI agents consume.

This matters now because AI agents have converged on the accessibility tree as their primary interface for understanding web pages. OpenAI’s Atlas browser uses ARIA tags — the same labels and roles that support screen readers — to interpret page structure. Playwright MCP, which powers a growing number of AI agent frameworks, operates in snapshot mode by default, reading the browser’s accessibility tree rather than raw screenshots.

The A11y-CUA study — peer-reviewed, presented at CHI 2026, covering 60 tasks across 40.4 hours of interaction data — found AI agents completed tasks at a 78.3% success rate under standard conditions. Under accessibility-constrained conditions mimicking poor semantic structure, that dropped to 41.7%. The agents depend on the same structural cues that screen readers use.

Semantic HTML served screen readers in 2005. Search engines in 2015. AI agents in 2025. The consumers change. The discipline does not.

Server-side rendering

The rendering strategy determines whether AI crawlers see content or an empty shell. SSG and SSR both produce complete HTML in the server response — every AI crawler, regardless of JavaScript capability, receives the full page. Client-side rendering produces a <div id="root"></div> and a script tag. The business name, services, hours, location, products — everything rendered by JavaScript — is invisible to two-thirds of AI crawlers.

Cloudflare’s Markdown for Agents feature, which converts HTML to Markdown on the fly for AI consumers, strips <div> wrappers, nav bars, and script tags as elements with “zero semantic value” while preserving heading structure and meaningful content. The conversion pipeline — Cloudflare, which also authored the Web Bot Auth IETF draft and operates the dominant CDN-level bot verification infrastructure — produces up to 80% token reduction. Content that maps cleanly to Markdown survives. Content that depends on JavaScript rendering may not.
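The conversion logic is easy to see in miniature. A toy sketch of that kind of pipeline, not Cloudflare's implementation: headings and paragraph text survive; <div> wrappers, <nav> bars, and <script> tags contribute nothing to the output.

```python
from html.parser import HTMLParser

class ToMarkdown(HTMLParser):
    """Toy HTML-to-Markdown converter: keeps heading structure and
    body text, drops nav/script/style content and bare wrappers."""
    SKIP = {"script", "style", "nav"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.out, self._skip, self._prefix = [], 0, ""

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip += 1
        elif tag in self.HEADINGS:
            self._prefix = self.HEADINGS[tag]

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip -= 1
        self._prefix = ""

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.out.append(self._prefix + text)
            self._prefix = ""

def to_markdown(html: str) -> str:
    parser = ToMarkdown()
    parser.feed(html)
    return "\n\n".join(parser.out)
```

Feed it a page wrapped in divs with a nav bar and a script tag, and only the headings and content come out the other side. Content that was rendered by the script tag never existed as far as the converter is concerned.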

A site can rank position one on Google — which renders JavaScript — while being completely blank to every other AI system. The View Page Source test is the only test that matters.

Schema.org — with honest handling

Schema.org requires a more careful argument than the GEO vendors offer.

The honest state of the evidence: LLMs do not semantically parse JSON-LD. A controlled experiment by Mark Williams-Cook demonstrated that ChatGPT and Perplexity extracted data from entirely fabricated JSON-LD markup — proving they tokenize it as plain text, not as structured data. searchVIU tested five AI systems on pricing data embedded exclusively in JSON-LD: Claude extracted 0% of the values. Gemini managed 50%. The Writesonic study scored JSON-LD at 0 out of 6 across all tested AI systems for direct detection.

So why does schema matter? Because its value is indirect and conditional.

First: Google and Microsoft have both confirmed they ingest JSON-LD during indexing and build entity representations in their knowledge graphs. Their AI features — AI Overviews, Copilot — draw on those indexed entities. This is the established pathway. It works through the knowledge graph, not through direct LLM parsing.

Second: the Growth Marshal study (n=730 AI citations across ChatGPT and Gemini, February 2026 — a vendor selling AI search services, but with disclosed methodology) found the critical variable is attribute richness, not mere presence. Attribute-rich Product and Review schema with populated pricing, ratings, and specifications achieved a 61.7% citation rate. Generic schema — Article, Organization, BreadcrumbList with sparse attributes — scored 41.6%. Pages with no schema at all: 59.8%.

Read that again. Generic schema performed worse than having no schema. Half-built markup degrades comprehension. The industry’s default recommendation — “add schema markup” — is actively harmful when the implementation is thin.
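The difference is concrete. An illustrative sketch with a hypothetical product page, all values invented: the generic variant is what "add schema markup" usually produces; the attribute-rich variant is the shape the Growth Marshal data associates with citations.

```python
# Generic schema: type present, attributes absent.
generic = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Widget Pro",
}

# Attribute-rich schema: populated pricing, availability, ratings.
attribute_rich = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Widget Pro",
    "sku": "WP-100",
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "212",
    },
}

def populated_attributes(node: dict) -> int:
    """Count populated non-@ fields, recursing into nested nodes --
    a crude proxy for the attribute richness the study measured."""
    total = 0
    for key, value in node.items():
        if key.startswith("@"):
            continue
        total += populated_attributes(value) if isinstance(value, dict) else 1
    return total
```

By this crude count the generic block carries one populated attribute, the rich block seven. The study's finding is that only the second shape earns its place in the markup.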

Third: an emerging pathway. Volpini et al. (March 2026 preprint — note the lead author is CEO of WordLift, a commercial Schema.org platform) found Schema.org combined with entity URIs improved RAG retrieval accuracy by 29.6% in agentic pipelines. This is early evidence, one research group, pending independent replication. But the direction suggests schema may transition from indirect utility to direct retrieval layer as agentic systems mature.

The conviction is not that schema works the way the GEO vendors claim. It is that attribute-rich structured data has survived every intermediary transition because the discipline it enforces — forcing content into complete, well-defined structures — makes content machine-legible regardless of the parsing mechanism. Generic schema is worse than nothing. Attribute-rich schema is the only implementation worth the effort.


The state of the web

The WebAIM Million 2026 report — an independent nonprofit analyzing the top one million home pages — documents the foundations the industry skipped:

46.1% of pages lack a <main> element. The single most important signal telling any machine consumer where the primary content begins and ends. Absent on nearly half the web.

41.8% skip heading levels. Jumping from <h2> to <h4> without an intervening <h3> breaks the hierarchical model that both screen readers and AI systems use to understand content relationships.
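Both findings are mechanically checkable. A sketch of the two checks on a raw HTML string, using only the standard library: is there a <main>, and does the heading sequence ever skip a level?

```python
from html.parser import HTMLParser
import re

class StructureCheck(HTMLParser):
    """Flags a missing <main> and any skipped heading level."""
    def __init__(self):
        super().__init__()
        self.has_main = False
        self.skips = []   # list of (previous_level, offending_level)
        self._last = 0

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.has_main = True
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            level = int(match.group(1))
            if self._last and level > self._last + 1:
                self.skips.append((self._last, level))
            self._last = level

def check(html: str):
    checker = StructureCheck()
    checker.feed(html)
    return checker.has_main, checker.skips
```

A page with <h2> followed by <h4> reports the skip as `(2, 4)`; a page without a <main> element reports `has_main` as `False`. Per WebAIM, nearly half the top million home pages fail one or both.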

Pages with ARIA attributes present averaged 59.1 accessibility errors. Pages without ARIA: 42. More ARIA correlates with more errors because developers use it as a patch over non-semantic markup rather than fixing the underlying structure. Adrian Roselli — when OpenAI recommended adding ARIA tags for Atlas compatibility — identified the core problem: optimizing for AI agents by layering ARIA onto non-semantic markup is precisely backwards.

The industry that cannot ship a semantic <main> tag on half its websites is selling AI optimization services.

The accessibility-to-AI-readiness overlap is real but overstated. A criterion-by-criterion analysis of WCAG 2.2’s 56 Level A and AA success criteria found 29% transfer directly to AI agent comprehension — the criteria that require programmatically determinable structure, heading hierarchy, form labels, name/role/value. Another 20% transfer partially. The widely cited “70% overlap” claim — originating from Guillaume Laforge, Field CTO at Upsun — is an author’s estimate with no study behind it. The actual number is 29% direct, 49% including partial. Still substantial. Still less than half.

The curb cut effect applies: work done for accessibility benefits AI readability. But the honest framing is that accessibility builds the foundation. It does not build the whole house.


The caveat that governs everything

No peer-reviewed study establishes a causal link between any specific technical change and AI citation outcomes.

The 30% citation consistency rate — across platforms, across queries, across consecutive runs — means the measurement surface is too volatile for causal attribution. You cannot run an A/B test when the control group’s results change 70% of the time. Every vendor study claiming “our technique increased AI citations by X%” is confounded by a measurement surface that makes isolated attribution impossible.

This is not a reason to skip the work. It is the reason the work must be structural, not platform-specific.

The Search Atlas study (December 2025) found no correlation between schema markup coverage and citation rates across OpenAI, Gemini, and Perplexity. Visibility distributions were nearly identical across all adoption categories. The study measured schema presence rather than quality — but the null finding reinforces the point: presence alone does nothing. The Growth Marshal data tells you attribute richness is the only variable that moves the needle. Even there, domain authority reduces citation odds by 24% per rank position drop — dwarfing any schema signal.

The two studies that found AI-referred traffic converts better both come from vendors measuring small samples. The most rigorous study — Amsive, the only one applying proper inferential statistics across 54 websites — found no statistically significant difference. The p-value was 0.794. There is no signal there.

The honest answer is that nobody knows which technical change moves the needle for AI citations. The honest recommendation is that it does not matter. Do the structural work because it is the right work. Semantic HTML, server-side rendering, attribute-rich schema, clean heading hierarchy, descriptive link text, proper landmark roles. It has been the right work for twenty years. The people who skipped it are the ones selling you something new to skip it with.

The cost of skipping it is what changed.


The forms are old. The discipline is older than any platform consuming it. Semantic HTML predates Google. Structured data predates AI Overviews. Server-side rendering predates the JavaScript frameworks that made it optional.

The titles being sold — GEO, AEO, LLMO, whatever acronym the industry mints next quarter — are empty. Built on citation volatility so severe that 70% of what you see one minute disappears the next. Built on compliance with standards nobody reads. Built on complexity layered over foundations the industry never bothered to get right.

The ground underneath — the fundamentals, the discipline, the structural work that has held through every intermediary transition in the record — is the only ground worth standing on. It survived the rise of search engines. It survived the zero-click era. It will survive whatever comes next, because it was never optimized for any specific intermediary. It was built for machines that read structure. The machines changed. The structure did not.

If you want the specifics for your situation — which disciplines apply to your stack, your content type, your audience — The Quartermaster has a tool fitted to your hand. This piece has the conviction for why you should care.