Is Your Site Agent-Ready? How to Prepare for the AI Agent Web

We scanned 100,000 of the web's most-visited websites for “agent readiness” - how well sites support AI agents that browse, extract, and act on web content. The findings reveal a massive gap between what agents need and what most websites provide.

The numbers are striking. While basic web standards like robots.txt have broad adoption at 83%, the newer standards that AI agents rely on are almost nonexistent. Only 1.2% of sites support markdown content negotiation. Fewer than 9% have an llms.txt file. The web is not ready for agents - but the agents are already here.

What Does “Agent-Ready” Mean?

An agent-ready website is one that AI systems can efficiently discover, read, understand, and interact with. This goes beyond traditional SEO, which optimizes for search engine crawlers that index pages and rank them. Agent readiness optimizes for AI systems that consume content directly, make decisions based on it, and sometimes take actions on behalf of users.

Think of the difference this way: a search engine crawler reads your page, indexes keywords, and shows a link in results. An AI agent reads your page, understands its meaning, extracts specific facts, and uses those facts to answer a user's question - potentially without the user ever visiting your site. The agent needs your content to be structured, accessible, and machine-readable in ways that go far beyond what traditional SEO requires.

The State of Agent Readiness: Our Research

We scanned the top 100,000 websites on the internet, testing each one for 8 agent readiness standards. The results paint a clear picture of where the web stands today.

Adoption of Agent Standards Across Top 100K Domains

Source: MeasureBoard independent research, scan of 100,000 top websites, April 2026

robots.txt83.3%

Sitemap.xml59.9%

AI Bot Rules41%

Sitemap ref in robots.txt31.9%

JSON-LD Schema21.9%

Link Headers (RFC 8288)13.7%

llms.txt8.3%

Markdown Negotiation1.2%

LLMS: directive<1%

Legacy web standards (robots.txt, sitemaps) are well-adopted. Agent-specific standards like markdown negotiation and LLMS directives remain below 5%.

The gap between legacy web standards and agent-specific standards is enormous. robots.txt has been around since 1994 and sits at 83% adoption. Markdown content negotiation, which was formalized in late 2025, is at just 1.2%. Even llms.txt - a relatively simple text file - is present on fewer than 9% of sites. While 41% of sites have some form of AI bot rules in their robots.txt, the vast majority are blocking crawlers like CCBot and Bytespider rather than explicitly welcoming AI agents. This is not surprising - new standards always lag - but it represents a window of opportunity for sites that adopt early.

The Six Checks That Matter

Based on our analysis, agent readiness breaks down into six concrete, testable checks. MeasureBoard's GEO Readiness scanner now tests all six automatically.

1. robots.txt with AI bot rules

Having a robots.txt file is table stakes. What matters for agent readiness is whether your robots.txt explicitly addresses AI crawlers. The major AI bots - GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Google-Extended, Applebot-Extended - each have their own User-agent string. If your robots.txt only has rules for Googlebot and a wildcard * rule, you have no explicit policy for AI agents.

This matters because some sites block AI crawlers without realizing it (overly restrictive wildcard rules), and others allow them without knowing it (no policy at all). An intentional, explicit policy - even if it is Allow: / for each bot - signals that you have considered AI crawlers and made a deliberate choice.

2. Sitemap with robots.txt reference

A sitemap.xml helps any crawler discover your pages. For agent readiness, the sitemap should also be referenced in your robots.txt via a Sitemap: directive. This is the standard discovery mechanism that both search engines and AI crawlers use.

3. llms.txt and LLMS directive

The llms.txt standard gives AI models a structured overview of your site - what it is, what its key pages are, and how to navigate it. Think of it as a README for AI crawlers. The LLMS: directive in robots.txt (analogous to the Sitemap: directive) tells crawlers where to find it.

Adoption is still below 2% across top domains. Early adopters are disproportionately tech companies and publishers who understand that giving AI models a curated map of their content increases the chance of accurate citation.

4. Link headers (RFC 8288)

RFC 8288 defines Link response headers that point agents to useful resources. For example:

Link: </llms.txt>; rel="service-doc"; type="text/plain"
Link: </sitemap.xml>; rel="sitemap"; type="application/xml"

These headers let agents discover your llms.txt and sitemap without parsing HTML. Adoption sits at roughly 2% - most sites have never considered adding metadata to HTTP response headers beyond caching and security directives.

5. Markdown for Agents

When an AI agent requests a web page, it typically gets the full HTML response - navigation bars, JavaScript bundles, cookie banners, footer links, and the actual content buried somewhere in between. Markdown content negotiation lets agents request a clean, token-efficient version of the page by sending an Accept: text/markdown header.

Token Efficiency: HTML vs Markdown

Typical page before and after markdown content negotiation

~4,200

tokens (HTML response)

Includes nav, scripts, styles, footer

→

↓

~840

tokens (markdown response)

Clean content only

Up to 80% token reduction

Serving markdown instead of HTML can reduce token consumption by 80% or more, significantly reducing costs for AI providers and improving response speed.

The efficiency gains are substantial. Serving markdown instead of full HTML can reduce token consumption by up to 80% - stripping away navigation, scripts, styles, and boilerplate to leave just the content an agent actually needs. Fewer tokens means lower costs for the AI provider calling your site, faster responses for users, and a higher likelihood that your content fits within context windows.

As of early 2026, only 3 of 7 major coding agents (Claude Code, OpenCode, and Cursor) send the Accept: text/markdown header by default. But as the standard gains traction, more agents will adopt it - and sites that already support it will have a structural advantage.

6. Structured data (JSON-LD)

Structured data via JSON-LD schema markup has been a traditional SEO signal for years, but it takes on new importance for AI agents. JSON-LD gives machines a structured, unambiguous representation of your content - product details, article metadata, FAQ answers, organization info. AI models that encounter JSON-LD can extract facts with higher confidence than parsing prose.

At 44% adoption across top domains, structured data is the most-adopted agent standard beyond basic crawl controls. But “having JSON-LD” and “having comprehensive JSON-LD across all pages” are different things. Most sites have it on their homepage and nowhere else.

Agent Readiness Maturity Model

Level 0

Not Agent-Ready

No robots.txt, no sitemap, no structured data. AI agents struggle to crawl and understand the site.

Level 1

Crawl-Ready

robots.txt and sitemap present. AI crawlers can discover pages but must parse raw HTML.

Level 2

Structured

JSON-LD schema, llms.txt, and AI bot rules. Agents can understand content with confidence.

Level 3

Agent-Optimized

Markdown negotiation, Link headers, comprehensive schema. Site is purpose-built for AI agent consumption.

Most sites sit at Level 1. Moving to Level 2 requires a few hours of work. Level 3 is where the competitive advantage lives.

“Across the top 100,000 websites, only 1.2% support markdown content negotiation and fewer than 9% have an llms.txt file. Meanwhile, 83% have robots.txt and 60% have a sitemap - proving the infrastructure foundations exist but agent-specific standards have barely begun to spread. The adoption gap represents one of the largest untapped opportunities in web infrastructure.”
MeasureBoard ResearchAgent Readiness Study, April 2026·MeasureBoard 100K Domain Scan, April 2026

Why This Matters Now

AI agent traffic is growing faster than any previous category of web traffic. ChatGPT, Claude, Perplexity, and dozens of specialized agents are making millions of requests to websites every day - reading documentation, extracting product information, comparing prices, gathering facts for answers. This traffic will only accelerate as AI agents become embedded in browsers, operating systems, and enterprise workflows.

For business websites, the implications are direct:

AI citations drive traffic. When an AI model cites your site in a response, some percentage of users click through. The higher your agent readiness, the more likely AI models are to discover, understand, and cite your content accurately.

Token efficiency affects selection. AI agents have context window limits. If your page consumes 4,000 tokens of raw HTML when a competitor's page delivers the same information in 800 tokens of clean markdown, the agent is more likely to prefer the efficient source.

Early adoption compounds. AI models learn from patterns across the web. Sites that implement agent standards early contribute to training data that reinforces those standards. This creates a feedback loop where early adopters are better represented in future model behavior.

How to Check Your Agent Readiness Score

MeasureBoard's GEO Optimization suite now includes a full Agent Readiness assessment as part of the GEO Readiness Score. The scanner runs all six checks against your live site and scores each one pass/fail.

The Agent Readiness subscore makes up 20% of your overall GEO Readiness Score, reflecting its growing importance alongside traditional factors like content structure, schema coverage, and AI visibility.

You can check your score two ways:

Quick check (no signup): Use our free Agent Readiness Check tool to scan any domain instantly. Enter a URL and get pass/fail results for all 8 checks in seconds.

Ongoing monitoring: Create a free MeasureBoard account to get continuous agent readiness monitoring as part of the GEO Optimization suite, alongside AI rank tracking, search performance, site audits, and AI-powered recommendations.

How to Fix Each Check

robots.txt: Add AI bot rules

Add explicit User-agent directives for AI crawlers. A permissive policy looks like:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
LLMS: https://yoursite.com/llms.txt

For more on this topic, see our guide on configuring robots.txt for AI crawlers.

llms.txt: Generate and deploy

MeasureBoard's GEO suite includes an llms.txt generator that creates one automatically from your crawl data. Download the file and place it at your site root (/llms.txt). Then add the LLMS: directive to your robots.txt.

Link headers: Add to your server config

In Next.js, add Link headers in your middleware. In Nginx, add them to your server block. In Cloudflare, use Transform Rules. The format:

Link: </llms.txt>; rel="service-doc"; type="text/plain"
Link: </sitemap.xml>; rel="sitemap"; type="application/xml"

Markdown negotiation: Serve clean content

When a request arrives with Accept: text/markdown, return a markdown version of the page content instead of full HTML. The simplest implementation: serve your llms.txt as a markdown fallback. More advanced implementations convert each page to markdown on demand.

Set the response Content-Type: text/markdown; charset=utf-8 and optionally include an X-Markdown-Tokens header with the token count so agents know the response size before processing.

The Bottom Line

Agent readiness is where mobile responsiveness was in 2012. Everyone knows it matters. Our scan of the top 100,000 domains shows that only 1.2% support markdown negotiation and virtually none have an LLMS directive in their robots.txt. The standards are straightforward, the implementation effort is modest (a few hours for most sites), and the competitive window is wide open.

The sites that implement agent standards today will be the ones that AI models learn to prefer. By the time adoption reaches 50%, the advantage will be gone. Right now, it is still available.

Check your Agent Readiness score - it is free and takes 5 seconds. No signup required.

Is Your Site Agent-Ready?