The machines have moved in
Cloudflare sees over 200 trillion Internet interactions every single day. The data from Q1 2026 is unambiguous: we are witnessing the most significant platform shift in the history of the web — the transition from a human-centric Open Web to a machine-centric Programmable Web.
The tipping point has arrived. Cloudflare data on HTML page requests by client type shows that AI bots and non-AI automated bots combined now account for more than 50% of all HTML page requests. For the first time in the web's history, machines outnumber people as the primary consumers of Internet content — not in aggregate data pipelines, but at the level of individual page loads.
Within bot traffic, the share attributable to AI crawlers grew 4.3 percentage points quarter-over-quarter, from 17.5% in Q4 2025 to 21.8% in Q1 2026. Meanwhile, traditional search engine crawling — the traffic that has historically returned value to publishers — declined 5.2 percentage points. The machines are not replacing search. They are replacing the deal.
"Bot activity has completely decoupled from human utility. The crawl-to-referral ratio has inverted. We're calling this the Great Divergence."
— Matthew Prince, CEO, Cloudflare

The Great Divergence
The fundamental collapse is not hard to find in the data. The bot category breakdown tells the story of a web that has pivoted away from human utility:
Bot Traffic by Category — Q1 2026 vs Q4 2025
| Category | Q4 2025 | Q1 2026 | Change |
|---|---|---|---|
| Search Engine Crawler | 36.3% | 31.1% | ▼ −5.2pp |
| AI Crawler | 17.5% | 21.8% | ▲ +4.3pp |
| SEO & Analytics | 12.2% | 13.3% | ▲ +1.1pp |
| Advertising & Marketing | 13.0% | 10.6% | ▼ −2.4pp |
| Page Preview | 6.6% | 6.6% | → flat |
| Monitoring & Analytics | 3.6% | 3.7% | ▲ +0.1pp |
| AI Search | 2.2% | 3.2% | ▲ +1.0pp |
| Webhooks | 2.5% | 3.1% | ▲ +0.6pp |
| Aggregator | 2.7% | 2.5% | ▼ −0.2pp |
The Top Bot Operators
Among individual bot operators, Google still commands 35.5% of all verified bot traffic — but its share dropped 6.2 percentage points QoQ. OpenAI is now the #3 bot operator on the entire Internet, at 9.5%, and growing.
AI Services Are Now Internet Infrastructure
One measure of how deeply AI has embedded itself in the Internet: ChatGPT is the #8 most-trafficked service globally, ahead of Amazon Shopping, TikTok, and Netflix. It is the #12 most-visited domain in the world.
Note: rankings are from Cloudflare Radar Internet Services; a lower rank number means higher traffic.
DeepSeek surged from #9–10 in the AI ranking to #4 in the week of February 2, 2026 — immediately following global attention on DeepSeek-R1. A single model release can reshape the entire competitive landscape within days.
The Asymmetry of Extraction
The core dysfunction of the current AI web is not that bots exist — it's that they consume without reciprocating. Among all AI bot crawl activity in Q1 2026:
89.3% of all AI crawler requests are extractive, consuming content to build models that may eventually route around the source entirely. Only 8.1% powers a search product that sends users back to original content.
The Individual Crawler Pecking Order
The individual crawler ranking reveals the precise hierarchy of extraction. These are the nine most active crawlers on the web, ranked by share of all crawler traffic:
| # | Crawler | Operator | % of Crawler Traffic |
|---|---|---|---|
| 1 | Googlebot | Google | 23.7% |
| 2 | GPTBot | OpenAI | 17.0% |
| 3 | ClaudeBot | Anthropic | 14.4% |
| 4 | Meta-ExternalAgent | Meta | 12.6% |
| 5 | BingBot | Microsoft | 7.9% |
| 6 | Amazonbot | Amazon | 5.3% |
| 7 | FacebookExternalHit | Meta | 3.1% |
| 8 | YandexBot | Yandex | 2.6% |
| 9 | PetalBot | Huawei | 2.5% |
GPTBot and ClaudeBot together account for 31.4% of all web crawler traffic — nearly a third of the entire crawler ecosystem — yet the referral products attached to them remain nascent. They consume at the scale of Google; they return traffic at a fraction of it.
Google's Structural Crawl Advantage
Googlebot's dominance in crawler traffic understates its true advantage. Cloudflare data shows Googlebot successfully accessed roughly 1.7× as many unique pages as ClaudeBot, 1.76× as many as GPTBot, 3× as many as Meta-ExternalAgent, and 3.26× as many as Bingbot. The gap widens dramatically for smaller crawlers: Googlebot saw 14.9× more unique pages than Applebot, 167× more than PerplexityBot, and over 700× more than CCBot. Of all sampled unique URLs on Cloudflare's network, Googlebot crawled roughly 8%.
This coverage asymmetry is self-reinforcing. Publishers are effectively unable to block Googlebot because Google's 90%+ share of the search market means blocking it destroys organic traffic. Yet that same Googlebot — used for search indexing — is also the mechanism through which Google populates AI Overviews and AI Mode with live content, returning little if any traffic to the source. Publishers face an impossible choice: allow full extraction, or disappear from search. No other AI company operates with this structural immunity. The UK's Competition and Markets Authority (CMA) designated Google as having Strategic Market Status in search precisely because of this dynamic, and opened a consultation in January 2026 on conduct requirements — including whether Googlebot should be split into separate crawlers for search indexing vs. AI grounding.
The Industry Heatmap: What AI Is Actually Consuming
Cloudflare Radar data on AI bot vertical distribution reveals which industries are most exposed — and what kind of content they should be protecting.
What Each Vertical Should Protect
| Vertical | AI Crawl Share | Primary Content at Risk |
|---|---|---|
| Shopping & Retail | 31.1% | Product descriptions, pricing, imagery, reviews |
| Internet & Telecom | 16.7% | Infrastructure docs, API references, technical content |
| Computer & Electronics | 14.9% | Technical documentation, code, specs |
| News, Media & Publications | 9.2% | Articles, analysis, journalism — highest text quality per request |
| Business & Industry | 5.1% | B2B content, industry reports, market data |
| Travel & Tourism | 4.0% | Listings, reviews, itineraries, pricing |
| Finance | 2.9% | Market data, financial reporting, disclosures |
Shopping and retail is the most-crawled category at 31.1% — not because consumers are using AI to shop (yet), but because product data is rich training corpus for AI models that aim to assist with purchasing decisions. The crawl happens long before any agentic commerce product launches.
News and Media accounts for 9.2% of AI crawl activity — a figure that understates the impact. A retail product description is one sentence. A news article is thousands of words of structured, fact-checked prose. The training value per crawl request is not equal.
- HTML/Text — publishers, news, blogs. Language models want words.
- Product Data & Images — e-commerce. Vision models and shopping agents want catalog data.
- Video — entertainment platforms. ByteSpider (ByteDance) is the 7th-largest AI user agent at 3.4% of AI bot traffic.
- Code — developer platforms. GitHub Copilot ranked #6 in the global AI services category.
From Robots.txt to Honest Bot Detection
Publishers are responding with the only tool most of them have: robots.txt, a plain-text convention dating back to 1994. GPTBot and ClaudeBot are the most restricted crawlers in the robots.txt ecosystem. In Cloudflare's verified bot catalog, 37 of 200 tracked bots are now AI crawlers or AI assistants — a category that barely existed two years ago.
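In practice, the blocking pattern the data describes looks like this: a minimal robots.txt that refuses the two largest AI crawlers while leaving search indexing untouched. The file is illustrative, not a recommendation.

```text
# Refuse the two most active AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Search indexing remains allowed
User-agent: Googlebot
Allow: /

# Everyone else: default allow
User-agent: *
Allow: /
```

Matching is by User-agent token, and the file must be served from the site root at /robots.txt. Every directive here is a request, not an enforcement mechanism — which is exactly the problem the next section describes.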
Why Robots.txt Is Not Enough
First: compliance is voluntary. Our AI crawl timeseries shows that AI bot traffic hit its single-day peak of Q1 2026 on February 9 — even as blocking rates have been rising throughout the quarter. Volume is growing faster than restrictions can contain it.
Second: shadow scrapers exist. Our data shows 0.44% of AI bot crawl activity comes from crawlers declaring no purpose at all — Undeclared bots operating without transparency. Worse are the impersonators: scrapers that deliberately misidentify themselves, wearing a Googlebot mask to claim a permission they have never been granted.
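The Googlebot mask can be pulled off today with forward-confirmed reverse DNS, the verification method Google itself documents: resolve the client IP to a hostname, check that the hostname sits in Google's crawler domains, then resolve that hostname forward and confirm it maps back to the same IP. A minimal sketch in Python — the resolver hooks are injectable purely so the logic can be exercised without network access, and the domain suffixes follow Google's published guidance:

```python
import socket

def is_verified_googlebot(ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be Googlebot.

    reverse(ip) -> hostname and forward(host) -> ip default to the system
    resolver. Returns True only if the PTR record lands in a Google crawler
    domain AND the forward lookup round-trips to the same IP.
    """
    reverse = reverse or (lambda i: socket.gethostbyaddr(i)[0])
    forward = forward or socket.gethostbyname
    try:
        host = reverse(ip)
        # Genuine Googlebot PTR records end in these domains.
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the hostname must resolve back to the caller.
        return forward(host) == ip
    except OSError:
        # Resolution failure means the claim cannot be verified.
        return False
```

The check is per-operator — each crawler operator publishes its own verification domains — so it does not scale to thousands of bots, and it says nothing about a crawler's purpose. That is the gap cryptographic identity is meant to close.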
Third: the past is already gone. Models trained on data crawled before publishers erected their defenses have already consumed that content. The window of prevention is partly closed.
Fourth: blocking is not neutral. Publishers overwhelmingly target AI crawlers — not Googlebot. Among Cloudflare customers using AI Crawl Control between July 2025 and January 2026, the number of websites actively blocking crawlers like GPTBot and ClaudeBot was nearly 7× higher than those blocking Googlebot. The asymmetry is not ignorance — it is economic coercion. Publishers cannot afford to disappear from Google Search, so they absorb the extraction.
The UK's CMA, in its January 2026 consultation on Google's conduct requirements, identified this dynamic explicitly: publishers "have no realistic option but to allow their content to be crawled for Google's general search." The CMA proposed publisher opt-out controls, but Cloudflare — along with major UK publishers including the Daily Mail Group, the Guardian, and the News Media Association — has argued the proposal does not go far enough. The only structurally effective remedy is mandatory crawler separation: requiring Googlebot to split into distinct crawlers for search indexing vs. AI grounding, so publishers can allow one while blocking the other. Unlike other AI operators (OpenAI and Anthropic each operate purpose-specific crawlers), Google uses a single dual-purpose bot — giving it access no competitor can match.
Web Bot Auth: Cryptographic Verification
This is why Cloudflare built Web Bot Auth — a new standard using cryptographic signatures to verify a bot's identity at the protocol level. If a bot claims to be Googlebot, it signs its requests with a key only Google holds. If the signature fails, the claim is false. Web Bot Auth does not rely on bots choosing to be honest; it makes dishonesty technically detectable.
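Under the hood, Web Bot Auth builds on the IETF HTTP Message Signatures framework (RFC 9421): the bot signs each request, and a Signature-Agent header points to the directory where the operator publishes its public keys. A signed request looks roughly like the following — header values are illustrative placeholders, and the exact covered components are defined by the evolving draft, not by this sketch:

```text
GET /article HTTP/1.1
Host: example.com
User-Agent: ExampleCrawler/1.0
Signature-Agent: "https://crawler-directory.example.com"
Signature-Input: sig1=("@authority" "signature-agent");created=1767225600;expires=1767226200;keyid="exampleKeyThumbprint";tag="web-bot-auth"
Signature: sig1=:base64SignatureBytesGoHere==:
```

On receipt, the origin fetches the public key referenced by keyid from the Signature-Agent directory, verifies the signature over the covered components, and treats a missing or failed signature as an unverified claim. Honesty stops being optional; it becomes checkable.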
AI agents — bots that actively browse, fill forms, and execute tasks — currently represent 0.16% of verified bot traffic. Small today, but the fastest-growing subcategory. These agents cannot be managed with robots.txt alone. Cryptographic identity is the necessary infrastructure for the Programmable Web.
AI Search: A New Referral Ecosystem
The AI Search category — crawlers attached to AI-powered search products that do return traffic to original sources — is one of the most strategically important in the ecosystem. It grew 1.0 percentage point QoQ, from 2.2% to 3.2% of bot traffic — in relative terms, the fastest-growing bot subcategory this quarter.
| Operator | Share of AI Search Traffic | Product |
|---|---|---|
| Apple | 58.8% | Applebot → Siri, Spotlight, Apple Intelligence |
| OpenAI | 41.0% | OAI-SearchBot → ChatGPT Search |
| Cloudflare | 0.13% | Cloudflare AI Search |
| Brave | 0.07% | Brave Search AI |
Apple is the dominant AI search crawler, driven by Applebot's role in powering Siri and Apple Intelligence. Apple's crawl activity was highly volatile in Q1 2026 — near zero on some days, spiking by late March — suggesting active product development. A company with 2.35 billion active devices and a first-party AI search product is a fundamentally different referral engine than anything that preceded it.
OAI-SearchBot spiked noticeably during peak AI news periods (Jan 22–24, Feb 9, Mar 10–11), suggesting its crawl intensity tracks closely with ChatGPT usage surges. As AI search products mature and publishers begin demanding referral as a condition of access, this is the category that matters most.
The AI Agent Layer: From Crawlers to Action
The most nascent but strategically significant development is the emergence of AI assistants as a distinct traffic category. These are not crawlers passively indexing content — they are agents actively executing tasks: browsing pages, filling forms, clicking buttons, completing purchases.
| Operator | Share of AI Assistant Traffic |
|---|---|
| OpenAI | 86.0% |
| Cloudflare (Browser Rendering) | 10.7% |
| DuckDuckGo | 2.5% |
| Mistral AI | 0.26% |
| Meta | 0.25% |
| Manus | 0.22% |
| Amazon | 0.03% |
| Devin AI | 0.03% |
The 21 verified AI assistant bots in our catalog include Amazon Bedrock AgentCore Browser (deployed across 9 AWS regions), Amazon's "Buy For Me" agent, Devin AI, CartAI, Apify, and Browserbase.
An agent that can complete a purchase does not need a publisher to survive — it disintermediates the human who would have visited the page, clicked the link, and converted. The "Zero-Click" future is not a search phenomenon alone. It is an agent phenomenon.
Security: Automated Aggression at Scale
The Programmable Web has a kinetic dimension that goes beyond data extraction. Q1 2026 saw a pronounced acceleration in application-layer attack volume, with the quarter's peak occurring on March 22 — the most intense single day of L7 attacks in our records. The final two weeks of March were the most sustained attack cluster of the quarter.
The Mirai Resurgence
At the network layer, Q1 2026 saw a dramatic shift in attack composition directly attributable to botnet evolution:
Mirai-family botnet floods grew from 7.9% to 34.4% of all network-layer attack traffic — a 4.3× increase in a single quarter. This is the clearest data fingerprint of the Aisuru-Kimwolf botnet: a coordinated army of 1–4 million malware-infected Android TV devices that produced the world-record 31.4 Tbps attack in late 2025. The botnet is still active, still expanding.
L7 Attack Composition — Q1 2026
| Mitigation Product | Q4 2025 | Q1 2026 | Change |
|---|---|---|---|
| WAF | 51.1% | 51.4% | ▲ +0.25pp |
| DDoS Protection | 43.5% | 44.3% | ▲ +0.77pp |
| Access Rules | 3.0% | 1.9% | ▼ −1.1pp |
| Bot Management | 0.47% | 0.59% | ▲ +0.11pp |
Q1 2026: Major Internet Disruptions
Our traffic anomaly data logged 100 verified disruption events in Q1 2026. The most significant:
- Critical — Feb 28, 07:00–07:15 UTC: 7+ ISPs offline within 15 minutes
- High — Mar 4–5, 16–17, and 21–22: outages of ~23h each (ETECSA plus country-level)
- High — Mar 15–17: country-level; 5 total disruptions in Q1
- High — Jan 12–26: continuous disruption; Mar 18: country-level outage (9.5h); Mar 20: TCI outage
The AI Inference Economy
Beyond what bots consume, Cloudflare's AI inference data reveals what developers are building on top of AI infrastructure in Q1 2026.
Top Models by Deployment
| # | Model | Share of Inference Accounts |
|---|---|---|
| 1 | Meta Llama 3 8B Instruct | 40.6% |
| 2 | Stable Diffusion XL 1.0 | 13.4% |
| 3 | Whisper (OpenAI) | 8.3% |
| 4 | Meta Llama 4 Scout 17B | 7.0% |
| 5 | M2M-100 1.2B (Translation) | 5.4% |
Text generation dominates at 62.9% of all AI inference, confirming that the primary use of AI infrastructure is language. Meta's Llama 3 is the most widely deployed model on Cloudflare's network at 40.6% of all inference accounts — Meta's open-weight strategy has produced the Internet's most-used inference model. The newly released Llama 4 Scout already accounts for 7.0% of inference accounts despite launching during the quarter — the fastest adoption curve we have observed for any new model.
Restoring the Balance
The data from Q1 2026 tells a coherent story. The web's economic engine — the deal between creators and distributors — is under structural strain. The machines have arrived in force. They are consuming at scale, returning at a trickle, and showing no signs of slowing.
The Great Divergence is not a coming threat. It is the current state.
Cloudflare's position is unchanged: the free lunch era of AI training is over. We are giving publishers the tools to create scarcity — the only leverage that creates value in a world of infinite automated extraction. We are building the infrastructure for the Programmable Web to be governed: cryptographic bot identity via Web Bot Auth, granular access controls, and a marketplace where the savings AI companies realize from efficient access flow back to the creators who made the content worth consuming.
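What "granular access controls" can look like in practice: Cloudflare's rules language exposes the verified-bot category as a matchable field, so a single custom rule expression can gate one bot class without touching another. A sketch, assuming the cf.verified_bot_category field and the category names used in the tables above — verify the exact field and value names against current Cloudflare documentation before deploying:

```text
(cf.verified_bot_category eq "AI Crawler")
```

Paired with a Block or Managed Challenge action, this refuses extractive AI crawlers while verified Search Engine Crawler and AI Search traffic — the categories that return referrals — passes untouched.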
The time to act is now. Gate your content. Set your price. Reclaim your independence before silence is treated as permission.