NeuralCrawl

Cloudflare / robots.txt snapshot

fetched 2026-06-20T14:27:25Z · HTTP 200 · 1097 bytes · sha256 c6e360cc6e931ea3

final URL: https://www.cloudflare.com/robots.txt

# Robots.txt for www.cloudflare.com

User-agent: *
Allow: /

# Sitemap
Sitemap: https://www.cloudflare.com/sitemap.xml

# AI/LLM friendly content
# See https://llmstxt.org for the llms.txt specification
# llms.txt provides curated content for AI assistants and LLMs

# Allow AI crawlers to access markdown versions of pages
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Anthropic-AI
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: CCBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Cohere-ai
Allow: /

# Content Signals — declare AI content usage preferences
# See https://contentsignals.org/ and https://datatracker.ietf.org/doc/draft-romm-aipref-contentsignals/
Content-Signal: ai-train=yes, search=yes, ai-input=yes

# AI-friendly content locations
# - /llms.txt - Curated overview for AI/LLMs (markdown)
# - /llms-full.txt - Full expanded content for larger context windows
# - /*.md - Markdown versions of all pages (append .md to any URL)
# - /.well-known/agents.json - Agent discovery and capabilities
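
The rules above can be checked programmatically. The following is a minimal sketch, not a definitive implementation: it uses Python's standard urllib.robotparser for the User-agent/Allow groups and a small hand-written parser for the Content-Signal line, which robotparser ignores as an unknown directive. The ROBOTS_TXT string is an abbreviated excerpt of the snapshot, and the helper names is_allowed and parse_content_signals, the ExampleBot agent, and the /learning/ path are hypothetical, introduced only for illustration. The Content-Signal parsing assumes the simple comma-separated key=value form shown in the file and does not attempt to cover the full grammar of the IETF draft.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated excerpt of the snapshot (assumption: not the full 1097-byte file).
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://www.cloudflare.com/sitemap.xml

User-agent: GPTBot
Allow: /

User-agent: CCBot
Allow: /

Content-Signal: ai-train=yes, search=yes, ai-input=yes
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def parse_content_signals(robots_txt: str) -> dict:
    """Collect key=value pairs from Content-Signal lines (hypothetical helper).

    Assumes the comma-separated `name=value` form seen in the snapshot.
    """
    signals = {}
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then split the directive name from its value.
        line = raw_line.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, val = pair.partition("=")
            if eq:
                signals[key.strip()] = val.strip()
    return signals


if __name__ == "__main__":
    page = "https://www.cloudflare.com/learning/"  # hypothetical path
    for agent in ("GPTBot", "CCBot", "ExampleBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
    print(parse_content_signals(ROBOTS_TXT))
```

Because every group in the file, including the wildcard group, carries Allow: /, any agent name should resolve to allowed here; the per-crawler groups are explicit declarations rather than exceptions to a stricter default.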