BBC / robots.txt snapshot

fetched 2026-06-20T01:10:30Z · HTTP 200 · 5656 bytes · sha256 7a21cb855b2e58ae

final URL: https://www.bbc.com/robots.txt

# version: 36756f545af9144e59780f65727af4ba98093b3b
# The BBC's Terms of Use: https://www.bbc.co.uk/terms
# - Explain the rules for using our services
# - Tell you what you can do with our content
#
# In short: Please use our site like a human, not a robot.
# That means:
# - No scraping, crawling, or systematic extraction of content
# - No use of BBC content for training or fine-tuning AI models, including large language models (LLMs)
# - No retrieval-augmented generation (RAG), AI-powered search, agentic AI or grounding using BBC content
# - No creating datasets from BBC content
# - No text and data mining (TDM) under Article 4 of the EU Directive on Copyright in the Digital Single Market
# - No using BBC content to create summaries for your own use
# - No business use without permission (details: https://www.bbc.co.uk/usingthebbc/terms/can-i-use-bbc-content-for-my-business/)
# - The BBC reserves all rights in its content and expressly opts out of any statutory exceptions in any jurisdiction for text and data mining, as permitted by law

# TL;DR: Browse, read, watch, enjoy - like a human.
#

# HTTPS www.bbc.com

User-agent: *
Sitemap: https://www.bbc.com/sitemaps/https-index-com-archive.xml
Sitemap: https://www.bbc.com/sitemaps/https-index-com-news.xml
Sitemap: https://www.bbc.com/sitemaps/https-index-com-archive_video.xml
Sitemap: https://www.bbc.com/sitemaps/https-index-com-video.xml
Sitemap: https://www.bbc.com/sitemaps/sitemap-com-ws-topics.xml
Sitemap: https://www.bbc.com/sport/sitemap.xml
Sitemap: https://www.bbc.com/sitemaps/sitemap-com-ws-topics.xml
Sitemap: https://www.bbc.com/afrique/sitemap.xml
Sitemap: https://www.bbc.com/arabic/sitemap.xml
Sitemap: https://www.bbc.com/bengali/sitemap.xml
Sitemap: https://www.bbc.com/burmese/sitemap.xml
Sitemap: https://www.bbc.com/gahuza/sitemap.xml
Sitemap: https://www.bbc.com/hausa/sitemap.xml
Sitemap: https://www.bbc.com/hindi/sitemap.xml
Sitemap: https://www.bbc.com/indonesia/sitemap.xml
Sitemap: https://www.bbc.com/mundo/sitemap.xml
Sitemap: https://www.bbc.com/pashto/sitemap.xml
Sitemap: https://www.bbc.com/persian/sitemap.xml
Sitemap: https://www.bbc.com/portuguese/sitemap.xml
Sitemap: https://www.bbc.com/russian/sitemap.xml
Sitemap: https://www.bbc.com/swahili/sitemap.xml
Sitemap: https://www.bbc.com/tajik/sitemap.xml
Sitemap: https://www.bbc.com/turkce/sitemap.xml
Sitemap: https://www.bbc.com/ukchina/simp/sitemap.xml
Sitemap: https://www.bbc.com/ukrainian/sitemap.xml
Sitemap: https://www.bbc.com/urdu/sitemap.xml
Sitemap: https://www.bbc.com/uzbek/sitemap.xml
Sitemap: https://www.bbc.com/vietnamese/sitemap.xml
Sitemap: https://www.bbc.com/zhongwen/simp/sitemap.xml
Sitemap: https://www.bbc.com/zhongwen/trad/sitemap.xml
Sitemap: https://www.bbc.com/bbcx/index_sitemap.xml
Sitemap: https://www.bbc.com/bbcx/audio_archive_sitemap.xml
Sitemap: https://www.bbc.com/bbcx/video_documentaries_sitemap.xml
Sitemap: https://www.bbc.com/bbcx/content_index_sitemap.xml

Disallow: /asset/
Disallow: /backstage/bbc-login-help/
Disallow: /backstage/bbc-login-help$
Disallow: /bitesize/search$
Disallow: /bitesize/search/
Disallow: /bitesize/search?
Disallow: /cbbc/search/
Disallow: /cbbc/search$
Disallow: /cbbc/search?
Disallow: /cbeebies/search/
Disallow: /cbeebies/search$
Disallow: /cbeebies/search?
Disallow: /chwilio/
Disallow: /chwilio$
Disallow: /chwilio?
Disallow: /education/blocks$
Disallow: /education/blocks/
Disallow: /newsround
Disallow: /search/
Disallow: /search$
Disallow: /search?
Disallow: /food/favourites
Disallow: /food/search*?*
Disallow: /food/recipes/search*?*
Disallow: /education/my$
Disallow: /education/my/
Disallow: /bitesize/my$
Disallow: /bitesize/my/
Disallow: /food/recipes/*/shopping-list
Disallow: /food/menus/*/shopping-list
Disallow: /news/0
Disallow: /sport/alpha/
Disallow: /ugc$
Disallow: /ugc/
Disallow: /ugcsupport$
Disallow: /ugcsupport/
Disallow: /userinfo/
Disallow: /userinfo
Disallow: /u5llnop$
Disallow: /u5llnop/
Disallow: /sounds/search$
Disallow: /sounds/search/
Disallow: /sounds/search?
Disallow: /ws/includes
Disallow: /radio/imda
Disallow: /storyworks/preview/*
Disallow: /rd/search$
Disallow: /rd/search/
Disallow: /rd/search?

User-agent: Amazonbot
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: CCBot
Disallow: /

User-Agent: omgili
Disallow: /

User-Agent: omgilibot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-Extended
Disallow: /

User-Agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: Google-CloudVertexBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: YandexAdditional
Disallow: /

User-agent: YandexAdditionalBot
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: Brightbot
Disallow: /

User-agent: ApifyBot
Disallow: /

User-agent: ApifyWebsiteContentCrawler
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: Diffbot-User
Disallow: /

User-agent: ExaBot
Disallow: /

User-agent: TavilyBot
Disallow: /

User-agent: ShapBot
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: FirecrawlAgent
Disallow: /

User-agent: Amzn-SearchBot
Disallow: /

User-agent: Amzn-User
Disallow: /

User-agent: ProRataInc
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: AhrefsBot
Disallow: /
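
A minimal sketch of how a crawler might evaluate rules like the ones above, using Python's standard-library urllib.robotparser (Python 3.8 or later is assumed for site_maps()). The EXCERPT string, the build_parser helper, and the "ExampleAgent" name are hypothetical and exist only for illustration. The excerpt deliberately includes only plain-prefix rules: urllib.robotparser does simple prefix matching and treats "*" and "$" inside paths as literal characters, so it would not faithfully evaluate lines such as "Disallow: /search$" or "Disallow: /food/search*?*" from the full file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt of the snapshot: plain-prefix rules only, because
# urllib.robotparser does not interpret '*' or '$' inside rule paths.
EXCERPT = """\
User-agent: *
Sitemap: https://www.bbc.com/sport/sitemap.xml
Disallow: /search/
Disallow: /userinfo

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(EXCERPT)
    checks = [
        ("GPTBot", "https://www.bbc.com/news"),
        ("ExampleAgent", "https://www.bbc.com/news"),
        ("ExampleAgent", "https://www.bbc.com/search/abc"),
    ]
    for agent, url in checks:
        # A named group (e.g. GPTBot) takes precedence over the '*' group.
        print(agent, url, parser.can_fetch(agent, url))
    print(parser.site_maps())
```

An agent with its own group is judged only by that group, so GPTBot is refused everywhere, while an agent that is not named falls back to the "User-agent: *" rules. A crawler that needs the "*" and "$" pattern semantics used in the full file would need a matcher that implements them.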