NeuralCrawl

The Times / robots.txt snapshot

fetched 2026-06-20T01:10:30Z · HTTP 200 · 1700 bytes · sha256 9d44c43a9e47fb7a

final URL: https://www.thetimes.co.uk/robots.txt

# This is the robots.txt file for thetimes.co.uk
# The Times does not permit the unlicensed use of our content for large language models. Contact [email protected] for assistance.


User-agent: *
Disallow: /


User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Googlebot-News
User-agent: Google-InspectionTool
User-agent: GoogleOther
User-agent: APIs-Google
User-agent: Mediapartners-Google
User-agent: DuckDuckBot
User-agent: Slurp
User-agent: Twitterbot
User-agent: Bingbot
User-agent: adidxbot
User-agent: MicrosoftPreview
User-agent: Applebot
User-agent: AdsBot-Google
User-agent: AmazonAdBot
User-agent: facebookexternalhit
User-agent: Yahoo Ad Monitoring
User-agent: parse.ly scraper
User-agent: Screaming Frog SEO Spider
User-agent: SemrushBot
User-agent: Botify
User-agent: OnCrawl
User-agent: Chrome-Lighthouse
User-agent: Slackbot-LinkExpanding
User-agent: Slack-ImgProxy
User-agent: Slackbot
User-agent: Snap URL Preview Service
User-agent: proximic
User-agent: GumGum-Bot
User-agent: TTD-Content
User-agent: Pubmatic
User-agent: ias_crawler
User-agent: outbrain
User-agent: Leikibot
User-agent: weborama-fetcher
User-agent: AHC/2.1
User-agent: SirdataBot
User-agent: peer39_crawler/1.0
User-agent: EchoboxBot/1.0
User-agent: CriteoBot/0.1
User-agent: AdDefend Site Context Crawler/1.0
User-agent: SmartologyBot
User-agent: Dianomi
User-agent: ChatGPT-User
User-agent: GPTBot
User-agent: OAI-SearchBot


Allow: /

Disallow: /api/
Disallow: /archive/page/
Disallow: /archive/article/
Disallow: /archive/find/
Disallow: /tto/archive/article/
Disallow: /tto/archive/find/
Disallow: /tto/archive/frame/article/
Disallow: /tto/archive/page/
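
The file above blocks every crawler by default, then lists a large group of user agents that share one rule set: everything allowed except the API and archive paths. The sketch below shows one way such a file might be evaluated. It is not the logic any particular crawler uses. It assumes blank lines do not end a group (some parsers treat them as separators, which would change the result), that the longest matching path prefix wins, and that the caller passes the bare user-agent token. The names `parse_groups` and `is_allowed` are hypothetical, and `ROBOTS_TXT` is an abbreviated excerpt of the snapshot, not the full file.

```python
from typing import List, Tuple

# Abbreviated excerpt of the snapshot (most user-agent lines omitted).
ROBOTS_TXT = """\
# This is the robots.txt file for thetimes.co.uk
User-agent: *
Disallow: /


User-agent: Googlebot
User-agent: GPTBot


Allow: /

Disallow: /api/
Disallow: /archive/article/
"""

Rule = Tuple[str, str]                    # (directive, path prefix)
Group = Tuple[List[str], List[Rule]]      # (user-agent tokens, shared rules)


def parse_groups(text: str) -> List[Group]:
    """Split robots.txt text into groups of user agents and their rules."""
    groups: List[Group] = []
    in_rules = True  # forces the first user-agent line to open a group
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue  # assumption: blank lines do not terminate a group
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                groups.append(([], []))
                in_rules = False
            groups[-1][0].append(value.lower())
        elif key in ("allow", "disallow") and groups:
            in_rules = True
            groups[-1][1].append((key, value))
    return groups


def is_allowed(groups: List[Group], user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path (prefix matching only)."""
    ua = user_agent.lower()
    matched = [rules for agents, rules in groups if ua in agents]
    if not matched:  # fall back to the wildcard group
        matched = [rules for agents, rules in groups if "*" in agents]
    best_len, allowed = -1, True  # no matching rule means allowed
    for rules in matched:
        for directive, prefix in rules:
            if not prefix or not path.startswith(prefix):
                continue
            longer = len(prefix) > best_len
            tie_allow = len(prefix) == best_len and directive == "allow"
            if longer or tie_allow:  # longest prefix wins; Allow wins ties
                best_len, allowed = len(prefix), directive == "allow"
    return allowed


if __name__ == "__main__":
    groups = parse_groups(ROBOTS_TXT)
    print(is_allowed(groups, "GPTBot", "/news/story"))   # True
    print(is_allowed(groups, "GPTBot", "/api/v1"))       # False
    print(is_allowed(groups, "UnlistedBot", "/"))        # False
```

Wildcard characters (`*`, `$`) inside paths are deliberately left out, since none of the rules in this snapshot use them.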