NeuralCrawl

CBS News / robots.txt snapshot

fetched 2026-06-20T11:49:13Z · HTTP 200 · 950 bytes · sha256 3183f29f832838e6

final URL: manual:file

# www.robotstxt.org/
# www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449

User-agent: *

Disallow: */search/*
Disallow: /index.php/*
Disallow: /election-results-data/*
Disallow: /common
Disallow: /htdocs/*
Disallow: /user/login*
Disallow: /app/?*
Disallow: /network
Disallow: /assets
Disallow: /stories
Disallow: /sections
Disallow: */feed/

# PER CBS-N ENG FINAL ROUTES DOC

Disallow: /1318
Disallow: /1319
Disallow: /1328
Disallow: /1344
Disallow: /1770
Disallow: /2240
Disallow: /2300
Disallow: /2303
Disallow: /2304
Disallow: /2305
Disallow: /2306
Disallow: /2307
Disallow: /2308
Disallow: /2332
Disallow: /2741
Disallow: /2994
Disallow: /8301
Disallow: /8324
Disallow: /8601
Disallow: /8614
Disallow: /8618
Disallow: /9742
Disallow: /9744

Sitemap: https://www.cbsnews.com/xml-sitemap/index.xml

User-agent: GPTBot
Disallow: /
User-agent: MAZBot
Disallow: /
User-agent: panscient.com
Disallow: /
User-agent: proximic
Disallow:
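
The rules above can be read as groups: a crawler with its own `User-agent` group follows only that group, and every other crawler falls back to the `*` group. The following is a minimal Python sketch of that reading, not the matcher any particular crawler uses. It assumes `*` in a path matches any run of characters and that an empty `Disallow:` blocks nothing. The names `ROBOTS_TXT`, `parse_groups`, `pattern_to_regex`, `is_allowed`, and the crawler name `SomeBot` are hypothetical, and only a few of the rules are reproduced.

```python
import re

# Abbreviated copy of the snapshot (assumption: a handful of lines is enough
# to illustrate the grouping and the wildcard patterns).
ROBOTS_TXT = """\
User-agent: *

Disallow: */search/*
Disallow: /user/login*
Disallow: /assets
Disallow: /1318

User-agent: GPTBot
Disallow: /
User-agent: proximic
Disallow:
"""


def parse_groups(text):
    """Map each lowercased user-agent name to its list of Disallow patterns."""
    groups = {}
    agents = []       # user-agents named by the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue  # in this sketch, blank lines do not end a group
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow blocks nothing
                for agent in agents:
                    groups[agent].append(value)
    return groups


def pattern_to_regex(pattern):
    """Translate a path pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(groups, user_agent, path):
    """Return True when no Disallow pattern in the applicable group matches path."""
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    return not any(pattern_to_regex(p).match(path) for p in rules)


groups = parse_groups(ROBOTS_TXT)
print(is_allowed(groups, "GPTBot", "/news/"))
print(is_allowed(groups, "proximic", "/assets/logo.png"))
print(is_allowed(groups, "SomeBot", "/news/search/results"))
```

The sketch is deliberately simplified: it looks up groups by exact name rather than by product-token matching, and it skips `Allow` lines and longest-match precedence, which the snapshot does not need because it contains no `Allow` rules.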