NeuralCrawl

Globo / robots.txt snapshot

fetched 2026-06-26T13:23:15Z (3h ago) · HTTP 200 · 408 bytes · sha256 51949b387de79e71

final URL: https://www.globo.com/robots.txt

#
# robots.txt
#

User-Agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /historico-home/
Disallow: *globo-cdn-src/*
Disallow: /alt-a/
Disallow: /alt-b/
Disallow: /alt-c/
Disallow: /alt-d/
Disallow: /recomendado/
Disallow: /explore/
Sitemap: http://www.globo.com/sitemap-image.xml


######

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

######

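To see how the groups in this snapshot apply to a given crawler, the rules can be loaded into Python's standard-library `urllib.robotparser`. This is a minimal sketch, not part of the snapshot: the helper names `build_parser` and `is_allowed` and the agent name `ExampleBot` are hypothetical, and the rules are pasted in as a string rather than fetched. One caveat: the standard-library parser matches paths by prefix and, to my knowledge, does not interpret `*` wildcards inside a path, so the `*globo-cdn-src/*` rule is unlikely to be evaluated the way a wildcard-aware crawler would evaluate it.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the snapshot above (comment and separator lines omitted).
ROBOTS_TXT = """\
User-Agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /historico-home/
Disallow: *globo-cdn-src/*
Disallow: /alt-a/
Disallow: /alt-b/
Disallow: /alt-c/
Disallow: /alt-d/
Disallow: /recomendado/
Disallow: /explore/
Sitemap: http://www.globo.com/sitemap-image.xml

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt text without a network fetch."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str,
               base: str = "https://www.globo.com") -> bool:
    """Hypothetical helper: may `agent` fetch `path` under these rules?"""
    return parser.can_fetch(agent, base + path)


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # "ExampleBot" is a made-up agent that falls through to the "*" group.
    for agent in ("ExampleBot", "CCBot", "GPTBot", "Google-Extended"):
        for path in ("/", "/busca/resultado"):
            print(f"{agent:16} {path:18} allowed={is_allowed(parser, agent, path)}")
```

Under these rules the three named crawlers (CCBot, GPTBot, Google-Extended) are blocked from the whole site, while any other agent falls back to the `*` group and is blocked only from the listed path prefixes.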