Globo / robots.txt snapshot
fetched 2026-06-26T13:23:15Z (3h ago) · HTTP 200 · 408 bytes · sha256 51949b387de79e71
final URL: https://www.globo.com/robots.txt
#
# robots.txt
#

User-Agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /historico-home/
Disallow: *globo-cdn-src/*
Disallow: /alt-a/
Disallow: /alt-b/
Disallow: /alt-c/
Disallow: /alt-d/
Disallow: /recomendado/
Disallow: /explore/
Sitemap: http://www.globo.com/sitemap-image.xml


######

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

######
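The rules above can be checked programmatically. The following is a minimal sketch using Python's standard-library `urllib.robotparser`, with the directives copied from this snapshot. The agent name `ExampleBot` and the path `/busca/noticias` are hypothetical and used only for illustration. The standard-library parser does plain prefix matching and does not expand `*` wildcards inside paths, so the `*globo-cdn-src/*` rule is probably not applied the way a wildcard-aware crawler would apply it.

```python
from urllib.robotparser import RobotFileParser

# Directives copied from the snapshot (comment lines omitted).
ROBOTS_TXT = """\
User-Agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /historico-home/
Disallow: *globo-cdn-src/*
Disallow: /alt-a/
Disallow: /alt-b/
Disallow: /alt-c/
Disallow: /alt-d/
Disallow: /recomendado/
Disallow: /explore/
Sitemap: http://www.globo.com/sitemap-image.xml

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

BASE = "https://www.globo.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler that falls under the "*" group.
for agent, path in [
    ("ExampleBot", "/"),
    ("ExampleBot", "/busca/noticias"),
    ("GPTBot", "/"),
    ("CCBot", "/"),
    ("Google-Extended", "/"),
]:
    allowed = parser.can_fetch(agent, BASE + path)
    print(f"{agent:16} {path:18} allowed={allowed}")
```

Under these assumptions, a generic crawler may fetch the home page but not paths under `/busca/`, while CCBot, GPTBot, and Google-Extended are blocked from the whole site.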