NeuralCrawl

California Institute of Technology / robots.txt snapshot

fetched 2026-06-26T14:15:22Z (4h ago) · HTTP 200 · 266 bytes · sha256 3f591e8c02a0b220

final URL: https://www.caltech.edu/robots.txt

User-agent: SemrushBot
Disallow: /

User-agent: BLP_bbot
Disallow: /

User-agent: *
Disallow: /campus-life-events/calendar/minicalendar/*
Disallow: /map/landmark_ajax/*
Disallow: /map/milestone/*
Crawl-delay: 10
Allow: *
Sitemap: https://www.caltech.edu/sitemap.xml
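
The snapshot above can be checked programmatically. The following is a minimal sketch, not part of the snapshot itself, that feeds the same rules to Python's standard-library `urllib.robotparser`. The agent name `ExampleBot` is a hypothetical crawler used only for illustration, and `site_maps()` assumes Python 3.8 or later. Note that the standard-library parser matches paths by prefix and does not expand `*` wildcards inside paths, so the sketch only queries rules that do not depend on wildcard handling.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as captured in the snapshot (viewer line numbers removed).
ROBOTS_TXT = """\
User-agent: SemrushBot
Disallow: /

User-agent: BLP_bbot
Disallow: /

User-agent: *
Disallow: /campus-life-events/calendar/minicalendar/*
Disallow: /map/landmark_ajax/*
Disallow: /map/milestone/*
Crawl-delay: 10
Allow: *
Sitemap: https://www.caltech.edu/sitemap.xml
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network fetch is needed.
parser.parse(ROBOTS_TXT.splitlines())

# The two named bots are disallowed from the whole site.
semrush_allowed = parser.can_fetch("SemrushBot", "https://www.caltech.edu/")
blp_allowed = parser.can_fetch("BLP_bbot", "https://www.caltech.edu/")

# "ExampleBot" is a hypothetical agent; it falls through to the "*" group.
delay = parser.crawl_delay("ExampleBot")

# Sitemap lines are collected independently of any user-agent group.
sitemaps = parser.site_maps()

print(semrush_allowed, blp_allowed, delay, sitemaps)
```

A crawler that honors `Crawl-delay` would wait `delay` seconds between requests to this host. For the three wildcard `Disallow` paths, a matcher that follows RFC 9309 is a safer choice than the prefix matching used here.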