🇨🇦 McGill University
mcgill.ca · Universities · rank #39 · University
AI crawler access (latest snapshot, 26 min ago)
Legend: ⛔ blocked · restricted · ✅ allowed · faded = inherited from the * wildcard group
GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
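The legend notes that a faded entry is inherited from the `*` wildcard group rather than from a group naming the crawler. A minimal Python sketch of that distinction follows. The function names `parse_groups` and `group_source` are hypothetical, and the parser makes simplifying assumptions: user-agent tokens are compared case-insensitively as whole strings, and a `Crawl-delay` line is treated as closing a run of `User-agent` lines, which not every real parser does.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercase user-agent token to its (directive, value) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []    # agents named by the group being read
    reading_agents = False     # True while consecutive User-agent lines are seen
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                current = []   # a new group starts here
            reading_agents = True
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow", "crawl-delay"):
            # Assumption: Crawl-delay ends the User-agent run like a rule does.
            reading_agents = False
            for agent in current:
                groups[agent].append((field, value))
    return groups


def group_source(groups: dict[str, list[tuple[str, str]]], bot: str) -> str:
    """Say whether a bot has its own group or falls back to the wildcard."""
    if bot.lower() in groups:
        return "explicit"
    if "*" in groups:
        return "inherited from *"
    return "no matching group"


SAMPLE = """\
User-agent: *
Crawl-delay: 5

User-agent: archive.org_bot
Allow: /study/*
Disallow: */admin/
"""

sample_groups = parse_groups(SAMPLE)
```

With this sample, `group_source(sample_groups, "GPTBot")` reports the wildcard fallback because no group names GPTBot, while `archive.org_bot` is reported as explicit.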
Current robots.txt · 3386 bytes · sha256 34318cefd235
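The size and hash on the line above can be reproduced for any fetched copy of the file. The sketch below assumes the displayed hash is the first 12 hex characters of the SHA-256 digest of the raw response body; that truncation length is inferred from the value shown, and `fingerprint` is a hypothetical helper name.

```python
import hashlib


def fingerprint(body: bytes, prefix_len: int = 12) -> tuple[int, str]:
    """Return the byte length and a truncated SHA-256 hex digest of a body."""
    digest = hashlib.sha256(body).hexdigest()
    return len(body), digest[:prefix_len]


def has_changed(previous: bytes, current: bytes) -> bool:
    """Compare two snapshots by full digest rather than by truncated prefix."""
    return hashlib.sha256(previous).digest() != hashlib.sha256(current).digest()


empty_size, empty_hash = fingerprint(b"")
```

Comparing full digests in `has_changed` avoids treating two different files as identical when their short prefixes happen to collide.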
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used: http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html

Sitemap: https://www.mcgill.ca/root/sitemap-index.xml
Sitemap: https://www.mcgill.ca/sitemap.xml

User-agent: Lucidworks-Anda/2.0
Crawl-delay: 0

User-agent: Elastic-Crawler
Crawl-delay: 0

User-agent: *
Crawl-delay: 5

User-agent: archive.org_bot
Allow: /study/*

# CSS, JS, Images
Allow: */misc/*.css$
Allow: */misc/*.css?
Allow: */misc/*.js$
Allow: */misc/*.js?
Allow: */misc/*.gif
Allow: */misc/*.jpg
Allow: */misc/*.jpeg
Allow: */misc/*.png
Allow: */modules/*.css$
Allow: */modules/*.css?
Allow: */modules/*.js$
Allow: */modules/*.js?
Allow: */modules/*.gif
Allow: */modules/*.jpg
Allow: */modules/*.jpeg
Allow: */modules/*.png
Allow: */profiles/*.css$
Allow: */profiles/*.css?
Allow: */profiles/*.js$
Allow: */profiles/*.js?
Allow: */profiles/*.gif
Allow: */profiles/*.jpg
Allow: */profiles/*.jpeg
Allow: */profiles/*.png
Allow: */themes/*.css$
Allow: */themes/*.css?
Allow: */themes/*.js$
Allow: */themes/*.js?
Allow: */themes/*.gif
Allow: */themes/*.jpg
Allow: */themes/*.jpeg
Allow: */themes/*.png

# Directories
Disallow: */includes/
Disallow: */modules/
Disallow: */profiles/
Disallow: */scripts/
Disallow: */themes/

# eCals
Disallow: /study/*

# Files
Disallow: */CHANGELOG.txt
Disallow: */cron.php
Disallow: */INSTALL.mysql.txt
Disallow: */INSTALL.pgsql.txt
Disallow: */INSTALL.sqlite.txt
Disallow: */install.php
Disallow: */INSTALL.txt
Disallow: */LICENSE.txt
Disallow: */MAINTAINERS.txt
Disallow: */update.php
Disallow: */UPGRADE.txt
Disallow: */xmlrpc.php
Disallow: */misc/favicon.ico

# Paths (clean URLs)
Disallow: */admin/
Disallow: */comment/reply/
Disallow: */filter/tips/
Disallow: */node/add/
Disallow: */search/
Disallow: /*/people/*
Disallow: /*/events/*
Disallow: /undergraduate-admissions/programs?*
Disallow: /gradapplicants/programs?*
Disallow: */user/register/
Disallow: */user/password/
Disallow: */user/login/
Disallow: */user/logout/
Disallow: */user

# Paths (no clean URLs)
Disallow: */?q=admin/
Disallow: */?q=comment/reply/
Disallow: */?q=filter/tips/
Disallow: */?q=node/add/
Disallow: */?q=search/
Disallow: /?q=study/*/courses/search
Disallow: /?q=study/*/programs/search
Disallow: /?q=study/*/search/all
Disallow: */?q=user/password/
Disallow: */?q=user/register/
Disallow: */?q=user/login/
Disallow: */?q=user/logout/
Disallow: /*.zip$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.jpeg$
Disallow: /*.png$
Disallow: /*.tif$
Disallow: /*.tiff$
Disallow: /*.dll$
Disallow: /*.exe$
Disallow: /*.class$
Disallow: /*.wmv$
Disallow: /*.m4v$
Disallow: /*.jar$
Disallow: /*.gz$
Disallow: /*.tar$
Disallow: /*.css$
Disallow: /*.inc$
Disallow: /*.js$
Disallow: /*.js.php$
Disallow: /*.swf$
Disallow: /*.fla$
Disallow: /*.psd$
Disallow: /*.m4a$
Disallow: /*.m4p$
Disallow: /*.aac$
Disallow: /*.m2a$
Disallow: /*.m2v$
Disallow: /*.sit$
Disallow: /*.dmg$
Disallow: /*.wma$
Disallow: /*.mdb$
Disallow: /*.tar.gz2$
Disallow: /*.rar$
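Several rules in this file overlap, for example `Allow: */misc/*.css$` and `Disallow: /*.css$`. A rough Python sketch of how such conflicts are commonly resolved is given below: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and `Allow` wins a tie. The helper names are hypothetical, and the sketch skips percent-encoding normalization and the choice of which user-agent group applies.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored-at-start regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Apply longest-match precedence; paths with no matching rule are allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), is_allow
    return allowed


# A few rules copied from the file above, as (directive, pattern) pairs.
RULES = [
    ("allow", "*/misc/*.css$"),
    ("disallow", "/*.css$"),
    ("disallow", "*/admin/"),
]
```

Under these assumptions, `/misc/drupal.css` is allowed because the `Allow` pattern is longer than the competing `Disallow`, while a stylesheet outside `/misc/` matches only `Disallow: /*.css$` and is blocked.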
Change history
- initial snapshot: First snapshot of robots.txt archived