NeuralCrawl

McGill University / robots.txt snapshot

fetched 2026-06-26T16:59:04Z · HTTP 200 · 3386 bytes · sha256 34318cefd2355fc2

final URL: https://www.mcgill.ca/robots.txt

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used: http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html

Sitemap: https://www.mcgill.ca/root/sitemap-index.xml
Sitemap: https://www.mcgill.ca/sitemap.xml

User-agent: Lucidworks-Anda/2.0
Crawl-delay: 0

User-agent: Elastic-Crawler
Crawl-delay: 0

User-agent: *
Crawl-delay: 5

User-agent: archive.org_bot
Allow: /study/*

# CSS, JS, Images
Allow: */misc/*.css$
Allow: */misc/*.css?
Allow: */misc/*.js$
Allow: */misc/*.js?
Allow: */misc/*.gif
Allow: */misc/*.jpg
Allow: */misc/*.jpeg
Allow: */misc/*.png
Allow: */modules/*.css$
Allow: */modules/*.css?
Allow: */modules/*.js$
Allow: */modules/*.js?
Allow: */modules/*.gif
Allow: */modules/*.jpg
Allow: */modules/*.jpeg
Allow: */modules/*.png
Allow: */profiles/*.css$
Allow: */profiles/*.css?
Allow: */profiles/*.js$
Allow: */profiles/*.js?
Allow: */profiles/*.gif
Allow: */profiles/*.jpg
Allow: */profiles/*.jpeg
Allow: */profiles/*.png
Allow: */themes/*.css$
Allow: */themes/*.css?
Allow: */themes/*.js$
Allow: */themes/*.js?
Allow: */themes/*.gif
Allow: */themes/*.jpg
Allow: */themes/*.jpeg
Allow: */themes/*.png
# Directories
Disallow: */includes/
Disallow: */modules/
Disallow: */profiles/
Disallow: */scripts/
Disallow: */themes/
# eCals
Disallow: /study/*
# Files
Disallow: */CHANGELOG.txt
Disallow: */cron.php
Disallow: */INSTALL.mysql.txt
Disallow: */INSTALL.pgsql.txt
Disallow: */INSTALL.sqlite.txt
Disallow: */install.php
Disallow: */INSTALL.txt
Disallow: */LICENSE.txt
Disallow: */MAINTAINERS.txt
Disallow: */update.php
Disallow: */UPGRADE.txt
Disallow: */xmlrpc.php
Disallow: */misc/favicon.ico
# Paths (clean URLs)
Disallow: */admin/
Disallow: */comment/reply/
Disallow: */filter/tips/
Disallow: */node/add/
Disallow: */search/
Disallow: /*/people/*
Disallow: /*/events/*
Disallow: /undergraduate-admissions/programs?*
Disallow: /gradapplicants/programs?*
Disallow: */user/register/
Disallow: */user/password/
Disallow: */user/login/
Disallow: */user/logout/
Disallow: */user
# Paths (no clean URLs)
Disallow: */?q=admin/
Disallow: */?q=comment/reply/
Disallow: */?q=filter/tips/
Disallow: */?q=node/add/
Disallow: */?q=search/
Disallow: /?q=study/*/courses/search
Disallow: /?q=study/*/programs/search
Disallow: /?q=study/*/search/all
Disallow: */?q=user/password/
Disallow: */?q=user/register/
Disallow: */?q=user/login/
Disallow: */?q=user/logout/
Disallow: /*.zip$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.jpeg$
Disallow: /*.png$
Disallow: /*.tif$
Disallow: /*.tiff$
Disallow: /*.dll$
Disallow: /*.exe$
Disallow: /*.class$
Disallow: /*.wmv$
Disallow: /*.m4v$
Disallow: /*.jar$
Disallow: /*.gz$
Disallow: /*.tar$
Disallow: /*.css$
Disallow: /*.inc$
Disallow: /*.js$
Disallow: /*.js.php$
Disallow: /*.swf$
Disallow: /*.fla$
Disallow: /*.psd$
Disallow: /*.m4a$
Disallow: /*.m4p$
Disallow: /*.aac$
Disallow: /*.m2a$
Disallow: /*.m2v$
Disallow: /*.sit$
Disallow: /*.dmg$
Disallow: /*.wma$
Disallow: /*.mdb$
Disallow: /*.tar.gz2$
Disallow: /*.rar$
146Disallow: /*.rar$