NeuralCrawl

Costa Rica (Presidencia) / robots.txt snapshot

fetched 2026-06-20T01:10:30Z · HTTP 200 · 3743 bytes · sha256 c48294001a13c338

final URL: https://presidencia.go.cr/robots.txt

# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a) If a Content-Signal = yes, you may collect content for the corresponding
# use.
# (b) If a Content-Signal = no, you may not collect content for the
# corresponding use.
# (c) If the website operator does not include a Content-Signal for a
# corresponding use, the website operator neither grants nor restricts
# permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search: building a search index and providing search results (e.g., returning
# hyperlinks and short excerpts from your website's contents). Search does not
# include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
# augmented generation, grounding, or other real-time taking of content for
# generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used: http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html

User-agent: *
# CSS, JS, Images
Allow: /core/*.css$
Allow: /core/*.css?
Allow: /core/*.js$
Allow: /core/*.js?
Allow: /core/*.gif
Allow: /core/*.jpg
Allow: /core/*.jpeg
Allow: /core/*.png
Allow: /core/*.svg
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /profiles/*.svg
# Directories
Disallow: /core/
Disallow: /profiles/
# Files
Disallow: /README.md
Disallow: /composer/Metapackage/README.txt
Disallow: /composer/Plugin/ProjectMessage/README.md
Disallow: /composer/Plugin/Scaffold/README.md
Disallow: /composer/Plugin/VendorHardening/README.txt
Disallow: /composer/Template/README.txt
Disallow: /modules/README.txt
Disallow: /sites/README.txt
Disallow: /themes/README.txt
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register
Disallow: /user/password
Disallow: /user/login
Disallow: /user/logout
Disallow: /media/oembed
Disallow: /*/media/oembed
# Paths (no clean URLs)
Disallow: /index.php/admin/
Disallow: /index.php/comment/reply/
Disallow: /index.php/filter/tips
Disallow: /index.php/node/add/
Disallow: /index.php/search/
Disallow: /index.php/user/password
Disallow: /index.php/user/register
Disallow: /index.php/user/login
Disallow: /index.php/user/logout
Disallow: /index.php/media/oembed
Disallow: /index.php/*/media/oembed
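
Reading note: the file declares "User-agent: *" twice (once in the Cloudflare-managed block, once in the CMS block) and carries a Content-Signal line that is not part of the core robots.txt grammar. The Python sketch below shows one way a crawler might interpret that, assuming RFC 9309 behaviour: groups naming the same agent are merged, the longest matching pattern wins, and Allow wins a tie. The names parse_robots, is_allowed and SAMPLE are hypothetical, SAMPLE is an abbreviated excerpt of the snapshot, and agent matching is simplified to an exact case-insensitive token comparison. It is an illustration, not the logic of any particular crawler.

```python
import re

# Abbreviated excerpt of the snapshot above (not the full file).
SAMPLE = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /core/*.css$
Disallow: /core/
Disallow: /admin/
"""


def parse_robots(text):
    """Return {agent: {"rules": [(kind, pattern)], "signals": {name: value}}}.

    Groups that name the same agent are merged into one entry.
    """
    groups = {}
    current = []      # agents named by the group currently being read
    in_rules = False  # True once the group has a non-User-agent line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), {"rules": [], "signals": {}})
        elif current:
            in_rules = True
            for agent in current:
                if field in ("allow", "disallow") and value:
                    groups[agent]["rules"].append((field, value))
                elif field == "content-signal":
                    for item in value.split(","):
                        name, _, setting = item.partition("=")
                        groups[agent]["signals"][name.strip()] = setting.strip()
    return groups


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$' anchor)."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(groups, agent, path):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    group = groups.get(agent.lower())
    if group is None:
        group = groups.get("*")
    if group is None:
        return True
    best_kind, best_len = "allow", -1
    for kind, pattern in group["rules"]:
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_kind, best_len = kind, len(pattern)
    return best_kind == "allow"


if __name__ == "__main__":
    groups = parse_robots(SAMPLE)
    print(groups["*"]["signals"])
    for agent, path in [("GPTBot", "/"), ("ExampleBot", "/core/misc/style.css"),
                        ("ExampleBot", "/core/install.php"), ("ExampleBot", "/node/1")]:
        print(agent, path, is_allowed(groups, agent, path))
```

Under these assumptions, a crawler with its own named group (GPTBot here) is governed only by that group and is blocked everywhere, while any other crawler falls back to the merged "*" rules: stylesheets under /core/ stay fetchable even though /core/ itself is disallowed. A parser that keeps only the first "*" group would instead see just "Allow: /" and miss the CMS rules entirely. The Content-Signal values are surfaced separately because they express usage preferences rather than path rules.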