NeuralCrawl

Ireland (Gov.ie) / robots.txt snapshot

fetched 2026-06-24T10:04:25Z (1d ago) · HTTP 200 · 991 bytes · sha256 3f91b2d47ed913d0

final URL: https://www.gov.ie/robots.txt

# Amazon AI
User-agent: Amazonbot
Disallow: /

# Apple Web Crawler
User-agent: Applebot
User-agent: Applebot-Extended
Disallow: /

# Baidu
User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
User-agent: ERNIEBot
User-agent: YiyanBot
Disallow: /

# Anthropic AI
User-agent: ClaudeBot
Disallow: /

# DataForSEO
User-agent: DataForSeoBot
Disallow: /

# Google AI
User-agent: Google-Extended
Disallow: /

# OpenAI
User-agent: GPTBot
Disallow: /

# Meta (Facebook, Instagram, WhatsApp)
User-agent: Meta-ExternalAgent
Disallow: /

# Majestic SEO
User-agent: MJ12bot
Disallow: /

# Petal Search
User-agent: PetalBot
Disallow: /

# SoGou
User-agent: Sogou Spider
Disallow: /

# Yandex
User-agent: Yandex
Disallow: /

User-agent: *
Disallow: /documents/
Disallow: /search/
Disallow: /*/search/
Disallow: /cuardaigh/
Disallow: /*/cuardaigh/
# Blocks numerous Irish search pages
Disallow: /*?*
Allow: /static/

# Point to sitemap
Sitemap: https://www.gov.ie/sitemap.xml
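
To see how rules like these would be applied to a given crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. It embeds an abridged copy of the snapshot rather than fetching it. The names `ROBOTS_TXT`, `build_parser`, and the user agent `ExampleBot` are hypothetical and used only for illustration. The wildcard rules (`/*/search/`, `/*?*`) are left out on purpose: as far as I know, the standard-library parser matches paths by prefix and does not evaluate `*` wildcards, so it should not be relied on for those lines.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the snapshot above (assumption: wildcard rules omitted,
# because urllib.robotparser matches by prefix rather than by pattern).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /documents/
Disallow: /search/
Allow: /static/
"""

BASE = "https://www.gov.ie"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler with no group of its own,
# so it falls through to the "User-agent: *" group.
for agent, path in [
    ("GPTBot", "/en/"),
    ("ClaudeBot", "/en/"),
    ("ExampleBot", "/documents/report.pdf"),
    ("ExampleBot", "/static/app.css"),
    ("ExampleBot", "/en/"),
]:
    verdict = "allowed" if parser.can_fetch(agent, BASE + path) else "blocked"
    print(f"{agent:12} {path:24} {verdict}")
```

A crawler that needs to honour the wildcard lines would need a matcher that follows the pattern rules in the Robots Exclusion Protocol; the sketch above only covers the plain prefix rules.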