NeuralCrawl

Veolia / robots.txt snapshot

fetched 2026-06-20T01:10:30Z (15h ago) · HTTP 200 · 2190 bytes · sha256 3c71f3d95ef24964

final URL: https://www.veolia.com/robots.txt

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used: http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html

# For SeekportBot
User-agent: SeekportBot
Crawl-delay: 4

# For all agents
User-agent: *
# CSS, JS, Images
Allow: /core/*.css$
Allow: /core/*.css?
Allow: /core/*.js$
Allow: /core/*.js?
Allow: /core/*.gif
Allow: /core/*.jpg
Allow: /core/*.jpeg
Allow: /core/*.png
Allow: /core/*.svg
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /profiles/*.svg
# Directories
Disallow: /core/
Disallow: /profiles/
# Files
Disallow: /README.txt
Disallow: /web.config
# Paths (clean URLs)
Disallow: /admin/
Disallow: /*/admin/
Disallow: /comment/reply/
Disallow: /*/comment/reply/
Disallow: /filter/tips/
Disallow: /*/filter/tips/
Disallow: /node/add/
Disallow: /*/node/add/
Disallow: /search/
Disallow: /search$
Disallow: /*/search$
Disallow: /search?
Disallow: /*/search?
Disallow: /user/register/
Disallow: /*/user/register/
Disallow: /user/password/
Disallow: /*/user/password
Disallow: /*/user/password/werty
Disallow: /user/login/
Disallow: /*/user/login/
Disallow: /user/logout/
Disallow: /*/user/logout/
Disallow: /*?f[*
Disallow: /*&f[*
# Paths (no clean URLs)
Disallow: /index.php/admin/
Disallow: /index.php/comment/reply/
Disallow: /index.php/filter/tips/
Disallow: /index.php/node/add/
Disallow: /index.php/search/
Disallow: /index.php/user/password/
Disallow: /index.php/user/register/
Disallow: /index.php/user/login/
Disallow: /index.php/user/logout/
Disallow: /staffordshire/
Disallow: /southwark/
Disallow: /leeds/
Disallow: /termocdmx/
Disallow: /santamarta/
Disallow: /nordics/

Disallow: */node/*
Disallow: /node/*
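
The rules above mix `Allow` and `Disallow` lines that use `*` and `$` wildcards, so it may help to see how a crawler could resolve them. The following is a minimal sketch, not the behavior of any particular crawler: it assumes the common "longest matching pattern wins, `Allow` wins a tie" convention, ignores percent-encoding and user-agent grouping, and uses a hand-picked subset of the rules. The names `rule_to_regex`, `is_allowed`, and `RULES` are hypothetical helpers introduced for illustration.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumed convention: the longest matching pattern wins, Allow wins
    a tie, and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        # re.match anchors at the start of the path, as prefix matching requires.
        if rule_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Illustrative subset of the "User-agent: *" group above.
RULES = [
    ("Allow", "/core/*.css$"),
    ("Disallow", "/core/"),
    ("Disallow", "/search$"),
    ("Disallow", "/*/admin/"),
    ("Disallow", "/*?f[*"),
]

if __name__ == "__main__":
    for p in ["/core/themes/style.css", "/core/install.php", "/search", "/about"]:
        print(p, is_allowed(p, RULES))
```

Under these assumptions, a stylesheet below `/core/` stays fetchable because the `Allow` pattern is longer than `Disallow: /core/`, while other files in that directory remain blocked.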