NeuralCrawl

DOAJ / robots.txt snapshot

fetched 2026-06-26T14:15:22Z · HTTP 200 · 2033 bytes · sha256 57a168878bff2d1f

final URL: https://doaj.org/robots.txt

# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a) If a Content-Signal = yes, you may collect content for the corresponding
# use.
# (b) If a Content-Signal = no, you may not collect content for the
# corresponding use.
# (c) If the website operator does not include a Content-Signal for a
# corresponding use, the website operator neither grants nor restricts
# permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search: building a search index and providing search results (e.g., returning
# hyperlinks and short excerpts from your website's contents). Search does not
# include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
# augmented generation, grounding, or other real-time taking of content for
# generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

Sitemap: https://doaj.org/sitemap_index_0.xml
Sitemap: https://doaj.org/sitemap_index_1.xml
Sitemap: https://doaj.org/sitemap_index_2.xml

User-agent: *
Disallow: /search/
Disallow: /query/
Disallow: /account/
Disallow: /admin/
Disallow: /editor/
Disallow: /publisher/
Disallow: /cookie_consent
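
The file above declares `User-agent: *` twice: once inside the Cloudflare-managed block and once at the end. RFC 9309 asks crawlers to combine groups that match the same user agent, so a reader may want to check how the merged rules behave. The following is a minimal sketch, not a reference implementation. `parse_robots`, `is_allowed`, and `SAMPLE` are hypothetical names, and `SAMPLE` is a shortened excerpt of the snapshot. The sketch assumes plain prefix matching (no `*` or `$` wildcards, which this file does not use), exact case-insensitive agent names, and that a `Content-Signal` line belongs to the group it appears in.

```python
from collections import defaultdict

# Shortened excerpt of the snapshot, kept inline so the sketch is self-contained.
SAMPLE = """\
User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: GPTBot
Disallow: /

Sitemap: https://doaj.org/sitemap_index_0.xml

User-agent: *
Disallow: /search/
Disallow: /account/
"""


def parse_robots(text):
    """Return (rules, signals, sitemaps); groups naming the same agent are merged."""
    rules = defaultdict(list)     # agent -> [(is_allow, path_prefix), ...]
    signals = defaultdict(dict)   # agent -> {"search": "yes", ...}
    sitemaps = []
    agents, in_rules = [], False  # agents of the group currently being read
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:  # a rule line closed the previous agent list
                agents, in_rules = [], False
            agents.append(value.lower())
            rules[value.lower()]  # register the agent even if it gets no rules
        elif key == "sitemap":
            sitemaps.append(value)  # sitemaps are global, not tied to a group
        elif key in ("allow", "disallow"):
            in_rules = True
            if value:  # an empty Disallow restricts nothing
                for agent in agents:
                    rules[agent].append((key == "allow", value))
        elif key == "content-signal":
            in_rules = True
            pairs = dict(p.strip().split("=", 1) for p in value.split(",") if "=" in p)
            for agent in agents:
                signals[agent].update(pairs)
    return rules, signals, sitemaps


def is_allowed(rules, agent, path):
    """Longest matching prefix wins; Allow wins a tie; no match means allowed."""
    name = agent.lower()
    group = rules[name] if name in rules else rules.get("*", [])
    best = (0, True)  # (matched prefix length, is_allow)
    for allow, prefix in group:
        if path.startswith(prefix):
            best = max(best, (len(prefix), allow))
    return best[1]


rules, signals, sitemaps = parse_robots(SAMPLE)
print(is_allowed(rules, "GPTBot", "/toc/1234"))            # named group: Disallow: /
print(is_allowed(rules, "ExampleBot", "/search/articles"))  # merged * group
print(signals["*"])
```

Under these assumptions, an agent listed by name is blocked from every path, while an unlisted agent falls back to the merged `*` group: `Allow: /` covers the site, and the longer `Disallow` prefixes such as `/search/` take precedence for the paths they match. Parsers that keep only the first `*` group would miss the second set of `Disallow` lines, so it is worth checking how a given library handles repeated groups before relying on it.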