NeuralCrawl

SoundCloud / robots.txt

Change detected 2026-06-25T22:38:12Z · modified +60 −29

What changed

Diff: old (2026-06-20T01:10:31Z) → new (2026-06-25T22:38:12Z)

@@ -1,34 +1,65 @@
1- # This is for the m.soundcloud.com hostname
1+ # =============================================================================
2+ # robots.txt for soundcloud.com
3+ # Updated: 2026-05-05
4+ # =============================================================================
2 5
3- User-Agent: *
4- Disallow:
6+ # AI Crawlers: editorial only, no UGC training
7+ User-Agent: anthropic-ai
8+ User-Agent: ClaudeBot
9+ User-Agent: Claude-Web
10+ User-Agent: GPTBot
11+ User-Agent: ChatGPT-User
12+ User-Agent: OAI-SearchBot
13+ User-Agent: CCBot
14+ User-Agent: PerplexityBot
15+ User-Agent: Google-Extended
16+ User-Agent: Applebot-Extended
17+ User-Agent: Bytespider
18+ User-Agent: Amazonbot
19+ User-Agent: Meta-ExternalAgent
20+ User-Agent: cohere-ai
5 21
6- User-Agent: anthropic-ai
22+ # Homepage
23+ Allow: /$
24+
25+ # Platform: Legal & Policy
26+ Allow: /terms-of-use
27+ Allow: /community-guidelines
28+ Allow: /transparency-reports
29+ Allow: /accessibility-statement
30+ Allow: /imprint
31+
32+ # Platform: Discovery & Editorial
33+ Allow: /discover
34+ Allow: /stories
35+ Allow: /topic
36+
37+ # Platform: Product & Marketing
38+ Allow: /pro
39+ Allow: /download
40+ Allow: /jobs
41+ Allow: /go
42+ Allow: /getstarted
43+
44+ # Platform: Corporate
45+ Allow: /company
46+
47+ # Platform: Technical
48+ Allow: /sitemap
49+ Allow: /sitemapIndex
50+
51+ # Block everything else (catches all UGC at root paths)
7 52  Disallow: /
8 53
9- User-Agent: ClaudeBot
10- Disallow: /
54+ # Search engines and all other crawlers: index UGC, block low-value paths
55+ User-Agent: *
56+ Disallow: /search
57+ Disallow: /you/
58+ Disallow: /stream
59+ Disallow: /upload
60+ Disallow: /settings
61+ Disallow: /messages
62+ Disallow: /*?
11 63
12- User-Agent: CCBot
13- Disallow: /
14-
15- User-Agent: GPTBot
16- Disallow: /
17-
18- User-Agent: Google-Extended
19- Disallow: /
20-
21- User-Agent: Meta-ExternalAgent
22- Disallow: /
23-
24- User-Agent: omgili
25- Disallow: /
26-
27- User-Agent: omgilibot
28- Disallow: /
29-
30- User-Agent: Brightbot
31- Disallow: /
32-
33- sitemap: https://soundcloud.com/sitemap.xml
34- sitemap: https://soundcloud.com/sitemapIndex.xml
64+ Sitemap: https://soundcloud.com/sitemap.xml
65+ Sitemap: https://soundcloud.com/sitemapIndex.xml
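
How the new rules resolve for a given crawler depends on the matcher. The sketch below is one possible reading, assuming RFC 9309 semantics: a crawler uses the group that names it (falling back to `*`), the longest matching pattern wins, `*` is a wildcard, and a trailing `$` anchors the end of the path. Individual crawlers may behave differently. The function names (`parse_groups`, `pattern_to_regex`, `is_allowed`) and the sample paths such as `/some-artist/some-track` are hypothetical, and the embedded rules are an abbreviated excerpt of the new file, not the full list.

```python
import re

# Abbreviated excerpt of the new robots.txt from the diff above
# (only two AI user agents and three Allow lines are reproduced).
ROBOTS_TXT = """\
User-Agent: GPTBot
User-Agent: ClaudeBot
Allow: /$
Allow: /discover
Allow: /terms-of-use
Disallow: /

User-Agent: *
Disallow: /search
Disallow: /you/
Disallow: /*?
"""


def parse_groups(text):
    """Return {lowercased user agent: [(directive, pattern), ...]}."""
    groups = {}
    agents = []        # user agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-Agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            if value:  # an empty pattern matches nothing
                for agent in agents:
                    groups[agent].append((field, value))
    return groups


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(groups, user_agent, path):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed.

    Simplification: the user agent is matched by exact (case-insensitive)
    name rather than by product-token matching.
    """
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best = ("allow", "")
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > len(best[1])
            tie_allow = len(pattern) == len(best[1]) and directive == "allow"
            if longer or tie_allow:
                best = (directive, pattern)
    return best[0] == "allow"


groups = parse_groups(ROBOTS_TXT)
ai_homepage = is_allowed(groups, "GPTBot", "/")                          # Allow: /$
ai_track = is_allowed(groups, "GPTBot", "/some-artist/some-track")       # Disallow: /
other_track = is_allowed(groups, "Googlebot", "/some-artist/some-track") # no rule matches
other_query = is_allowed(groups, "Googlebot", "/some-artist?ref=x")      # Disallow: /*?
```

Under these assumptions the listed AI crawlers can fetch the homepage and the allowlisted platform paths but nothing at user-content paths, while crawlers in the `*` group can fetch user content but not the listed utility paths or any URL with a query string. Python's standard `urllib.robotparser` is deliberately not used here, since it applies rules in file order and does not interpret `*` or `$` in paths.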