NeuralCrawl

SoundCloud / robots.txt snapshot

fetched 2026-06-25T22:38:12Z · HTTP 200 · 1441 bytes · sha256 7bbee7524de61fe4

final URL: https://soundcloud.com/robots.txt

# =============================================================================
# robots.txt for soundcloud.com
# Updated: 2026-05-05
# =============================================================================

# AI Crawlers: editorial only, no UGC training
User-Agent: anthropic-ai
User-Agent: ClaudeBot
User-Agent: Claude-Web
User-Agent: GPTBot
User-Agent: ChatGPT-User
User-Agent: OAI-SearchBot
User-Agent: CCBot
User-Agent: PerplexityBot
User-Agent: Google-Extended
User-Agent: Applebot-Extended
User-Agent: Bytespider
User-Agent: Amazonbot
User-Agent: Meta-ExternalAgent
User-Agent: cohere-ai

# Homepage
Allow: /$

# Platform: Legal & Policy
Allow: /terms-of-use
Allow: /community-guidelines
Allow: /transparency-reports
Allow: /accessibility-statement
Allow: /imprint

# Platform: Discovery & Editorial
Allow: /discover
Allow: /stories
Allow: /topic

# Platform: Product & Marketing
Allow: /pro
Allow: /download
Allow: /jobs
Allow: /go
Allow: /getstarted

# Platform: Corporate
Allow: /company

# Platform: Technical
Allow: /sitemap
Allow: /sitemapIndex

# Block everything else (catches all UGC at root paths)
Disallow: /

# Search engines and all other crawlers: index UGC, block low-value paths
User-Agent: *
Disallow: /search
Disallow: /you/
Disallow: /stream
Disallow: /upload
Disallow: /settings
Disallow: /messages
Disallow: /*?

Sitemap: https://soundcloud.com/sitemap.xml
Sitemap: https://soundcloud.com/sitemapIndex.xml
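
The two groups above depend on how Allow and Disallow rules are resolved against each other. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), with `*` as a wildcard and a trailing `$` as an end anchor. The helper names `pattern_to_regex` and `is_allowed`, the abbreviated rule lists, and the sample paths such as `/some-artist/some-track` are hypothetical and are not part of the snapshot. Individual crawlers may resolve rules differently.

```python
import re

# Abbreviated copy of the AI-crawler group (not every Allow line is repeated).
AI_GROUP_RULES = [
    ("allow", "/$"),
    ("allow", "/terms-of-use"),
    ("allow", "/discover"),
    ("allow", "/sitemap"),
    ("disallow", "/"),
]

# Rules from the catch-all (User-Agent: *) group.
DEFAULT_GROUP_RULES = [
    ("disallow", "/search"),
    ("disallow", "/you/"),
    ("disallow", "/stream"),
    ("disallow", "/upload"),
    ("disallow", "/settings"),
    ("disallow", "/messages"),
    ("disallow", "/*?"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of
    the path. Everything else is treated as a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    Assumed precedence: the longest matching pattern wins, and Allow wins
    when an Allow and a Disallow pattern have equal length. A path that
    matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for path in ["/", "/discover/sets", "/some-artist/some-track"]:
        print("AI group     ", path, is_allowed(AI_GROUP_RULES, path))
    for path in ["/some-artist/some-track", "/search?q=x", "/some-artist?ref=1"]:
        print("default group", path, is_allowed(DEFAULT_GROUP_RULES, path))
```

Under these assumptions, the AI-crawler group permits only the homepage and the listed platform prefixes, because `Disallow: /` is the shortest pattern and loses to any longer Allow. The catch-all group permits user-content paths but blocks any URL containing a query string through `Disallow: /*?`.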