no? it takes 10 seconds to check:
> The /crawl endpoint respects the directives of robots.txt files, including crawl-delay. All URLs that /crawl is directed not to crawl are listed in the response with "status": "disallowed".
You don't need any scraping countermeasures for crawlers like those.
So what’s the user agent for their bot? They don’t seem to specify a default in the docs, and it looks like it’s user-configurable. So it’s yet another opt-out bot that your web server needs special matching behaviour to block
Isn't this covered here? https://developers.cloudflare.com/browser-rendering/referenc...
No, hence all their examples using User-Agent: *
>So yet another opt out bot which you need your web server to match on special behaviour to block
Given that malicious bots are allegedly spoofing real user agents, "another user agent you have to add to your list" seems like the least of your problems.
It is Cloudflare who made the claim that they are well behaved, unlike those other bots, and that their behaviour can be controlled by robots.txt.
If I need to treat cloudflare bots the same as malicious bots, that undermines their claim.
Not 'allegedly' - it's just a fact. Even if you're not malicious, spoofing is sometimes necessary, because a server may serve different sites to different browsers and check user agents to decide which experience to deliver. So even for legitimate purposes you need to use at least the prefix of the user agent the server expects.
Like they explain in the docs, their crawler will respect the robots.txt-disallowed user agents - right after the section that explains how to change your user agent.
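Which is the whole problem: robots.txt rules bind to a user-agent name, so a disallow rule only bites if you know the name to write. A minimal sketch with Python's stdlib parser (note "ExampleBot" is a placeholder agent name I made up, not the crawler's actual, configurable user agent):

```python
# Demonstrates that robots.txt Disallow rules only apply to the
# user-agent name they are written against. If the bot's agent string
# is unknown or changed, the rule never matches.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The named bot is blocked from /private/ ...
print(parser.can_fetch("ExampleBot", "https://example.com/private/page"))  # False
# ... but a bot with any other name falls through to the permissive wildcard.
print(parser.can_fetch("RenamedBot", "https://example.com/private/page"))  # True
```

If the agent string is configurable and the default isn't documented, the second case is what site owners actually get.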