The most touted reason is that their anti-spam systems only support IPv4. Their old Cloudflare endpoint, however, is still alive, and you can't disable IPv6 on Cloudflare, so feel free to add the following to your /etc/hosts:
2606:4700::6810:686e news.ycombinator.com
Interestingly, when I tried to post the above comment over IPv6, I got a Cloudflare "You have been blocked" page. This might be something they do not want you to know! :D
This was an interesting Cloudflare "feature" I found out about the hard way. Even if you only use Cloudflare for DNS hosting, they will happily accept proxied requests for your hostnames and route them to your origin. I discovered this when we received an L7 DDoS from only Cloudflare IPs - the attacker had pointed their bots at Cloudflare with our hostname (bold move!).
The official solution (and possibly the reason you see the blocked page) is to set up the WAF to block all requests.
Yes, HTTP / HTTPS requests can be proxied this way. Any CF IP seems to work. HTTPS only works if the target hasn't disabled Universal SSL (i.e., they have a TLS cert provisioned on Cloudflare's IPs).
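To make the mechanism concrete, here is a minimal sketch of what "pointing a request at a proxy IP with someone else's hostname" looks like for plain HTTP. The function names and the `edge_ip` parameter are made up for illustration, and nothing here is specific to Cloudflare: it just opens a TCP connection to whatever address you pass in and sends a request whose Host header names the site.

```python
import socket


def build_request(hostname: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request whose Host header names the target site."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {hostname}",   # the proxy routes on this header, not on the IP we dialed
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_via_edge(edge_ip: str, hostname: str, path: str = "/", timeout: float = 5.0) -> bytes:
    """Send the request to edge_ip on port 80 and return the raw response bytes.

    edge_ip is a placeholder for whichever proxy address (IPv4 or IPv6) you want to test.
    """
    with socket.create_connection((edge_ip, 80), timeout=timeout) as sock:
        sock.sendall(build_request(hostname, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Only `build_request` is pure; `fetch_via_edge` needs network access. For HTTPS the same idea applies, except that the hostname also has to go into the TLS SNI field, which this sketch does not cover.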
Doesn't that still only remove the records from DNS? So far, for all Cloudflare sites that have IPv6 disabled, I've been able to derive the IPv6 address by hand and make requests without issues.
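One plausible reading of "derive by hand", sketched below: the low 32 bits of the address in the hosts entry earlier in the thread (6810:686e) decode to the IPv4 address 104.16.104.110, so the IPv6 address looks like an IPv4 address embedded under the 2606:4700:: prefix. That mapping is an assumption inferred from that one example, not documented behavior, and `embed_ipv4` is a made-up helper name.

```python
import ipaddress

# Assumed prefix, taken from the hosts entry quoted earlier in the thread.
ASSUMED_PREFIX = ipaddress.IPv6Address("2606:4700::")


def embed_ipv4(ipv4: str, prefix: ipaddress.IPv6Address = ASSUMED_PREFIX) -> str:
    """Place the 32 bits of an IPv4 address into the low 32 bits of the prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return str(ipaddress.IPv6Address(int(prefix) | int(v4)))


print(embed_ipv4("104.16.104.110"))  # 2606:4700::6810:686e
```

Whether this holds for any given site would need to be checked against the site's actual A records.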
HN can do things like "This user is posting from an IP which geolocates far from where it normally posts from". It can take into account the total post history, user upvotes, etc.
Cloudflare bot detection is more request-by-request. Cloudflare's product is intended more to prevent DDoS attacks with millions of bots. I don't think it's sufficiently fine-tuned to stop a handful of spam comments from getting through.
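As an illustration of the per-user kind of check mentioned above (a post coming from an IP that geolocates far from the user's usual location), here is a rough sketch. It is not HN's actual code: the threshold, the function names, and the assumption that you already have latitude/longitude for both locations are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_far_from_usual(usual: tuple, current: tuple, threshold_km: float = 1000.0) -> bool:
    """Flag a post whose location is more than threshold_km from the user's usual one.

    The 1000 km default is an arbitrary illustrative value.
    """
    return haversine_km(*usual, *current) > threshold_km
```

A real system would presumably combine a signal like this with post history and votes rather than act on it alone; a request-by-request filter has no such per-user baseline to compare against.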
Currently there are no proxies in front, and you connect directly to the bare-metal server hosting the site. I presume the anti-spam system is custom-built and part of their own codebase. Using Cloudflare is officially sanctioned, but it has been retired from widespread use.