The title reminds me of the 5th installment of The Hitchhiker's Guide to the Galaxy by Douglas Adams:
"Further investigation quickly established what it was that had happened. A meteorite had knocked a large hole in the ship. The ship had not previously detected this because the meteorite had neatly knocked out that part of the ship's processing equipment which was supposed to detect if the ship had been hit by a meteorite."
The book ("Mostly Harmless"), and especially the beginning of the first chapter, is worth reading, as it describes how the automated systems of the spaceship try to resolve the situation.
The page is 320KB in size. They could have made it a static page with some simple HTML; the whole thing would have been under 10KB and would not have needed a CDN.
The thing that worries me the most is that oftentimes nobody cares. That demotivates me a lot, as I tend to invest huge loads of my time into optimising various things, and all of them are meaningless if you ‘just buy a faster computer.’ Most of my websites are served from a low-powered computer, and I tend to optimise them to work well on it. But buying just one beefy server compensates for all my optimisations. I have no idea what to do about that. I still care about these things, as I believe that’s what makes me a professional. But there are countless examples where you can just ignore all that and see no real difference.
Bad news about ISPs... Really, you want an RPi on solar power, attached to a longwave transmitter, and with direct peering agreements with all dominant global providers. Most well-connected RPi in existence.
I'm not affiliated with this genius. I was just snooping around the other thread (https://news.ycombinator.com/item?id=45974012), took a chance at modifying the site's URL, and found myself pleasantly surprised.
Just thinking about it, wouldn't a distributed P2P "mesh" be a better fit for reliability probing? We could share results, see where it was inaccessible from. It's kind of an oxymoron to have a centralized down detector lol
Sure, a p2p network of people doing distributed pings on a wide range of services sounds like a good idea. Of course, you'd need people willing to run it. A small incentive might be needed... or just a default of "if you want to use this software, you agree to also have your client ping other websites to check if they're up from your location".
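To make the "client pings other websites from your location" idea concrete, here is a minimal sketch of what a single probe could look like. Everything named here (`ProbeResult`, `probe`, `http_fetch`, the `vantage` label) is made up for illustration, and how results would actually be shared between peers is left out entirely.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    """One observation of one service from one vantage point (hypothetical shape)."""
    url: str
    vantage: str       # free-form label for where the probe ran, e.g. a city or ISP
    up: bool
    latency_ms: float


def http_fetch(url: str, timeout: float = 5.0) -> int:
    """Perform a GET and return the HTTP status code; raises on network errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def probe(url: str, vantage: str, fetch: Callable[[str], int] = http_fetch) -> ProbeResult:
    """Check whether `url` answers from this client's location.

    `fetch` is injectable so the logic can be exercised without network access.
    Any exception (DNS failure, timeout, HTTP error) counts as "down".
    """
    start = time.monotonic()
    try:
        status = fetch(url)
        up = 200 <= status < 400
    except Exception:
        up = False
    latency_ms = (time.monotonic() - start) * 1000
    return ProbeResult(url=url, vantage=vantage, up=up, latency_ms=latency_ms)
```

Each participating client would run something like `probe("https://example.com", "my-location")` on a schedule and publish the resulting record; comparing records from many vantage points is what would tell you where a service is unreachable from.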
Or—hear me out—we actually build services that leverage the native distributed infrastructure of the internet, so that we don't need down detectors. What a concept.
100% agree. But with the most-used services being pushed by corps, it will remain centralized until the "distributed mesh" becomes at least as good/robust.
I think this is so important, and in fact, with services now becoming utilities for daily life and the national/global economy, it's something that people like DARPA could get behind. We understand why a big peering corp's incentives might not align with true distributed (and hence how they may lobby for the crippling of certain useful p2p APIs from being widely 'distributed'), but it's something we should really push for and technically just do. And we'd probably find many allies doing it in the system continuity and reliability space.
I think we need to make a highly-available downdetector from a collection of SBCs hosted around the world. Each node gets its configuration via git-pull which is self-hosted/republished. Simplest DNS configuration possible: each node has a unique $n.isdowndetectordown.ultradowndetector.com while they also happily host a common hostname with simple dns round robin entries for it.isdowndetectordown.ultradowndetector.com. The common page attempts to load a check resource (perhaps just a tiny css output?) from all of the $n.i.u.c nodes which just changes a div from gray to green/red.
It would be interesting to see just how small this whole thing could be; I bet it could be made into a <500MB SD card image for a Raspberry Pi 4 (2GB) that simply updates a static CSS file out of (say) cron and serves a surprising number of HN requests.
With all of this redundancy, there is no way it could fail! /s
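A rough sketch of the gray/green/red part, assuming the per-node hostnames follow the `$n.isdowndetectordown.ultradowndetector.com` pattern. Note that this colors the divs on the server rather than having the browser load a check resource from each node, so it is a simplification of the scheme described, and the function names are hypothetical.

```python
from html import escape
from typing import Dict, List, Optional

BASE = "isdowndetectordown.ultradowndetector.com"


def node_hostnames(count: int, base: str = BASE) -> List[str]:
    """Build the unique per-node names: 1.<base>, 2.<base>, ..."""
    return [f"{n}.{base}" for n in range(1, count + 1)]


def div_color(reachable: Optional[bool]) -> str:
    """Map a check result to a color: None means "not checked yet"."""
    if reachable is None:
        return "gray"
    return "green" if reachable else "red"


def render_status(results: Dict[str, Optional[bool]]) -> str:
    """Render one colored div per node, in the order the results were given."""
    rows = [
        f'<div style="background:{div_color(ok)}">{escape(host)}</div>'
        for host, ok in results.items()
    ]
    return "\n".join(rows)
```

A cron job on each node could call `render_status` with whatever it last learned about its peers and write the output to the static page.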