It's unfortunate that the solution to cloud pricing complexity that all providers are adopting is to add even more complexity on top.
The number you see on your bill is increasingly calculated by running some black-box algorithm on top of the billing events your resources generate. Was it accidental or not? What is a "weird" deployment vs. a normal deployment? By what factor should the spikes on your billing graph be smoothed? None of this can be deterministically calculated from the pricing page. And there's no way for you to do these checks yourself before deployment, because you have no idea what this logic even is. So you are entirely at the mercy of the provider for "forgiveness".
Who wants to bet that some provider is going to launch "AI cloud billing" within the next year?
Competition only matters for new contracts. Once you have picked a provider and made a big enough infrastructure investment, there's no realistic path to switch to someone else.
Keeping everything, including mission-critical databases, "inside" Kubernetes is the smart move. The smart people keep everything "in" Kubernetes; everyone else can pay up.
Like everything else, it's a tradeoff. If you are running _everything_ inside Kubernetes you can easily move away, but I think you're losing much of the benefit of being in the $big_cloud to begin with. If you have the staffing to provide your own databases, block storage, caches, IAM, logging/monitoring, container registries, etc. in a reliable way "as a service", the jump to some much cheaper bare-metal hosting is not that far.
For me the sweet spot is to have all compute in Kubernetes and stick to open-source or "standard" (e.g. S3) services for your auxiliary needs, but outsource running them to the cloud provider. Fairly easy to move somewhere else, while still keeping the operational burden down.
But I agree that having, e.g., all your databases in a proprietary solution from the cloud vendor seems sketchy.
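To make the "stick to standard services" point concrete: if your code only speaks the S3 API, moving providers is mostly a matter of changing an endpoint. A minimal sketch (the endpoint URL, bucket name, and credentials are placeholders, not any particular provider's):

```python
# The same S3-API code can point at AWS or at any S3-compatible provider;
# only the endpoint and credentials change. All values below are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.some-other-provider.example",  # drop this for AWS itself
    aws_access_key_id="PLACEHOLDER",
    aws_secret_access_key="PLACEHOLDER",
)

# Upload a local file; the calling code doesn't care who hosts the bucket.
s3.upload_file("report.csv", "my-bucket", "reports/report.csv")
```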
Yeah, not _everything_, but pretty much everything. The main exception to "everything" ends up being S3, plus some monitoring exceptions here and there that are purely cloud-side, like monitoring AWS' Service Control Policies and using some cloud-side AWS tooling.
AWS has a very nice one-to-one mapping of K8s ServiceAccounts to IAM roles.
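As a rough illustration of what that mapping does under the hood (the role ARN, session name, and token path below are assumptions about a typical setup; in practice the AWS SDK does this exchange automatically from injected environment variables):

```python
# Sketch: a pod's projected ServiceAccount token is exchanged for temporary
# IAM credentials scoped to the role mapped to that ServiceAccount.
# Role ARN, session name, and default paths here are illustrative.
import os
import boto3

role_arn = os.environ.get("AWS_ROLE_ARN", "arn:aws:iam::123456789012:role/my-app")
token_file = os.environ.get(
    "AWS_WEB_IDENTITY_TOKEN_FILE",
    "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
)

with open(token_file) as f:
    web_identity_token = f.read()

sts = boto3.client("sts")
creds = sts.assume_role_with_web_identity(
    RoleArn=role_arn,
    RoleSessionName="my-pod",
    WebIdentityToken=web_identity_token,
)["Credentials"]

# Temporary credentials limited to whatever the mapped IAM role allows.
print(creds["AccessKeyId"], creds["Expiration"])
```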
We used some cryptography-centric GitOps patterns to eliminate any human hands beyond a dev environment, which also makes IAM easier (without giving up reasonable granularity or quality).
> the jump to some much cheaper bare metal hosting is not that far.
Heh, at a consulting firm I was at not too long ago, all the K8s nodes were sole-tenant on whichever cloud provider they ran on. Intra-cluster Cilium-based pod-to-pod networking across cloud datacenter sites was super smooth, but I have to admit I'm probably biased by that team's uncommon access to talent.
Don't be ridiculous. To the nearest nine decimal places, nobody keeps their mission-critical database inside Kubernetes, and the K8s tax is excessive for essentially all cloud users anyway.
The last time I was involved in this kind of thing (long enough ago, but not too long), we ran Postgres and other databases strictly in K8s, per a mandate from our CTO. Using reserved instances is great for avoiding a lot of cost.
I think for that particular firm, the "K8s tax" measured in fiat currency was negligible, and as for the human side, their people, upon hearing "K8s tax", would respond with some ridicule along the lines of "someone doesn't know how to computer".
To be fair, most of the commenters on HackerNews should use something like Heroku.
This is absolutely not true. Customers regularly shift seven-figure-plus spends between cloud providers. Yes, it takes planning, and it takes many engineer-months of work, but it definitely happens.
And also factor in (1) the claim that most cloud growth is still ahead of us, e.g. moving large customers from on-prem to cloud, and (2) that it would be terrible policy to try to charge existing customers more than new customers.
>> it would be terrible policy to try to charge existing customers more than new customers.
Companies do this very, very often. This is part of the reason why they have a "call the sales department for special pricing" option. They can give large contracts a nice discount to get them on board, then slowly (or, in some cases, if they really think they have you hooked, not so slowly) ratchet up the price. This is common in both B2C and B2B businesses.
Then how do AWS, Azure, and GCP get away with charging 100 times as much for bandwidth as other hosting providers, and as the IP transit quote sitting in my email inbox?
Because they land your data in their systems by offering you a sweetheart deal in years 1-2, and then they jack the price back up to the extortionate level once you're stuck. Compute is portable between providers, but data is not.
I've often wondered about this. I'm guessing large customers can negotiate extremely deep discounts on bandwidth from all three providers. Smaller customers may not be paying enough in bandwidth for it to be decisive. I also think that if, say, GCP cut its bandwidth charges by 10x, they might attract customers they really don't want.
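For a rough sense of where the "100x" figure comes from, here's some back-of-the-envelope arithmetic; the transit and egress prices below are illustrative assumptions, not anyone's actual quote:

```python
# Back-of-the-envelope: commodity IP transit vs. list-price cloud egress.
# All prices are rough, illustrative assumptions.

cloud_egress_per_gb = 0.09        # $/GB, typical cloud list price for egress
transit_per_mbps_month = 0.50     # $/Mbps/month, a plausible transit quote

# A 1 Mbps commit, fully utilized, moves roughly this many GB in a month:
seconds_per_month = 30 * 24 * 3600
gb_per_month = (1e6 / 8) * seconds_per_month / 1e9   # ~324 GB

transit_per_gb = transit_per_mbps_month / gb_per_month
print(f"transit: ~${transit_per_gb:.4f}/GB")                    # ~$0.0015/GB
print(f"markup: ~{cloud_egress_per_gb / transit_per_gb:.0f}x")  # ~60x here; cheaper transit pushes it past 100x
```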
A belligerent that so completely outclasses their opponent that they can inflict existentially threatening losses with so little effort that the expenditure is indistinguishable from normal overhead.
Alternatively, when Stitches managed to ambush your level-appropriate character alone in Duskwood.
Part 1 is that bandwidth prices vary tremendously by location, but clouds would like to have customers in expensive regions too. If the US is overcharged at 100x and Brazil at 25x (because the underlying transit there costs far more), the customer only ends up paying about 2x more for bandwidth in Brazil. There isn't a lot of cheap hosting in Brazil, from what I've seen, but there are a lot of users there; attracting cloud customers justifies putting more cloud hardware there, which benefits the cloud.
The other part is that bandwidth is easy to measure and broadly correlates with general usage. On a shared system, you can't really meter watt-hours, but you can meter bandwidth. Bandwidth charges are how the difficult accounting gets reconciled so that the cloud can charge enough to hit its desired margins. If bandwidth were just cost-plus, other things would cost more; if everything had to be cost-plus, accounting would be much more difficult for everyone (and it's already pretty non-transparent).
If the cost calculation is complex and opaque, that successfully prevents anyone from evaluating what their costs would be on a competing service. And vendor lock-in makes it expensive and nontrivial to simply try it out.
I'd recommend thinking of this (and other accident forgiveness schemes from competitors) as a gesture of goodwill that rarely happens rather than an official part of the billing policy.
If you actually look at your contract, no cloud provider is going to contractually obligate themselves to forgive your bill, and you shouldn't be planning or predicting your bill based on it.
Anyone else remember that we had widely available and completely understandable options to rent everything from baseline web hosting all the way up to private rack space for actual years before AWS came along, and apparently all of that was forgotten, Warhammer 40K style?
I've rented a VPS from a vendor for going on 20 years now (Holy fuck I'm old) and I've never once been surprised at the bill.
You know that book "JavaScript: The Good Parts"? There needs to be a similar one "AWS: just the good parts". It would probably talk about EC2 (VPS), S3 (cheap bulk storage), SES (emails), and that's about it. When folks get into the elastic-super-beanstalk-container-swarm-v0.3 products, that's when they really kill themselves on the bills.
That said, yes, just using a VPS vendor is the easy way to stick to the good parts.
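If someone did write that book, the "good parts" workflow might look something like this (a sketch; the bucket name, addresses, and region are made up, and EC2 you'd treat like any other VPS):

```python
# "Just the good parts": plain object storage and plain email.
# Bucket name, email addresses, and region are placeholders.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
ses = boto3.client("ses", region_name="us-east-1")

# S3: cheap bulk storage.
s3.put_object(Bucket="my-backups", Key="db/2024-06-01.dump", Body=b"...")

# SES: simple transactional email.
ses.send_email(
    Source="noreply@example.com",
    Destination={"ToAddresses": ["ops@example.com"]},
    Message={
        "Subject": {"Data": "Backup complete"},
        "Body": {"Text": {"Data": "Nightly dump uploaded to s3://my-backups/"}},
    },
)
```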
You still can. My company rents space from Cogent Communications, and they are very good at customer service, although the "rent a cage and get a pipe" offering is a lot more DIY than AWS.
It's far from perfect, but the large providers do give you tools for categorizing your charges (tagging, etc). There's some fuzziness, especially around data transfer, but for the most part developers can look at an application and know what the cost of each resource is. The biggest risk seems to be out-of-control auto scaling turned on without doing reasonable analysis first.
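For example (a sketch; the tag key, dates, and account setup are assumptions), the Cost Explorer API will break spend down by a cost-allocation tag you've applied:

```python
# Sketch: break down a month of spend by a cost-allocation tag.
# The tag key "team" and the date range are assumptions.
import boto3

ce = boto3.client("ce")  # Cost Explorer
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-05-01", "End": "2024-06-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for group in resp["ResultsByTime"][0]["Groups"]:
    tag_value = group["Keys"][0]                          # e.g. "team$payments"
    cost = group["Metrics"]["UnblendedCost"]["Amount"]
    print(tag_value, cost)
```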