
I can't imagine a worse combination than Kubernetes and stateful connections.


It only hurts when you actually have meaningful load and then suddenly need to switch. Especially if the "servlets" that those stateful connections are connected to require some heavy-ish work on startup, so you're vulnerable to the "thundering herd" scenario.

But the author only uses it to keep alive a couple of IRC connections (which don't send you history or anything on re-connects) and to automatically backup their "huge" chat logs (seriously, 5 GiB is not huge, and if it's text then it can be compressed down to about 2 GiB — unless it's already compressed?).


You don't have to roll all the pods at the same time: there are built-in controls to avoid doing that, and it's the default. You will have to DIY this if you're using something else, so, in fact, the OP is wrong that k8s is somehow a bad fit for this use case.
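
A minimal sketch of the built-in controls being referred to, assuming a plain Deployment (the workload name and image are made up): the rolling-update strategy bounds how many pods get replaced at once, and it's the default.

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: irc-bouncer             # hypothetical workload
  spec:
    replicas: 4
    selector:
      matchLabels:
        app: irc-bouncer
    strategy:
      type: RollingUpdate         # the default strategy type
      rollingUpdate:
        maxUnavailable: 1         # replace at most one pod at a time
        maxSurge: 1               # allow one extra pod during the roll
    template:
      metadata:
        labels:
          app: irc-bouncer
      spec:
        containers:
        - name: bouncer
          image: example/bouncer:latest   # placeholder image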


> You dont have to roll all the pods at the same time

That's not really the problem — if, say, one of your nodes drops dead (or just drops off the network), the clients' connections also drop, and they all try to reconnect. That just happens and there is not much you can do to prepare for it except by having some idle capacity already available.

Unless you're talking about rollout strategies for deployment updates. To be fair, I don't remember the controls for that being all that useful, but that was two years ago, so perhaps things are better now.


Having idle capacity is standard industry practice, though. A secondary node in a primary/secondary setup of a typical monolith design is basically idle capacity, except more expensive, because it's 100% over-provisioning, which isn't required with k8s.
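
For what it's worth, a common way to hold idle headroom in k8s without a full standby is a low-priority "placeholder" deployment of pause pods that gets preempted as soon as real workloads need the room. A sketch below; the names and sizes are made up, not anyone's actual setup.

  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: overprovisioning        # hypothetical name
  value: -10                      # lower than the default priority of 0
  globalDefault: false
  description: "Placeholder pods, preempted by real workloads"
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: capacity-reservation    # hypothetical name
  spec:
    replicas: 2                   # how much headroom to hold
    selector:
      matchLabels:
        app: capacity-reservation
    template:
      metadata:
        labels:
          app: capacity-reservation
      spec:
        priorityClassName: overprovisioning
        containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"            # size of each reserved slot
              memory: 2Gi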


It's only a problem if your nodes go up/down often, or if you have other things causing pods to be preempted, etc.

If you have a static number of nodes and don't have to worry too much about things autoscaling, I don't see why it couldn't be really stable?


You don’t?

Check out how services, load balancers, and the majority of CNIs actually work, then.

Kubernetes was designed for stateless connections and it shows in many places.

If you want it to do stateful connections, you could use something like Agones, which intentionally bypasses a huge amount of Kubernetes to essentially use it only as a scheduler.


> You don’t?

No, why do yours? :D

If you're using cluster autoscaling with very small (or perfectly sized) nodes, I could see it being more of an issue on a busy cluster.

But even then, I wouldn't set up a database to auto-scale. A new node could get created, but it doesn't mean the db pods will be moved to it. They'd ideally stay in the same location. And on a really busy cluster, I'd prefer a separate node pool for stateful apps.
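
A minimal sketch of the separate-node-pool idea, assuming the pool's nodes carry a label and taint like the ones below (the label, taint, and pod names are made up):

  # On the stateful pool's nodes, illustratively:
  #   kubectl label node <node> pool=stateful
  #   kubectl taint node <node> dedicated=stateful:NoSchedule
  apiVersion: v1
  kind: Pod
  metadata:
    name: postgres-0              # hypothetical pod
  spec:
    nodeSelector:
      pool: stateful              # only schedule onto the stateful pool
    tolerations:
    - key: dedicated
      operator: Equal
      value: stateful
      effect: NoSchedule          # tolerate the pool's taint
    containers:
    - name: postgres
      image: postgres:16          # placeholder image/version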

Using something like Stackgres makes it relatively painless to run Postgres in k8s too; it handles setting up replicas and can do automatic failover.


A lot of the CNI/load balancer stuff was added as a band-aid for applications that don't cooperate nicely with k8s.

Applications that act "native" and don't need a lot of the extras...

Well, they arguably mostly use just the scheduler then :D


Wait, you can run Kubernetes with no CNI? My clusters have never even been able to register nodes as healthy without one.

Maybe I’m doing it wrong?


TL;DR: today the CNI itself is the interface to the network implementation, so you'd need a minimal one.

But you do not need a "complex" CNI. Originally, k8s pretty much worked on the assumption that you could route a few subnets to the cluster in good old static fashion, and that's it; it still works with that kind of approach: each node gets a /24, there's a separate shared /24 (or more) for services, etc.
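
To illustrate that model (addresses are made up): the pod and service ranges are just cluster-wide settings, the controller manager hands each node a /24 out of the pod range, and with plain static routes to each node's /24 you don't need an overlay at all. A kubeadm-style excerpt:

  apiVersion: kubeadm.k8s.io/v1beta3
  kind: ClusterConfiguration
  networking:
    podSubnet: 10.64.0.0/16       # pod CIDR, split into per-node /24s
    serviceSubnet: 10.96.0.0/12   # shared virtual IPs for Services
  controllerManager:
    extraArgs:
      allocate-node-cidrs: "true"
      node-cidr-mask-size: "24"   # each node gets a /24 out of podSubnet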

The complexity came from the fact that a lot of places that wanted to deploy Kubernetes couldn't provide such simple network infrastructure to their hosts; later, what started as a workaround got equipped with various extra bells and whistles.


I looked at Agones: the docs on architecture are non-existent, but from their ops docs it looks like a CRD extension on top of vanilla Kubernetes to automate/simplify scheduling. What specifically in CNI, or its most popular implementations, prevents long-running connections, in your opinion?


First: they force Kubernetes into a position where pods can't be evicted.

Second: they use a flavour of node ports that bypasses the CNI, so you connect directly to the process living on the node. This means there are no hiccups with the CNI if another node (or pod) that had nothing to do with your process gets descheduled.

In most cases, web services will be fine with the kinds of hiccups I'm talking about (even WebSockets); however, UDP streams will definitely lose data, and raw TCP ones may fail depending on the implementation.
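
Not Agones' actual resources, just a generic sketch of those two tricks (names, image, and port are made up): pin the connection path to the node with hostPort, so traffic skips Services/kube-proxy, and annotate the pod so the cluster autoscaler won't evict it.

  apiVersion: v1
  kind: Pod
  metadata:
    name: gameserver-abc          # hypothetical pod
    annotations:
      # ask the cluster autoscaler not to evict this pod
      cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
  spec:
    containers:
    - name: gameserver
      image: example/gameserver:1.0   # placeholder image
      ports:
      - containerPort: 7777
        hostPort: 7777            # clients hit <node IP>:7777 directly
        protocol: UDP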


What you're describing sounds like implementation bugs in the specific CNIs you've used, not anything to do with k8s networking design in general. At a former gig I ran a geo-distributed edge with long, persistent connections over Cilium, and we had no issues sustaining 12h+ RTMP connections while scaling/downscaling and rolling pods on the same nodes. I've consulted for folks who did RTP (for WebRTC), which is UDP-based, also with no issues. In fact, where we actually did have issues was the cloud load-balancing infra, which in a lot of cases is not designed for long-running streams...



