
> Actually, now I remember (it was about five years ago, so my memory is blurry) it did that not once but twice. One time I diagnosed the issue - it was a simple conntrack table overflow, so I had to bump the limit up. The other time, I have no idea what was wrong - I just lost database connectivity, but I'm certain it wasn't the application or the database but something in the infra.

Neither of these are k8s issues though. Where were you playing with `conntrack`? On the backplane?

The issues you describe here are issues you created for the most part. They are not issues people run into in production with k8s, I can assure you of that.

> I just lost database connectivity, but I’m certain it wasn’t the application or the database but something in the infra

Most likely something with your cloud provider that you did not understand fully and therefore blamed k8s, the thing you understood the least at the time.



> Neither of these are k8s issues though. Where were you playing with `conntrack`? On the backplane?

Yes, on the host (a GKE-provisioned node VM) where the application container ran.
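From memory, the diagnosis and fix looked roughly like this (the exact value below is illustrative, not what I actually set):

    # Classic symptom in the node's kernel log when the table overflows:
    #   nf_conntrack: table full, dropping packet
    dmesg | grep conntrack

    # Compare the current entry count against the limit
    sysctl net.netfilter.nf_conntrack_count
    sysctl net.netfilter.nf_conntrack_max

    # Bump the limit (value here is just an example)
    sysctl -w net.netfilter.nf_conntrack_max=262144

    # A change like this won't survive a reboot or node recreation unless
    # it's also written to /etc/sysctl.d/ or applied by a DaemonSet.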

While it's certain this is not in K8s itself, I'm not really sure where to draw the line. I mean, IIRC, K8s relies on the kernel's networking code quite a lot (e.g. kube-proxy is all about that), so... I guess it's not precisely clear whether it's in or out of scope.
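You can see how blurry that boundary is just by poking at a node - kube-proxy's Service handling is ultimately iptables rules and conntrack state in the host kernel. A rough sketch (assumes the default iptables proxy mode, root on the node, and conntrack-tools installed; port 5432 is just an example):

    # Service "load balancing" in iptables mode is plain NAT rules
    # that kube-proxy programs into the host kernel:
    iptables-save -t nat | grep KUBE-SERVICES | head

    # Every proxied connection then lives as a conntrack entry:
    conntrack -L | grep 5432

    # kube-proxy even manages the conntrack sysctls itself
    # (see its --conntrack-max-per-core / --conntrack-min flags),
    # so "the kernel's table" and "K8s" aren't cleanly separable.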

But either way, they're still certainly GKE issues, because the whole thing was provisioned as a GKE K8s cluster, where I don't think I was really supposed to SSH into individual nodes and tweak things by hand.

> The issues you describe here are issues you created for the most part. They are not issues people run into in production with k8s, I can assure you of that.

Entirely irrespective of K8s or anything else... people don't create weird issues for themselves in production? ;-) I honestly suspect that making sub-optimal decisions and reaping their unintended consequences is one of the things that makes us human :-) And I'm sure someone out there is trying some weird stuff in production right now because they thought it would be a good idea. Maybe even with K8s (though not all that likely - people hack on complex systems less than on simple ones).

By the way, if you say connectivity hiccups aren't a thing in production-grade K8s, I really wonder what kind of issues people run into?

> Most likely something with your cloud provider

I remember that node-to-node host communication worked and the database was responsive, but the container was getting connection timeouts, which is why I suspect it was something with K8s.

But, yes, of course, it's possible it wasn't exactly a K8s issue but something with the cloud itself - given that I don't know what the problem was back then, I can't really confirm or refute this.
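If I ever hit something like that again, I'd at least try to pin down which layer drops the traffic before pointing fingers. Roughly (the pod name and DB address here are made up, and this assumes the pod image ships nc):

    # From the node itself: does the database answer at all?
    nc -vz -w 3 db.internal.example 5432

    # Same check from inside the affected pod's network namespace:
    kubectl exec -it my-app-pod -- nc -vz -w 3 db.internal.example 5432

    # If the node connects but the pod times out, the problem sits somewhere
    # between the pod's netns and the node: CNI, iptables/conntrack, NAT, etc.
    # conntrack stats can show whether entries are being dropped:
    conntrack -S | grep -E 'insert_failed|drop'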



