Hacker News | jdwyah's comments

If I could go back in time I would stop myself from ever learning about gRPC. I was so into the dream, but years later way too many headaches. Don’t do it to yourself.

Saying gRPC hides the internals is a joke. You'll get internals all right, when you're blasting debug logging trying to figure out what the f is causing 1 in 10 requests to fail, and fine-tuning 10-20 different poorly named timeout/retry settings.

Hours lost fighting with Maven plugins. Hours lost debugging weird deadline-exceeded errors. Hours lost with LBs that don't like esoteric HTTP/2. Firewall pain meaning we had to fall back to the standard API anyway. Crappy docs. Hours lost trying to get error messages that don't suck into observability.

I wish I’d never heard of it.


IMO the problem with gRPC isn't the protocol or the protobufs, but the terrible tooling - at least on the Java end. It generates shit code with awful developer ergonomics.

When you run the protobuf builder...

* The client stub is a concrete final class. It can't be mocked in tests.

* When implementing a server, you have to extend a concrete class (not an interface).

* The server method has an async method signature. Screws up AOP-oriented behavior like `@Transactional`

* No support for exceptions.

* Immutable value classes yes, but you have to construct them with builders.

The net result is that if you want to use gRPC in your SOA, you have to write a lot of plumbing to hide the gRPC noise and get clean, testable code.

There's no reason it has to be this way, but it is that way, and I don't want to write my own protobuf compiler.

Thrift's rpc compiler has many of the same problems, plus some others. Sigh.


> The client stub is a concrete final class. It can't be mocked in tests.

I believe this is deliberate: you are supposed to substitute a fake server. This is superior in theory, since you have much less scope to get error reporting wrong (errors actually go across a gRPC transport during the test).

Of course... at least with C++, there is no well-lit path for actually _doing_ that, which seems bonkers. In my case I had to write a bunch of undocumented boilerplate to make it happen.

IIUC for Stubby (Google's internal precursor to gRPC) those kinda bizarre ergonomic issues are solved.


Stubby calls (at least in Java) just use something called a GenericServiceMocker which is akin to a more specialised mockito.

In my experience, only Swift has a generator that produces good-quality code. Ironically, it’s developed by Apple.

Any alternatives that take a similar philosophy but get the tooling right?

Depends what you mean by "similar philosophy". We (largeish household name though not thought of as a tech company) went through a pretty extensive review of the options late last year and standardized on this for our internal service<->service communication:

https://github.com/stickfigure/trivet

It's the dumbest RPC protocol you can imagine, less than 400 lines of code. You publish a vanilla Java interface in a jar; you annotate the implementation with `@Remote` and make sure it's in the spring context. Other than a tiny bit of setup, that's pretty much it.

The main downside is that it's based on Java serialization. For us this is fine, we already use serialization heavily and it's a known quantity for our team. Performance is "good enough". But you can't use this to expose public services or talk to nonjava services. For that we use plain old REST endpoints.

The main upsides are developer ergonomics, easy testability, spring metrics/spans pass through remote calls transparently, and exceptions (with complete stacktraces) propagate to clients (even through multiple layers of remote calls).
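The general shape of that design can be sketched in a few lines. This is a Python analog, not trivet's actual API (the real thing is Java serialization plus Spring; every name below is illustrative): serialize the call, dispatch it reflectively, and ship exceptions back as first-class values so they re-raise on the client.

```python
import pickle

# Registry of "remote" implementations, keyed by interface name.
_services = {}

def remote(cls):
    """Decorator standing in for a trivet-style @Remote annotation."""
    _services[cls.__name__] = cls()
    return cls

def dispatch(request_bytes):
    """Server side: unpickle (service, method, args), invoke reflectively,
    pickle the result. Exceptions are pickled too, so they propagate to
    the caller intact."""
    service, method, args = pickle.loads(request_bytes)
    try:
        result = getattr(_services[service], method)(*args)
        return pickle.dumps(("ok", result))
    except Exception as e:
        return pickle.dumps(("err", e))

class Proxy:
    """Client side: a vanilla-looking object whose methods go over the 'wire'."""
    def __init__(self, service, transport):
        self._service = service
        self._transport = transport  # any callable taking/returning bytes

    def __getattr__(self, method):
        def call(*args):
            status, payload = pickle.loads(
                self._transport(pickle.dumps((self._service, method, args))))
            if status == "err":
                raise payload  # the original exception, not a wrapper
            return payload
        return call

# Usage: an implementation, called through the proxy.
@remote
class Greeter:
    def greet(self, name):
        return f"hello {name}"

client = Proxy("Greeter", dispatch)  # transport is a direct call here
print(client.greet("world"))  # hello world
```

The appeal is exactly what the comment describes: callers see a plain interface, and there is almost no machinery between the call site and the implementation.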

I wrote it some time ago. It's not for everyone. But when our team (well, the team making this decision for the company) looked at the proof-of-concepts, this is what everyone preferred.


Yes, it's good for internal use.

The caveat is when you need to go elsewhere. I still remember the pain of the Hadoop ecosystem having this kind of API.


Protobuf is an atrocious protocol. Whatever other problems gRPC has may be worse, but protobuf doesn't make anything better, that's for sure.

The reason to use it may be that you are required to by a side you cannot control, or that this is the only thing you know. Otherwise it's a disaster. It's really upsetting that a lot of things used in this domain are the author's first attempt at making something of the sort. So many easily preventable disasters exist in this protocol for no reason.


Agreed. As an example, this proto generates 584 lines of C++, pulls in 173k lines of dependencies, and produces a 21 KB object file, even before adding gRPC:

    syntax = "proto3";

    message LonLat {
      float lon = 1;
      float lat = 2;
    }

Looking through the generated headers, they are full of autogenerated slop with loads of dependencies, all to read a struct with 2 primitive fields. For a real monorepo, this adds up quickly.


This is because protobuf supports full run-time reflection and compact serialization (protobuf binary objects are not self-describing), and this requires a bit of infrastructure.

This is a large chunk of code, but it is a one-time tax. The incremental size from this particular message is insignificant.


Can you elaborate?

Some very obvious and easily avoidable problems (of the binary format):

* Messages are designed in such a way that only the sizes of the constituents are given; the size of the container message isn't known, so the top-level message doesn't record its size. This forces you to invent an extra bit of binary format when deciding how to delimit top-level messages, and different Protobuf implementations do it differently. So, if you have two clients independently implementing the same spec, it's possible that the two will never be able to talk to the same service. (This doesn't happen a lot in practice, because most developers use client generators developed by the same team, so coincidentally they all get the same solution to the same problem; but alternative tools exist, and they actually differ in this respect.)

* Messages were designed in such a way as to implement the "+" operator in C++: a completely worthless property, never used in practice... but this design choice made the authors require that repeated keys in messages be allowed and that the last key wins. This precludes SAX-like parsing of the payload, since no processing can take place before the entire payload is received.

* Protobuf is rife with other useless properties, added exclusively to support Google's use-cases. Various containers for primitive types to make them nullable. JSON conversion support (which doesn't work all the time because it relies on an undocumented naming convention).

* Protobuf payload doesn't have a concept of version / identity. It's possible, and in fact happens quite a bit, that an incorrect schema is applied to a payload and the operation "succeeds", but the resulting interpretation of the message differs from the intended one.

* The concept of default values, which is supposed to allow not sending some values, is another design flaw: it makes it easy to misinterpret the payload. Depending on how the reader language deals with absence of values, the results of the parse will vary, sometimes leading to unintended consequences.

* It's not possible to write a memory-efficient encoder because it's hard / impractical sometimes to calculate the length of the message constituents, and so, the typical implementation is to encode the constituents in a "scratch" buffer, measure the outcome, and then copy from "scratch" to the "actual" buffer, which, on top of this, might require resizing / wasting memory for "padding". If, on the other hand, the implementation does try to calculate all the lengths necessary to calculate the final length of the top-level message, it will prevent it from encoding the message in a single pass (all components of the message will have to be examined at least twice).
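Two of the bullets above (top-level delimiting and single-pass encoding) are easy to see concretely. A sketch in Python: this is my own toy model, not any real protobuf implementation, and the framing convention shown is just one of several in the wild (it matches what Java's writeDelimitedTo does).

```python
def encode_varint(n):
    """Protobuf base-128 varint: 7 bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

def decode_varint(buf, pos=0):
    """Return (value, next_position)."""
    n = shift = 0
    while True:
        b = buf[pos]
        n |= (b & 0x7F) << shift
        pos += 1
        if not (b & 0x80):
            return n, pos
        shift += 7

# Delimiting: protobuf itself never frames the top-level message, so each
# stack invents its own convention. A common one is a varint length prefix:
def frame(payload):
    return encode_varint(len(payload)) + payload

stream = frame(b"msg-one") + frame(b"msg-two")
size, pos = decode_varint(stream)
print(stream[pos:pos + size])  # b'msg-one'

# Sizing: every nested length prefix depends on the sizes of everything
# inside it, so a single-buffer encoder needs a recursive bottom-up sizing
# pass before it can write byte one. Toy model (field tags omitted): a
# message is a dict of field -> bytes or sub-message.
def body_size(msg):
    total = 0
    for value in msg.values():
        if isinstance(value, dict):
            inner = body_size(value)
            total += len(encode_varint(inner)) + inner  # prefix + body
        else:
            total += len(value)
    return total

nested = {"a": b"xxxx", "b": {"c": b"yy", "d": {"e": b"z"}}}
print(body_size(nested))  # 9
```

The alternative to the sizing pass is exactly the scratch-buffer-and-copy approach described in the last bullet.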

----

Had the author of this creation tried to use it for a while, he'd have known about these problems and would have tried to fix them, I'm sure. What I think happened is that this was the author's first ever attempt at doing this, and he never looked back, switching to other tasks, while whoever picked up the task after him was too scared to fix the problems (I hear the author was a huge deal at Google, and so nobody would tell him how awful his creation was).


> Had the author of this creation tried to use it for a while,...

The problem is that proto v1 has existed for over 20 years internally at Google. And being able to be backwards compatible is extremely important.

Edit. Oh. You're an LLM


Your problems have more to do with particular implementations than with the gRPC/protobuf specs themselves.

The modern .NET and C# experience with gRPC is so good that Microsoft has sunset its legacy RPC tech like WCF and gone all in on gRPC.


Agreed. The newest versions of .NET are now chef’s kiss and so damn fast.

I would really like it if the proto-to-C# compiler created nullable members. Hazzers IMO give poor DX and are error-prone.

The biggest project I’ve used it with was in Java.

Validating the output of the bindings protoc generated was more verbose and error-prone than hand-serializing the data would have been.

The wire protocol is not type safe. It has type tags, but they reuse the same tags for multiple datatypes.

Also, zig-zag integer encoding is slow.
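For reference, a small sketch (Python; illustrative, not a real protobuf decoder) of the zigzag scheme and of how one wire type is shared across several schema types:

```python
def zigzag_encode(n):
    """Protobuf sint encoding: interleave signs so small-magnitude
    negatives stay short: 0->0, -1->1, 1->2, -2->3, ...
    (Python's arbitrary-precision >> stands in for the 64-bit
    arithmetic shift a real implementation would use.)"""
    return (n << 1) ^ (n >> 63)

def zigzag_decode(z):
    return (z >> 1) ^ -(z & 1)

def field_key(field_number, wire_type):
    """A field's key varint packs (field_number << 3) | wire_type.
    Wire type 0 (varint) is shared by int32, uint32, sint32, int64,
    bool, enum... so the bytes alone can't say which one it is."""
    return (field_number << 3) | wire_type

print(zigzag_encode(-2))   # 3
print(zigzag_decode(3))    # -2
# The same wire varint means different things under different schemas:
# varint 1 read as int32 is 1, read as sint32 it is -1.
print(zigzag_decode(1))    # -1
```

That last line is the type-safety complaint in miniature: nothing in the payload distinguishes the two readings; only the schema does.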

Anyway, it’s a terrible RPC library. Flatbuffer is the only one that I’ve encountered that is worse.


What do you mean by validating the bindings? gRPC is type-safe. You don't have to think about that part anymore.

But as the article mentions, OpenAPI is also an RPC library with stub generation.

Manual parsing of the JSON is IMHO really old-school.

But it depends on your use case. That's the whole point: it depends.


> The wire protocol is not type safe. It has type tags, but they reuse the same tags for multiple datatypes.

When is this ever an issue in practice? Why would the client read int32 but then all of a sudden decide to read uint32?


I guess backwards incompatible changes to the protocol? But yeah, don't do that if you're using protobuf; it's intentionally not robust to it.

Since you mention Maven I'm going to make the assumption that you are using Java. I haven't used Java in quite a while. The last 8 years or so I've been programming Go.

Your experience of gRPC seems to be very different from mine. How much of the difference in experience do you think might be down to Java and how much is down to gRPC as a technology?


It's not Java itself, it's design decisions on the tooling that Google provides for Java, mostly the protobuf-gen plugin.

At my company we found some workarounds for the issues the GP brought up, but it's annoying that the tooling is a bit subpar.


Have you tried the buf.build tools? Especially the remote code generation and package generation may make life easier for you.

a couple of links

https://buf.build/protocolbuffers/java?version=v29.3
https://buf.build/docs/bsr/generated-sdks/maven


I've used gRPC with a Go+Dart stack for years and never experienced these issues. Is it something specific to Java+gRPC?

Go and Dart are probably the languages most likely to work well with gRPC, given their provenance.

Google has massive amounts of code written in Java so one would think the Java tooling would be excellent as well.

Doesn't Google mostly use Stubby internally, only bridging it with gRPC for certain public-facing services?

Google also uses a completely different protocol stack to actually send Stubby/Protobuf/gRPC around, including protocols on the wire and bypassing the kernel (according to open access papers about PonyExpress etc)

As someone who used it for years with the same problems he describes... spot-on analysis. The library does too much for you (e.g. reconnection handling), and handling even basic recovery is a bit of a nuisance for newbies. And yes, when you get random failures, good luck figuring out that maybe it's just a router in the middle of the path dropping packets because its HTTP/2 filtering is full of bugs.

I like a lot of things about it and used it extensively instead of the inferior REST alternative, but I recommend being aware of the limitations/nuisances. Not all issues will be solved by simply looking at Stack Overflow.


What would you recommend doing instead?

Web sockets would probably be easy.

Some web socket libraries support automatic fallback to polling if the infrastructure doesn’t support web sockets.


Do you need bidirectional streams? If so, you should write a bespoke protocol, on top of UDP, TCP or websockets.

If you don't, use GraphQL.


"Write a protocol and GraphQL", god damn it escalates quickly.

Fortunately, there are intermediate steps.


Any suggestions for a good RPC library?

I have had a really good experience with https://connectrpc.com/ so far. Buf is doing some interesting things in this space https://buf.build/docs/ecosystem/

I've used twitchtv/twirp with success. I like it because it's simple and doesn't reinvent itself over and over again.

What about single-directional streams? GraphQL streams aren't widely supported yet, are they? GraphQL also strikes me as a weird alternative to protobufs, as the latter works so hard for performance with binary payloads, while GraphQL is typically human-readable, bloaty text. And they aren't really queries; you can just choose to ignore parts of the return from an RPC.

Spanner has been a terrific part of our architecture. It's a great piece of tech and very impressive from a cost/scalability/ease-of-use perspective.

It's particularly worth taking a look if you have a postgres table that has the potential to become enormous.


This is such an awesome idea.

I remember spending an extra $30 at 2am to overnight "Gripe Water" in the hope that it would get my kid to sleep. #takeMyMoney


"A breakpoint for logging is usually scalability, because they are expensive to store and index."

I hope 2024 is the year we realize that if we make log levels dynamically updatable, we can have our cake and eat it too. We feel stuck in a world where all logging is either useless because it's off, or on and expensive. All you need is a way to easily flip a log level on or off without restarting, and this gets a lot better.
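A minimal sketch of the idea using Python's stdlib logging; in a real setup the setLevel call would be driven by a config watcher, signal handler, or admin endpoint rather than sitting inline like this:

```python
import logging

logger = logging.getLogger("app")
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)

logger.debug("expensive detail")  # suppressed: level is INFO

# Guard expensive log-message construction so it costs ~nothing when off:
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state dump: %s", "some costly computation")

# Flip the level at runtime; no restart needed, debug output starts flowing.
logger.setLevel(logging.DEBUG)
logger.debug("now visible")
```

The same pattern exists in most logging stacks (Logback's JMX/file scan, Log4j2's monitorInterval, etc.); the hard part is usually wiring the trigger, not the level change itself.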


That's probably true for some uses of logging, but for information about historical events you're stuck with whatever information you get from the log level you had in the past.


I have such a distinct memory of being at a friend's house and seeing the California raisins. It was the coolest thing I'd ever seen.

Feels like 100 years ago.


I'd almost forgotten about the California raisins until I came across this article. The original Gorillaz.


Perplexity has definitely stolen >10% of my search traffic.

Just got meta glasses. Curious whether they'll steal another 15% of my traffic. Primarily the "looking up a random fact for my kids" volume.


“But no one shook me by the shoulders, saying how crazy that was.”

If you’re lookin for a book that has absolutely kept this sense of wonder, “Immune” by Philipp Dettmer has this in spades. Highly recommended.


super easy. fun to play with. fast.

we screwed around with it on a live stream: https://www.youtube.com/live/3YhBoox4JvQ?si=dkni5LY3EALnWVuE...

If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.


> If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.

I don't believe that at all. I don't have any kind of local LLM. My mother doesn't, either. Nor does my sister. My girlfriend? Nope.


I had a day of it last week, debugging Rails / Kubernetes / Memory usage & OOM kills. I wrote it up so that maybe you can avoid my fate.

I won't spoil the surprise, but I did end up with one takeaway that I'll remember going forward when diagnosing.


Jeffrey & I build a rate limiter in Ruby:

- visualizing a rate limiter
- leaky bucket vs token bucket
- capacity vs rate
- dynamic rate limits
- use cases for rate limits
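For anyone who'd rather skim than watch, the token-bucket variant mentioned above can be sketched in a few lines (Python here rather than Ruby; time is passed in explicitly so the behavior is deterministic and testable):

```python
class TokenBucket:
    """Token-bucket limiter: capacity bounds burst size, refill rate
    bounds sustained throughput."""
    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.last = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=2, refill_per_sec=1)
print([bucket.allow(t) for t in (0.0, 0.0, 0.0)])  # [True, True, False]
print(bucket.allow(1.0))  # True: one token refilled after a second
```

A leaky bucket is the mirror image: instead of accumulating permission tokens, it drains queued work at a fixed rate, smoothing bursts rather than allowing them.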

