If I could go back in time I would stop myself from ever learning about gRPC. I was completely sold on the dream, but years later it's brought way too many headaches. Don’t do it to yourself.
Saying gRPC hides the internals is a joke. You’ll get internals all right, when you’re blasting debug logging trying to figure out what the f is causing 1 in 10 requests to fail, and fine-tuning 10-20 different poorly named timeout/retry settings.
Hours lost fighting with Maven plugins. Hours lost debugging weird DEADLINE_EXCEEDED errors. Hours lost with load balancers that don’t like the esoteric HTTP/2. Firewall pain meaning we had to use the standard API anyway. Crappy docs. Hours lost trying to get error messages that don’t suck into observability.
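For a sense of how many knobs are involved, here's a hedged sketch of typical grpc-java channel tuning; the builder methods are real grpc-java API, but the host and values are illustrative, not recommendations:

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

class ChannelTuning {
    // A small sample of the channel-level settings you end up adjusting.
    static ManagedChannel build() {
        return ManagedChannelBuilder.forAddress("svc.internal", 443) // hypothetical host
                .enableRetry()                        // retries are opt-in
                .maxRetryAttempts(3)
                .keepAliveTime(30, TimeUnit.SECONDS)  // HTTP/2 pings so proxies don't reap idle conns
                .keepAliveTimeout(10, TimeUnit.SECONDS)
                .idleTimeout(5, TimeUnit.MINUTES)
                .build();
        // Deadlines are separate again, set per-stub rather than per-channel:
        //   stub.withDeadlineAfter(2, TimeUnit.SECONDS).someCall(request);
    }
}
```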
IMO the problem with gRPC isn't the protocol or the protobufs, but the terrible tooling - at least on the Java end. It generates shit code with awful developer ergonomics.
When you run the protobuf builder...
* The client stub is a concrete final class. It can't be mocked in tests.
* When implementing a server, you have to extend a concrete class (not an interface).
* The server methods have async signatures. This screws up AOP-style behavior like `@Transactional`.
* No support for exceptions.
* Immutable value classes yes, but you have to construct them with builders.
The net result is that if you want to use gRPC in your SOA, you have to write a lot of plumbing to hide the gRPC noise and get clean, testable code.
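A hedged sketch of that plumbing; the `UserServiceGrpc` blocking-stub shape is the standard grpc-java codegen pattern, and the rest of the names are hypothetical:

```java
// An interface you own, so tests can mock it.
public interface UserClient {
    UserReply getUser(UserRequest request);
}

// The only class that touches gRPC types.
public final class GrpcUserClient implements UserClient {
    private final UserServiceGrpc.UserServiceBlockingStub stub;

    public GrpcUserClient(UserServiceGrpc.UserServiceBlockingStub stub) {
        this.stub = stub;
    }

    @Override
    public UserReply getUser(UserRequest request) {
        try {
            return stub.getUser(request);
        } catch (io.grpc.StatusRuntimeException e) {
            // Translate gRPC status codes into your own exception hierarchy.
            throw new UserServiceException(e.getStatus().getCode(), e);
        }
    }
}
```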
There's no reason it has to be this way, but it is that way, and I don't want to write my own protobuf compiler.
Thrift's rpc compiler has many of the same problems, plus some others. Sigh.
> The client stub is a concrete final class. It can't be mocked in tests.
I believe this is deliberate: you are supposed to substitute a fake server. This is superior in theory, since errors actually travel across a real gRPC transport during the test, leaving much less scope to get error reporting wrong.
Of course... at least with C++, there is no well-lit path for actually _doing_ that, which seems bonkers. In my case I had to write a bunch of undocumented boilerplate to make it happen.
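For what it's worth, grpc-java does have a lit path here: the in-process transport. A minimal sketch (the fake service class is hypothetical):

```java
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.inprocess.InProcessChannelBuilder;
import io.grpc.inprocess.InProcessServerBuilder;
import java.io.IOException;

class InProcessFake {
    static ManagedChannel channelToFake() throws IOException {
        String name = InProcessServerBuilder.generateName();
        Server server = InProcessServerBuilder.forName(name)
                .directExecutor()
                .addService(new FakeUserService()) // hand-written fake implementation
                .build()
                .start();
        // Stubs built on this channel hit the fake, and errors still travel
        // across a real gRPC transport, as the parent comment recommends.
        return InProcessChannelBuilder.forName(name).directExecutor().build();
    }
}
```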
IIUC, for Stubby (Google's internal precursor to gRPC) those kinds of bizarre ergonomic issues are solved.
Depends what you mean by "similar philosophy". We (largeish household name though not thought of as a tech company) went through a pretty extensive review of the options late last year and standardized on this for our internal service<->service communication:
It's the dumbest RPC protocol you can imagine, less than 400 lines of code. You publish a vanilla Java interface in a jar; you annotate the implementation with `@Remote` and make sure it's in the spring context. Other than a tiny bit of setup, that's pretty much it.
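A hedged sketch of what that looks like in practice; only `@Remote` is from the comment above, and all the wiring details are assumed:

```java
// Published in a shared jar: a vanilla Java interface.
public interface PriceService {
    String quoteFor(String sku); // arguments and returns go over Java serialization
}

// Server side: annotate the implementation and put it in the Spring context.
@Remote
@org.springframework.stereotype.Service
public class DefaultPriceService implements PriceService {
    @Override
    public String quoteFor(String sku) {
        return "USD 9.99 for " + sku;
    }
}

// Client side: a proxy implementing PriceService is injected wherever the
// interface is @Autowired; calls, return values, and thrown exceptions
// (stacktraces included) cross the wire transparently.
```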
The main downside is that it's based on Java serialization. For us this is fine; we already use serialization heavily and it's a known quantity for our team. Performance is "good enough". But you can't use this to expose public services or talk to non-Java services. For that we use plain old REST endpoints.
The main upsides are developer ergonomics, easy testability, spring metrics/spans pass through remote calls transparently, and exceptions (with complete stacktraces) propagate to clients (even through multiple layers of remote calls).
I wrote it some time ago. It's not for everyone. But when our team (well, the team making this decision for the company) looked at the proof-of-concepts, this is what everyone preferred.
Protobuf is an atrocious protocol. Whatever other problems gRPC has may be worse, but Protobuf doesn't make anything better, that's for sure.
The reason to use it may be that you are required to by the side you cannot control, or that it's the only thing you know. Otherwise it's a disaster. It's really upsetting that a lot of things used in this domain are the author's first attempt at building something of the sort. So many easily preventable disasters exist in this protocol for no reason.
Agree. As an example, this proto generates 584 lines of C++, links to 173k lines of dependencies, and produces a 21 KB object file, even before adding gRPC:
Looking through the generated headers, they are full of autogenerated slop with loads of dependencies, all to read a struct with 2 primitive fields. For a real monorepo, this adds up quickly.
This is because protobuf supports full run-time reflection and compact serialization (protobuf binary objects are not self-describing), and this requires a bit of infrastructure.
This is a large chunk of code, but it is a one-time tax. The incremental size from this particular message is insignificant.
Some very obvious and easily avoidable problems (of the binary format):
* Messages are designed in such a way that only the sizes of the constituents are given; the size of the containing message isn't recorded. Therefore a top-level message doesn't know its own size, and anyone who needs to delimit top-level messages has to invent an extra bit of binary format. Different Protobuf implementations do it differently (see the sketch after this list), so if you have two clients independently implementing the same spec, it's possible that the two will never be able to communicate with the same service. (This doesn't happen a lot in practice, because most developers generate clients with tools built by the same team, so they all coincidentally land on the same solution; but alternative tools exist, and they really do differ in this respect.)
* Messages were designed in such a way as to support a `+` (merge) operator in C++. A completely worthless property, never used in practice... but this design choice forced the authors to require that repeated keys in a message be allowed, with the last key winning. This precludes SAX-style parsing of the payload, since no processing can take place before the entire payload has been received.
* Protobuf is rife with other useless properties added exclusively to support Google's use cases: various wrapper containers for primitive types to make them nullable, and JSON conversion support (which doesn't always work, because it relies on an undocumented naming convention).
* A Protobuf payload has no concept of version or identity. It's possible, and in fact happens quite a bit, that the wrong schema is applied to a payload and the operation "succeeds", but the resulting interpretation of the message differs from what was intended.
* The concept of default values, which is supposed to allow omitting some values on the wire, is another design flaw: it makes the payload easy to misinterpret. Depending on how the reading language handles absent values, the result of the parse will vary, sometimes with unintended consequences.
* It's not possible to write a memory-efficient encoder, because it's sometimes hard or impractical to compute the lengths of a message's constituents up front. The typical implementation encodes each constituent into a "scratch" buffer, measures the result, and copies from "scratch" into the "actual" buffer, which on top of that may require resizing and wasted "padding" memory. If an implementation instead tries to precompute all the lengths needed to derive the final length of the top-level message, it can no longer encode in a single pass: every component of the message has to be examined at least twice.
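To make the first bullet concrete, a sketch in Java; `writeDelimitedTo`/`parseDelimitedFrom` are real protobuf-java helpers that prepend a varint length, but that's just one framing convention among several in the wild (`UserRequest` is a hypothetical generated message):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FramingExample {
    public static void main(String[] args) throws IOException {
        // A serialized message doesn't carry its own length, so putting several
        // on one stream forces you to invent a framing convention.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        UserRequest.newBuilder().setId(1).build().writeDelimitedTo(out); // varint prefix
        UserRequest.newBuilder().setId(2).build().writeDelimitedTo(out);

        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        UserRequest first = UserRequest.parseDelimitedFrom(in);
        // A peer that chose different framing (fixed 4-byte length, record
        // separators, ...) can't read this stream, even with the same .proto.
    }
}
```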
----
Had the author of this creation tried to use it for a while, he'd have known about these problems and tried to fix them, I'm sure. What I think happened is that this was the author's first ever attempt at building something like it, and he never looked back, moving on to other tasks, while whoever picked up the work after him was too scared to fix the problems (I hear the author was a huge deal at Google, so nobody would tell him how awful his creation was).
Since you mention Maven I'm going to make the assumption that you are using Java. I haven't used Java in quite a while. The last 8 years or so I've been programming Go.
Your experience of gRPC seems to be very different from mine. How much of the difference in experience do you think might be down to Java and how much is down to gRPC as a technology?
Google also uses a completely different protocol stack to actually send Stubby/Protobuf/gRPC traffic around, including custom on-the-wire protocols and kernel bypass (according to open-access papers about PonyExpress etc.).
As someone who used it for years and hit the same problems he describes... spot-on analysis. The library does too much for you (e.g. reconnection handling), and handling even basic recovery is a bit of a nuisance for newbies. And yes, when you get random failures, good luck figuring out that maybe it's just a router in the middle of the path dropping packets because its HTTP/2 filtering is full of bugs.
I like a lot of things about it and used it extensively instead of the inferior REST alternative, but I recommend being aware of the limitations/nuisances. Not every issue can be solved with a quick look at Stack Overflow.
What about single-directional streams? GraphQL streams aren't widely supported yet, are they? GraphQL also strikes me as a weird alternative to protobufs, since the latter works so hard for performance with binary payloads while GraphQL is typically human-readable, bloaty text. And they aren't really queries; you can just choose to ignore parts of the return of an RPC.
"A breakpoint for logging is usually scalability, because they are expensive to store and index."
I hope 2024 is the year we realize that if we make log levels dynamically updatable, we can have our cake and eat it too. We feel stuck in a world where all logging is either useless because it's off, or on and expensive. All you need is a way to easily change the log level without restarting, and this gets a lot better.
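A minimal sketch of the idea with Logback (the API calls are real; in practice you'd expose this through an admin endpoint or a config watcher rather than call it directly):

```java
import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import org.slf4j.LoggerFactory;

class DynamicLogLevels {
    // Flip a logger's level at runtime; no restart required.
    static void setLevel(String loggerName, Level level) {
        Logger logger = (Logger) LoggerFactory.getLogger(loggerName);
        logger.setLevel(level);
    }
}
// e.g. setLevel("com.example.payments", Level.DEBUG) during an incident, then
// back to Level.WARN afterwards. Spring Boot ships the same idea as the
// /actuator/loggers endpoint.
```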
That's probably true for some uses of logging, but for information about historical events you're stuck with whatever information you get from the log level you had in the past.
If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.
> If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.
I don't believe that at all. I don't have any kind of local LLM. My mother doesn't, either. Nor does my sister. My girlfriend? Nope.
Jeffrey & I build a rate limiter in Ruby:
- visualizing a rate limiter
- leaky bucket vs token bucket (see the sketch after this list)
- capacity vs rate
- dynamic rate limits
- use cases for rate limits
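For anyone skimming, a minimal token-bucket sketch (in Java to match the rest of the thread; the episode itself uses Ruby):

```java
// Token bucket: capacity bounds bursts, refill rate bounds sustained throughput.
class TokenBucket {
    private final double capacity;     // max burst size
    private final double refillPerSec; // sustained rate
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double capacity, double refillPerSec) {
        this.capacity = capacity;
        this.refillPerSec = refillPerSec;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillPerSec);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false; // rate limited
    }
}
// A leaky bucket is the mirror image: requests queue and drain at a fixed
// rate, smoothing bursts instead of allowing them.
```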