That's what you get for GraphQL not having an algebra. If it had an algebra you ...

obi1kenobi · on May 6, 2022

If you want to see what a GraphQL with an algebra could look like, I built one! The query language is parsed with a vanilla GraphQL parser, but has directives like `@filter`, `@recurse`, `@optional`, etc.

10min talk video: https://www.hytradboi.com/2022/how-to-query-almost-everythin...

GitHub: https://github.com/obi1kenobi/trustfall

mumblemumble · on May 6, 2022

This seems like a "be careful what you wish for" situation.

Sure, you could set up an algebra that allows you to handle arbitrary queries for zero extra programmer effort, just like a SQL database engine does. And then you could even expose it to users, and let them execute arbitrary queries.

And then, later, after you're done cleaning the molten slag off the server room floor, you could stop and reflect on whether that was really such a necessary thing to do.

PaulHoule · on May 6, 2022

If you had a rigorously defined system you could put rigorous limits on it.

If it's not rigorously defined there are no limits, just what people can get away with.

With GraphQL you get the worst of both worlds: people can't write arbitrary queries, but they can still trash the system. At least with undefined semantics people don't need to argue about whether or not they got the right answers.
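
As one illustration of what a "rigorous limit" could look like, here is a minimal sketch of a depth limit on a nested query. It assumes the query's selection set has already been parsed into nested dicts; `query_depth`, `check_limit`, and the sample query are hypothetical names for illustration, not part of any GraphQL library.

```python
def query_depth(selection):
    """Depth of a nested selection; a leaf field maps to an empty dict."""
    if not selection:
        return 0
    return 1 + max(query_depth(child) for child in selection.values())


def check_limit(selection, max_depth):
    """Reject the query before execution if it nests deeper than allowed."""
    depth = query_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


# Hypothetical query: user -> friends -> friends -> name
query = {"user": {"friends": {"friends": {"name": {}}}}}
print(check_limit(query, max_depth=4))
```

A depth cap is only one possible limit; it bounds nesting, not the number of rows a high-degree node can return.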

ryanbrunner · on May 6, 2022

"Rigorous limits" for a sufficiently large database means "uses our hand-picked indexes effectively", which reduces down to "provides the same functionality as a REST API" since you need basically a whitelisted list of acceptable operations. At best you can reduce transfer time by limiting columns returned, which is something but not really worth the added complexity.

lubesGordi · on May 6, 2022

I guess I've always assumed GraphQL would be a nice way of implementing a REST API, not something you'd expose to the customer directly.

mumblemumble · on May 7, 2022

My experience trying to maintain databases that are directly exposed to multiple development teams tells me that even exposing a fully generic querying API internally is risky.

Which, just for context - that's not me saying "graphQL is bad", it's me saying, "graphQL making it hard to do that is a feature, not a bug."

striking · on May 6, 2022

Ok, use https://github.com/join-monster/join-monster. If you need autogeneration from the DB instead of hand-curated joins defined on the schema, consider https://www.graphile.org/postgraphile/ or https://hasura.io/.

dustingetz · on May 6, 2022

hand rolling a custom query engine - the exact opposite of what every business wanted when the engineers sold it graphql

obi1kenobi · on May 6, 2022

Why hand-roll one when you can use one that's already available and thoroughly tested :)

https://github.com/obi1kenobi/trustfall

(Hi Dustin!)

bfz · on May 6, 2022

I've yet to encounter a GraphQL off-the-shelf server (from Python and JS spaces) where hitting a slow query didn't immediately turn into half a day's work

The whole concept is what happens when you let a smart person work on a small problem for far, far too long

obi1kenobi · on May 6, 2022

I'd recommend checking out the project link in the comment to which you replied. It is designed _specifically_ to avoid the problem you mention: instead of a fully materialized, fully-nested result, it returns flattened row-oriented results (like a SQL database).

This allows for lazy evaluation, i.e. rows are produced only as they are consumed. So if you accidentally write a query that would produce a billion rows but you only load 20, the query executes only for those 20 rows, plus any batching or prefetch optimizations in the adapter used to bind the dataset to the query engine.
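
The lazy-evaluation idea can be sketched with a plain Python generator. This is not Trustfall's API; `run_query` is a hypothetical stand-in for a query engine that yields rows on demand, and the `produced` list exists only to count how many rows were actually computed.

```python
from itertools import islice


def run_query(total_rows, produced):
    """Hypothetical query: yields rows lazily, one at a time."""
    for i in range(total_rows):
        produced.append(i)  # record that this row was actually computed
        yield {"row": i}


produced = []
# The query could yield a billion rows, but the consumer takes only 20.
rows = list(islice(run_query(1_000_000_000, produced), 20))
print(len(rows), len(produced))
```

Because the generator is only advanced when the consumer asks for the next row, the work done tracks the rows consumed rather than the rows the query could produce.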

PaulHoule · on May 6, 2022

It is a fundamental problem of a "graph".

(1) There are usually some nodes of very high degree and traversing those nodes will explode your query, (2) if you are following N links and the average degree is d, you are going to come across dᴺ nodes and that is a lot of nodes as N gets big!
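
To put rough numbers on point (2), here is a small sketch of the dᴺ estimate. It assumes every node has exactly the average degree and that no node is reached twice, so it is an upper-bound estimate rather than a measurement; `nodes_reached` is a hypothetical helper.

```python
def nodes_reached(d, n):
    """Estimated nodes reached after following n links with average degree d."""
    return d ** n


# With an average degree of 10, each extra hop multiplies the frontier by 10.
for hops in (1, 2, 4, 8):
    print(hops, nodes_reached(10, hops))
```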

Tim Berners-Lee told me that if you can't send the whole graph you should send a subset of the graph that contains the most important facts.

It's a right answer but also a frustrating one to a programmer who sees correct implementation of algorithms to mean that you get the ticket done and they don't come at you with a ticket about it again. That is, that query I'm writing is part of an algorithm that depends on getting a certain answer and getting an uncertain answer for one query is like some spoiled milk that ruins the whole batch.

bfz · on May 6, 2022

> It is a fundamental problem of a "graph".

So why are we using it for so many naturally non-graph problems? 90%+ of developers' exposure to graphs is through tightly abstract interfaces, I could name maybe 3 graph-related algorithms off the top of my head, but could implement none of them without reading.

We could represent the text of this comment in a graph using one node for each unique character, but the result would be stupid, the operations would be slow, the representation needlessly complex, and implementations guaranteeably hard to work with

> Tim-Berners Lee told me that if you can't send the whole graph you should send a subset of the graph that contains the most important facts.

Indeed, I also caught the ReST buzz around the 2000-2003 timeframe, and turns out 20 years later nobody does that either, because in its purest form it's a pain in the ass for comparable reasons to the topic at hand

PaulHoule · on May 6, 2022

It's funny to see a blog post on HN almost every day where somebody rediscovers the power of columnar query answering engines which are almost the opposite of graph databases.

I've lost count of how many columnar SQL databases have been donated to the Apache project, and there are so many systems like Actian and Alteryx where data analysts hook together relational operators with boxes and lines.

I had a prototype of a stream processing engine that passed RDF graphs along the lines between the boxes, which enabled an "object-relational" model and could eliminate the need for hard-to-maintain joins. But I found that firms that had bought multiple columnar processing database companies believed in performance at all costs and couldn't care less about any system that couldn't be implemented with SIMD instructions.

eurasiantiger · on May 6, 2022

How are they opposite? There are plenty of graph databases out there using columnar storage, even ones directly compatible with GraphQL Federation. Best of both worlds, so to speak.

everforward · on May 6, 2022

> So why are we using it for so many naturally non-graph problems? 90%+ of developers' exposure to graphs is through tightly abstract interfaces, I could name maybe 3 graph-related algorithms off the top of my head, but could implement none of them without reading.

It's a reasonable abstraction for structuring related bits of data (like what would go in a typical relational database), and that abstraction can align more easily with the developer's mental model.

E.g. ORMs basically convert SQL data into an in-memory graph. Likewise, graph database APIs are natively more object-y; you follow the edge from child to parent, instead of making a bit of data the same in both tables and then querying matching rows.
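
The difference between the two styles can be sketched in a few lines. This is a toy illustration with made-up data, not any particular ORM or graph database API: the relational version finds the parent by matching a shared key value, while the graph/object version just follows a reference.

```python
from dataclasses import dataclass
from typing import Optional

# Relational style: both tables carry the same key value, and related
# rows are found by matching on it.
parents = [{"id": 1, "name": "Ada"}]
children = [{"id": 10, "name": "Bo", "parent_id": 1}]


def parent_of_row(child_row):
    """Look up the parent row whose id matches the child's parent_id."""
    return next(p for p in parents if p["id"] == child_row["parent_id"])


# Graph/object style: the child holds an edge (reference) to its parent.
@dataclass
class Person:
    name: str
    parent: Optional["Person"] = None


ada = Person("Ada")
bo = Person("Bo", parent=ada)

print(parent_of_row(children[0])["name"])  # key matching
print(bo.parent.name)                      # edge traversal
```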

They're not perfect, and shouldn't be used everywhere (nor even many places they currently get used), but I can see the appeal of abusing them.

eurasiantiger · on May 6, 2022

Because graphs are a good abstraction for relations and with the right tech choices, are much more manageable and malleable than traditional relational databases.

q-big · on May 6, 2022

> It's a right answer but also a frustrating one to a programmer who sees correct implementation of algorithms to mean that you get the ticket done and they don't come at you with a ticket about it again.

This rather sounds like a problem about the project manager and the project management methods that he uses.

PaulHoule · on May 6, 2022

No. I had a time in my career where I was the guy who finished projects that other people started and couldn't finish.

Some coders really don't have discipline, and projects never get done because they don't think things through and keep sending half-baked patches that get sent back by test or the customer.

The role of management is to get those people working for their competitor and then have the "fixer" move in.

contravariant · on May 6, 2022

Nah it's what you get for GraphQL only being an API which people inevitably conflate with the database itself (a harmful trend that probably started with SQL databases).

If you want to use GraphQL you should look for a database supporting it as an interface, or failing that look for an ORM system that supports GraphQL and whatever backend you want.

Converting SQL to GraphQL and converting GraphQL to SQL are equally difficult, and that has little to do with it not having an algebra (also I think most of it is just algebraic types, possibly lacking a proper sum type).

God forbid you should try to modify anything with GraphQL though, that part makes no sense whatsoever.

michael_j_ward · on May 6, 2022

This may interest you

https://www.edgedb.com/docs/edgeql/

spion · on May 6, 2022

See Hasura https://hasura.io/