
> [...] sqlite is the 80% case and is also dead simple to get going and genuinely performant.

I don't understand this. PostgreSQL is ALSO dead simple to get going, either locally or in production. Why not just start off at 90%?

I mean, I get there are a lot of use cases where sqlite is the better choice (and I've used sqlite multiple times over the years, including in my most recent gig), but why in general?



I think "dead simple" is not doing anyone any favors when it's used to equate the simplicity of two very different things.

It's obviously a lot simpler to just have a file, than to have a server that needs to be connected to, as long as we're still talking about running things on regular computers.


I guess that's really my point here.

The difference in setup time is negligible, so I'm not sure why people keep bringing it up as a reason to choose sqlite over PostgreSQL.

For instance, "deployable inside a customer application" is an actual requirement that would make me loath to pick PostgreSQL.

"Needs to be accessible, with redundancy, across multiple AWS zones" would make me very reluctant to pick sqlite.

Neither of these decisions involve how easy it is to set up.

It's like choosing between a sportbike and dump truck and focusing on how easy the sportbike is to haul around in the back of a pickup truck.


I'm not sure it's negligible; I suppose it is once you know what you're doing.

But the Postgres setup, at least from the package managers on Linux, will by default create a user called postgres and lock out anyone who isn't this user from doing anything. Yeah, you can sudo to get psql etc. easily, but that doesn't help your programs, which are running as different users. You have to edit a config file to get it to work, and I never figured out how to get it working with domain sockets instead of TCP.


That's interesting... my experience (almost all on RHEL/CentOS/Fedora) is that it is trivial to use Unix domain sockets with local Postgres clients and a pain to set up any remote clients.

You just have to call the basic "createuser" CLI (or equivalent CREATE ROLE SQL) out of the postgres superuser account to create database users that match local Linux usernames. Then the ident-based authentication matches the client process username to the database role of the same name.
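Concretely, that setup is roughly the following (the role and database names are made up for illustration, and the pg_hba.conf location varies by distro):

```shell
# From the postgres superuser account, create a role matching a local
# Linux username (names below are hypothetical):
sudo -u postgres createuser appuser
sudo -u postgres createdb -O appuser appdb

# pg_hba.conf then maps Unix-socket clients to the database role with
# the same OS username via "peer" (socket-local ident) auth:
#   local   all   all   peer
```

With that in place, running `psql appdb` as appuser connects over the socket with no password prompt.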


> Needs to be accessible, with redundancy, across multiple AWS zones

How many projects start with these requirements?


Anything with real customer data in the cloud? Certainly you need replication.


Litestream can handle realtime replication.

But most projects don’t even have customers when they start, let alone large quantities of their data and legal requirements for guaranteed availability.
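For reference, a minimal litestream.yml along those lines (the local path and bucket name are placeholders):

```yaml
# Continuously replicate a local SQLite file to S3.
dbs:
  - path: /var/lib/myapp/app.db      # local SQLite database (hypothetical path)
    replicas:
      - url: s3://my-bucket/app-db   # replica destination (hypothetical bucket)
```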


I think it's reasonable for a business, even a freshly starting one, to expect to grow to the point where it does have enough customers to outgrow SQLite fairly soon. Between that and PG offering more features in general (which often simplifies app code, because you can do more with queries), it's still not clear why not start with PG in the first place.

PG, on the other hand, can scale enough to cover foreseeable future needs for most businesses, so aiming for something more complex than that is almost always premature.


> outgrow SQLite fairly soon

That would be the result of either vastly overestimating their business plan or vastly underestimating SQLite.


> How many projects start with these requirements?

In a world fueled by cheap money and expensive dreams, you'd be surprised.


I’m not counting on a cheap money future. Not in the near future, at least.


> PostgreSQL is ALSO dead simple to get going

I'm not saying it's hard to set up Postgres locally, but sqlite is a single binary with almost no dependencies and no config, easily buildable from source for every platform you can think of. You can grab a single file from sqlite.org, and you're all set. Setting up Postgres is much more complicated in comparison (while still pretty simple in absolute terms - but starting with a relatively simpler tool doesn't seem like a bad strategy.)


Except for when your data is in it. Migrating data on a running app is one of the worst things to deal with. I can understand using something simple and cut down for other things, but the DB is not the place I'd want to do that. Postgres isn't exactly hard to get going with, and will grow with you easily, so why trade that off for saving an hour or two at the start of the project?


> Except for when your data is in it.

Replacing the DB before it gets any actual data inserted into it solves this problem. You just switch to Postgres before you go anywhere beyond staging, at the latest - in practice, you need Postgres-exclusive functionality sooner than that in many cases, anyway. Even when that happens, you might still prefer having SQLite around as an in-memory DB in unit tests. The Postgres-specific methods are pretty rare, and you can mock them out, enjoying 100x faster setup and teardown in tests that don't need those methods (with big test suites, this quickly becomes important).

Unless you really want to use Postgres for everything like the Gist here suggests, the DB is just a normal component, and some degree of flexibility in which kind of component you use for specific purposes is convenient.
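The in-memory-tests pattern described above can be sketched like this, using Python's bundled sqlite3 (the table and function names are invented for illustration), assuming the code under test only needs a DB-API connection:

```python
import sqlite3

def count_users(conn):
    # Application code under test: portable SQL runs the same against
    # SQLite or Postgres through a DB-API connection.
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

def make_test_db():
    # A fresh in-memory database per test: setup and teardown are just
    # opening and closing a connection -- no server, no round trips.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

def test_count_users():
    conn = make_test_db()
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    assert count_users(conn) == 1
    conn.close()

test_count_users()
```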


Practically, because sqlite is good enough for one machine and compatible-enough with postgresql that you can use either pretty easily. One thing I wrote was an exactly-once stream processor that fetched events from a lot of remote systems for processing. Transaction-based queue in the DB to achieve exactly-once with recovery (remote systems accepted time-stamp resyncing of the stream of events). It works fine at small scale on a single machine for design and testing (local integration tests with short startup time are very valuable) but trivially scales to hundreds of workers if pointed at a postgres instance. The work to allow sqlite vs postgres was a single factory that returned a DB connection in Go based on runtime configuration.

It's also good practice for designing reasonably cross-database compatible schemas.
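The commenter's factory was written in Go and isn't shown; the same idea can be sketched in Python like this (the config shape and key names are invented for illustration):

```python
import sqlite3

def connect(cfg: dict):
    """Return a DB-API connection based on runtime configuration.

    The cfg keys ("driver", "dsn") are hypothetical; the point is that
    the rest of the application only ever sees a connection object.
    """
    driver = cfg["driver"]
    if driver == "sqlite":
        # Single-machine / local-testing path: a file, or ":memory:".
        return sqlite3.connect(cfg["dsn"])
    if driver == "postgres":
        # Scale-out path; psycopg is one common (third-party) driver.
        import psycopg
        return psycopg.connect(cfg["dsn"])
    raise ValueError(f"unknown driver: {driver}")

# Local integration testing with a fast startup:
conn = connect({"driver": "sqlite", "dsn": ":memory:"})
print(conn.execute("SELECT 1").fetchone()[0])  # -> 1
```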


One use case where SQLite is a good option is for embedding as a local database in an app. Starting local-only with SQLite allows you to defer a lot of the backend effort while testing an MVP.


You might find https://github.com/electric-sql/electric pretty cool.


But PostgreSQL is not dead simple when compared to SQLite.


If dead simple involves me babysitting a service process, then no, it is not. SQLite has an embedded version that requires no out-of-process service. That's my definition of dead simple.


Postgres complicates the application in several ways. In particular, Postgres suffers from the n+1 problem, while SQLite does not. That requires a significant amount of added complexity in the application to hack around. Why over engineer the application before it has proven itself as something anyone even wants to use? Let's face it, the large majority of software written gets thrown away soon after it is created.

I already hear you saying that you know of a library that provides a perfect abstraction to hide all those details and complexities, making the choice between Postgres and SQLite just a flip of a switch away. Great! But then what does Postgres bring to the table for you to choose it over SQLite? If you truly prove a need for it in the future for whatever reason, all you need to do is update the configuration.


This is a misunderstanding of the n+1 problem, which is exacerbated by SQLite's deceptive phrasing of the issue:

> In a client/server database, each SQL statement requires a message round-trip from the application to the database server and back to the application. Doing over 200 round-trip messages, sequentially, can be a serious performance drag.

While the above is true on its own, this is _not_ the typical definition of n+1. The n+1 problem is caused by poor schema design, badly-written queries, ORM, or a combination of these. If you have two tables with N rows, and your queries consist of "SELECT id FROM foo; SELECT * FROM bar WHERE id = foo.id_1...", that is not the fault of the DB, that is the fault of you (or perhaps your ORM) for not writing a JOIN.
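To make the distinction concrete, here is a small sketch of the n+1 shape versus a single JOIN, using Python's bundled sqlite3 (the schema is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, val TEXT);
    INSERT INTO foo VALUES (1), (2);
    INSERT INTO bar VALUES (10, 1, 'a'), (11, 2, 'b');
""")

# The n+1 shape: one query for the parents, then one query per parent.
rows_n1 = []
for (foo_id,) in conn.execute("SELECT id FROM foo ORDER BY id"):
    rows_n1 += conn.execute(
        "SELECT val FROM bar WHERE foo_id = ?", (foo_id,)
    ).fetchall()

# The same result in a single statement with a JOIN.
rows_join = conn.execute(
    "SELECT b.val FROM foo f JOIN bar b ON b.foo_id = f.id ORDER BY f.id"
).fetchall()

print(rows_n1 == rows_join)  # -> True
```

On a client/server database the first version costs one round trip per parent row; on SQLite each loop iteration is just a function call, which is the asymmetry the SQLite docs are describing.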


> that is the fault of you (or perhaps your ORM) for not writing a JOIN.

It's your fault for not writing a join if you need a join. But that's not where the n+1 problem comes into play.

Often in the real world you need tree-like structures, which are fundamentally not able to be represented by a table/relation. No amount of joining can produce anything other than a table/relation. The n+1 problem is introduced when you try to build those types of structures from tables/relations.

A join is part of one possible hack to work around the problem, but not the mathematically ideal solution. Given an idealized database, many queries is the proper solution to the problem. Of course, an idealized database doesn't exist, so we have to deal with the constraints of reality. This, in the case of Postgres, means moving database logic into the application. But that complicates the application significantly, having to take on the role that the database should be playing.

But as far as SQLite goes, for all practical purposes you can think of it as an ideal database as it pertains to this particular issue. This means you don't have to move that database logic into your application, simplifying things greatly.

Of course, SQLite certainly isn't ideal in every way. Tradeoffs, as always. But as far as picking the tradeoffs you are willing to accept for the typical "MVP", SQLite chooses some pretty good defaults.


> Often in the real world you need tree-like structures, which are fundamentally not able to be represented by a table/relation. No amount of joining can produce anything other than a table/relation. The n+1 problem is introduced when you try to build those types of structures from tables/relations.

I don't know how precisely strict you expect a tree to be in an RDBMS, but this [0] is as close as I can get. It has a hierarchy of product --> entity --> category --> item, with leaves along the way. In this example, I added two bands (Dream Theater [with their additional early name of Majesty] and Tool), along with their members (correctly assigning artists to the eras), and selected three albums: Tool's Undertow, with both CD and vinyl releases, and Dream Theater's Train of Thought and A Dramatic Turn of Events.

The included query in the gist returns all available information about the albums present in a single query. No n+1.

The inserts could likely be improved (for example, if you were doing these from an application, you could save IDs and then immediately reuse them; technically you could do that in pl/pgsql, but ugh), but they do work.

This is also set up to model books in much the same way, but I didn't add any.

> A join is part of one possible hack to workaround to the problem, but not the mathematically ideal solution.

Joins are not a "hack," they are an integral part of the relational model.

[0]: https://gist.github.com/stephanGarland/ec2d0f0bb54161898df66...


> Joins are not a "hack," they are an integral part of the relational model.

Yes, joins are an essential part of the relational model, but we're clearly not talking about the relational model. The n+1 problem rears its ugly head when you don't have a relational model – when you have a tree-like model instead.

> The included query in the gist returns all available information about the albums present in a single query. No n+1.

No n+1, but then you're stuck with tables/relations, which are decidedly not in a tree-like shape.

You can move database logic into your application to turn tables into trees, but then you have a whole lot of extra complexity to contend with. Needlessly so in the typical case since you can just use SQLite instead... Unless you have a really strong case otherwise, it's best to leave database work for databases. After all, if you want your application to do the database work, what do you need SQLite or Postgres for?

Of course, as always, tradeoffs have to be made. Sometimes it is better to put database logic in your application to make gains elsewhere. But for the typical greenfield application that hasn't even proven that users want to use it yet, added complexity in the application layer is probably not a good trade. At least not in the typical case.


n+1 can show up any time you have poorly modeled schema or queries. It’s quite possible to have a relational model that is sub-optimal; reference the fact that there are 5 levels of normalization (plus a couple extra) before you get into absurdity.

I still would like to know how SQLite does not suffer from the same problems as any other RDBMS. Do you have an example schema?


> I still would like to know how SQLite does not suffer from the same problems as any other RDBMS.

That's simple: not being an RDBMS, only an engine, is how it avoids the suffering.

The n+1 problem is the result of slow execution. Of course, an idealized database has no time constraints, but the real world is not so kind. While SQLite has not figured out how to defy the laws of physics, it is able to reduce the time to run a query to imperceptible levels under typical usage by embedding itself in the application. Each query is just a function call, which is fast.

Postgres' engine can be just as fast, but because it hides the engine behind the system layer, you don't interact with the engine directly. That means you need to resort to hacks to try and poke at the engine where the system tries to stand in the way. The hacks work... but at the cost of more complexity in the application.


Compare and contrast the query execution stages of PostgreSQL and SQLite. How exactly do they work? Please be as precise as possible. Try to avoid imprecise terms like "simple", "suffering", "system layer", and "hack."


For what purpose? I can find no source of value in your request.


To prove that you have an inkling about the subject you have wandered into.

Engineers and Scientists do that; they don’t hide behind airy and vague language – that is the realm of conmen.

You stated that you cannot implement a tree-like structure with tables, so I set about proving you wrong, and posted a gist that does so.

Until you can back up your claims with data, your words are meaningless, and no one here is going to take you seriously.


Prove to who? You?

Engineers and scientists are paid professionals. They are financially incentivized to help other people. Maybe you somehow managed to not notice, but I am but a person on an Internet forum. There is no incentive offered for me to do anything for anyone else. To have the gall to even ask for me to work for you without any compensation in kind is astounding.

Of what difference does it make if anyone takes me seriously or not? That's the most meaningless attribute imaginable. Oh noes, a random nobody on the internet doesn't believe me! It's the end of the world as we know it... How droll.

Perhaps it is that you do not understand what value means? Or how did you manage to write all those words and not come up with any suggestion of value whatsoever?


To educate your adoring fans


But for what purpose? There is no value in educating adoring fans.


> Often in the real world you need tree-like structures, which are fundamentally not able to be represented by a table/relation. No amount of joining can produce anything other than a table/relation. The n+1 problem is introduced when you try to build those types of structures from tables/relations.

You can easily get hierarchical output format from Postgres with its JSON or XML aggregate functions.

You can have almost all benefits of an embedded database by embedding your application in the database.

Just change perspective and stop treating Postgres (or any other advanced RDBMS) as a dumb data store — start using it as a computing platform instead.
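As a runnable sketch of that hierarchical-output idea, here is the SQLite analogue (`json_group_array`, assuming the JSON functions are compiled in, as they are in recent builds); PostgreSQL's `json_agg`/`json_build_object` play the same role. The schema is invented for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE band (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album (id INTEGER PRIMARY KEY, band_id INTEGER, title TEXT);
    INSERT INTO band VALUES (1, 'Tool');
    INSERT INTO album VALUES (1, 1, 'Undertow'), (2, 1, 'Lateralus');
""")

# One query, hierarchical result: each band row carries a JSON array
# of its album titles, so no per-band follow-up queries are needed.
row = conn.execute("""
    SELECT b.name, json_group_array(a.title)
    FROM band b JOIN album a ON a.band_id = b.id
    GROUP BY b.id
""").fetchone()

print(row[0], sorted(json.loads(row[1])))  # -> Tool ['Lateralus', 'Undertow']
```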


We are still dealing with the fact that SQLite HAS the n+1 "problem"; it's just fast enough that it doesn't suffer from it.

It's a very important distinction because, as you say, there are problem domains where you can't just "join the problem away".


Are we? "Suffer" was the word used right from the beginning for good reason. Computers aren't magic. I find no importance in pointing out that fact. Presumably everyone here already knows that. And if it turns out that they don't, who cares? That's their problem. Makes no difference to me.


I agree that suffer is the right word, but it's unclear. You are getting downvoted because a lot of people are interpreting it to mean you are saying that applications using sqlite don't have n+1 queries.


1. At the time of writing, there has been one downvote on the first comment, followed by one upvote on the subsequent comment. Not only does that not translate to "a lot of people", it was quite likely just one person. And unless that person was you, it is impossible to know what their intent was. I'm not sure what you are trying to add here.

2. Who gives a shit? If the "computers truly are magic" camp don't understand what I wrote, great! It wasn't written for them in the first place. If that compels them to use their time pressing a button instead of learning about how computers work, great! Not my problem. I'm not sure what you are trying to add here.


> Who gives a shit? [...] I'm not sure what you are trying to add here.

I guess nothing. You must be fun at design reviews.


Well, let's hope the "computers are magic" riff-raff never show up at the design reviews. Especially if they expect someone to explain to them the basics of computing without any reasonable offer of compensation in return.

If those people show up here and put on a tantrum by pressing random buttons or whatever it was that you were trying to point out... Oh well? I wouldn't have even noticed if you didn't bring it up.

What value you found in calling attention to their pointless tantrums is an interesting mystery, though!


If you need to query over a tree data structure, then that's what WITH RECURSIVE is for, and it's present in both DBMS.

If you additionally need the result of that query to be hierarchical itself, then you can easily have PG generate JSON for you.
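A minimal recursive-CTE sketch, run here against SQLite but using syntax that works unchanged in PostgreSQL (the schema is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO node VALUES (1, NULL, 'root'),
                            (2, 1,    'child'),
                            (3, 2,    'grandchild');
""")

# Walk the whole tree from the root in a single query.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE parent_id IS NULL
        UNION ALL
        SELECT n.id, n.name, t.depth + 1
        FROM node n JOIN tree t ON n.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth
""").fetchall()

print(rows)  # -> [('root', 0), ('child', 1), ('grandchild', 2)]
```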


> No amount of joining can produce anything other than a table/relation. The n+1 problem is introduced when you try to build those types of structures from tables/relations.

A trivial number of lateral joins plus JSON aggregates will give you a relation with one record, containing a nested JSON value with a perfectly adequate tree structure, with perfectly adequate performance, in databases that support these operations.

There are solutions to these problems. One only needs the willingness to accept them.


Of course. I know reading is hard, but nobody has suggested there aren't solutions. Obviously there are, else Postgres would be pretty much unusable. But the question remains: Why resort to hacks that require extra complexity in the application when you can just as easily use the database as it was designed to be used? What are you getting in return?


> Postgres suffers from the n+1 problem, while SQLite does not.

?


Meaning insertion sort does not suffer from the problems that made us construct quicksort because we optimized the insertion a lot (at a tradeoff that we will not specify).


Indeed.


It’s worth elaborating:

N+1 Queries Are Not A Problem With SQLite

https://www.sqlite.org/np1queryprob.html


It's much bigger and requires running a server; that's why I use sqlite3, and my needs are modest 99% of the time.


It requires running a separate process.



