> SQLite is not open source in the legal sense, as “open source” has a specific definition and requires licenses approved by the Open Source Initiative (OSI).
This is wrong, and harmfully wrong. OSI are not the arbiters of open source. Their Open Source Definition, though generally useful and accepted, is not without legitimate criticism and controversy. As for their approval, that’s a dreadful thing to rely on for any purpose; <https://writing.kemitchell.com/2019/05/05/Rely-on-OSI.html> is a good description of most of what’s wrong with it (it doesn’t really get into the broken politics enough; but some of his other articles contain more), and I like its summary: “The list of OSI-approved licenses reflects OSI’s practical and political history, not any useful, consistently functional category of license terms.”
As for whether SQLite is open source, well, the only reason a public domain dedication doesn’t meet the OSD is that it’s not a license. It’s more open. In a way that is legally mildly uncertain in some jurisdictions, sure, but to call it “not open source in the legal sense” is just wrong.
OP is wrong, SQLite does match the requirements outlined in the OSI's definition: https://opensource.org/osd
It's not "open source licensed" simply because it is not licensed. There is no license document for OSI to approve.
Relying on OSI approval does not necessarily make sense; they don't get to all licenses, as your link points out. However, arguing over their definition seems very foolish. People only do it when they want to pass their all-rights-reserved code as "open source" (e.g. "I can call it what I want"). Having adjectives that mean the same to everyone is very valuable, actually.
> Having adjectives that mean the same to everyone is very valuable, actually.
Yes, and that's precisely why OSI should have invented a brand new term which they could trademark. But instead they co-opted "open source" which already had a clear, generic meaning of "the source code is available", previously without any connotations around specific license terms. And then due to that prior usage of the term, OSI can't trademark it.
OSI created this terminology mess, and personally I don't think it is "foolish" to argue with their historical revisionism campaign around it.
I understand your point, but I am not sure a new term wouldn't have been co-opted (if it was any good). It would certainly have hurt adoption (if it was not as good).
I also like the current terminology better tbh ("open", "available").
If they had invented a new term, then they could have obtained a trademark for that term, which would properly allow them to enforce their definition and ability to certify licenses as open source. That's usually how this works, for example POSIX is a trademark of the IEEE. (the other possible route is enforcement/certification by a government agency, for example organic food must meet certain farming practices to be called "organic".)
Instead we have a situation where they didn't invent the term but they lie about it and said they did. They don't have a trademark and obviously aren't a government agency, so enforcement is left to random people acting as self-appointed terminology police on their behalf. And then we get huge subthreads like this one, where a bunch of people have to argue about whether one of the most successful open source projects in history is really even "open source" :)
I am saying that describing a code base as "open source" was intuitive and generically understood to mean "source available" prior to 1998, which is when the OSI co-opted the term and conflated it with a much more specific definition involving license restrictions.
As for the pre-1998 meaning not being "clear", that's a matter of opinion. I would counter that OSI choosing to redefine an already existing term, and then falsely claim they invented that term, makes things far less clear.
Please read the link I included above. That post author's findings completely align with my own recollection from discussions on BBS's and newsgroups in the 90s.
The fact that you had to change your use of the term back in the 90s is deplorable but is it really any argument for going back?
I for one like where we are (which is not an endorsement of OSI as an organization, just of their definition). The terms are short and universally understood, there is a wide range of licenses to choose from, some of them even got a yay/nay from lawyers.
I feel that the current status quo around OSI-approved licenses is largely beneficial to incumbents ("big tech" / cloud providers) and is generally harmful for software innovation industry-wide.
A fair number of vocal people in the industry express contempt for anything using a non-OSI license, even though many of these licenses are simply trying to empower a sustainable business model so the work can't be stolen and abused by cloud providers.
If you independently create a truly innovative new infrastructure software product today (or during any other non-ZIRP / non-boom-time period) with the intention of forming a company around it, you are faced with several bad options around licensing and source availability:
* If you adopt an OSI-approved license other than AGPL, and your product is successful, larger incumbents can and will steal your work and capture most of its value. You will either go out of business, or eventually need to re-license, causing mass outrage and accusations of rug-pulling.
* If you choose AGPL, most VC-funded companies won't touch your software and statistically it will almost certainly fail commercially as a result. Additionally there are several concerning theoretical loopholes in the AGPL, as well as conceptual problems with a non-EULA nonetheless trying to enforce restrictions affecting end users.
* If you choose an entirely closed-source model, your chance of success is very, very low, especially for B2B products. You're a small/new vendor; other companies won't trust you.
So the logical conclusion is you should choose a non-OSI license which prevents the cloud providers from stealing the profits from your hard work. But then random people will attack you everywhere for "open washing" and not being "open source", even if you don't use the word "open" anywhere. People have seemingly been trained to have a knee-jerk negative reaction to SSPL, BSL, etc and that's a really bad state of affairs.
It's especially problematic that the OSI is a closed-membership entity, and not an open-membership professional association. There is no democratic way for software engineers to influence its policies. As an industry we've effectively ceded control over what licenses are acceptable to a random small non-profit. And as a result of that, it's harder today to be a successful independent software vendor than it was in the 90s, which is an absurd state of affairs.
My point was that accepting OSI’s terminology maintains a status quo that is harmful to innovation and harmful to independent software developers. You’re ceding control over what licenses are acceptable to a closed-membership non-democratic entity that isn’t a professional association. I choose not to do that by rejecting their terminology as a necessary first step.
You can easily release something with a shared source/source available license that doesn't make it open source. You can also dual license gpl/proprietary use license.
> release something with a shared source/source available license that doesn't make it open source
I already described problems with that approach up-thread. People attack software that uses non-OSI-approved licenses and accuse it of "open washing" even if it avoids the term "open". Or they attack it for not being "open source" and refuse to use it, even if they're not working at a cloud provider or competitor, and are completely unaffected by the extra license terms.
In any case you're repeatedly saying "open source" when you actually mean "OSI-approved", which is the entire thing I'm railing against here.
> You can also dual license gpl/proprietary use license
That solves literally none of the cloud provider problems I've described above.
> OP is wrong, SQLite does match the requirements outlined in the OSI's definition:
That is true for most of us, but there are jurisdictions that do not recognise the public domain in the same way. SQLite's license page mentions this and offers licensing for people who need to deal with this. https://www.sqlite.org/copyright.html
I skimmed over both of those links and it seems that there are in fact no aspects of the definition that could be challenged with public domain software. The second one is all over the place (the author even says he's not a lawyer!) but it seems focused on the fact that public domain rights are not recognized the same in every country. SQLite has other licenses you can use for those situations. The idea that this project is not truly open source seems silly.
I will try to rephrase it and include links to Wikipedia and OSI
> As for whether SQLite is open source, well, the only reason a public domain dedication doesn’t meet the OSD is that it’s not a license. It’s more open.
That's true and I do acknowledge this in the post:
> Instead, SQLite is in the public domain, which means it has even fewer restrictions than any open source license.
I'm willing to let OSI own "Open Source", cause why not (do they have a trademark on "Open Source"?), but not "open source". All-lowercase "open source" includes public domain, I don't care what OSI says about it, while "Open Source" is whatever OSI says it is, and if that does not include public domain then so be it.
Capital letters are only useful as ECC, not direct signal. (You can't pronounce them, they aren't preserved in many contexts, the lay writer won't respect them, etc).
Your idea is like saying "Google" can be a trademark owned by Alphabet but "google" is just a verb. It seems like a cute linguistic hack but won't be meaningful in the public discourse.
> Capital letters are only useful as ECC, not direct signal. (You can't pronounce them, they aren't preserved in many contexts, the lay writer won't respect them, etc).
In speech I clarify if and when needed, naturally.
> Your idea is like saying "Google" can be a trademark owned by Alphabet but "google" is just a verb. It seems like a cute linguistic hack but won't be meaningful in the public discourse.
You have it exactly backwards. Descriptively "to google" is a verb.
Descriptivism is much better than prescriptivism in natural language. Sure, we need to use prescriptivism when teaching the language, but we do get to evolve the language. That's just how it goes and has gone for the entire recorded history of mankind.
The fact that it’s a different word makes it a very different situation. It’s not the capital letter that signals this. It’s the context. Polish as a verb and Polish as a noun won’t be mistaken in a sentence.
In “Polish my boots, please” we know it’s a verb, even though it starts with a capital letter. The capital letter does not ever signal meaning. That’s because a language is primarily spoken and there are no capital letters there.
Quite an interesting rabbit hole to explore. Apparently OSI holds a US trademark for “Open Source Initiative Approved License”, but no trademark for “open source” by itself.
The term definitely preceded the organization. The term was in use for years before OSI was formed. And OSI was clearly formed in order to provide some clarity to what was meant by “open source”. In other words, if there was no ambiguity in the phrase “open source”, OSI would never have been created.
And the best way to handle this ambiguity is to use "Open Source" when referring to OSI's definition, and "open source" when referring to the colloquial and ill-defined form.
Also, I believe that public domain as a concept does not exist in some countries. If I remember correctly, the concern is that without a license the original author could change their mind and sue.
But any public domain software also matches the FSF and OSI definitions of free software and open source, respectively. At least in jurisdictions where public domain exists.
You may be confusing things with copyleft, which is designed to protect users from developers. The GPL is a copyleft license.
There is an argument that without the copyleft provisions of the GPL (sometimes described as a virus), free software would not be as popular and important as it is today.
Humanity also was described as a virus, see for example The Matrix. Also capitalism, communism, socialism, any political system, religions, philosophies, cultures, and anything you might relate or hate at mental level.
Even the idea that whatever we dislike that gains more than zero popularity should be compared to a virus, this meme spreads like a virus.
You are 100% correct. Open source has a common definition the vast majority of the world uses. A company can proclaim their own definition, but it doesn't magically change the language. Nobody is thinking about, or even aware of, the OSI when using the term. The correct term people should be using is "OSI approved", or something similar, although that's no more meaningful or useful than a "bob approved" license, where bob is some random nobody.
The thing is, in many corporate and government settings, OSI is trusted by default as "this is open source", and anything not explicitly covered by OSI will cause you a world of pain with your legal department to get an exception.
Edge case. In every company on the surface of this continent people import libraries and run software with no idea what is under the hood and whether they are compliant with all the applicable license agreements. Some clumsy attempts are made for major systems in some companies, but again, edge cases.
It's not that they don't understand; it's that it causes more work for the lawyers, because someone has to review the license if it's not one of the standard boilerplate acceptable licenses.
I once had to go to lawyers to get a license approved on a lib that went something along the lines of "this work is in the public domain; do whatever the fuck you want with it, just don't come crawling to me for help". I'm paraphrasing from ~20-year-old memories here, but I do distinctly remember the profanity. It elicited a chuckle from the lawyer and something along the lines of "I wish all of these were this simple".
JSLint had a standard open source license with an addition that said “The Software should be used for Good, not Evil.”
IBM wanted to use it but their lawyers balked at the added restriction. They wrote to the author to see if it could be removed. The author wrote back with, “I give permission for IBM, its customers, partners, and minions, to use JSLint for evil.”
If you're doing this sort of license analysis, try a license detector I built. It tries to guess which license was used and diff any changes. Lots of licenses are small changes from others, so it usually makes multiple guesses.
They understand quite well that it's not really something you can just neatly place a piece of work into. Something which is 'public domain' but the copyright has not actually expired on is higher risk to use than something with an explicit open source license.
You commented before reading... unless the author updated the blog in the meantime.
Because point 6 says: "Instead, SQLite is in the public domain, which means it has even fewer restrictions than any open source license."
This is an interesting case to me and thus my questions. Don't feel you need to answer to some stranger on the internet interrogating you.
Public domain is largely a US government concept. I think that was the main reason that SQLite was put "in the public domain": it started and was initially paid for as an internal database for some military contract. So when Dr. H wanted to release it to the public, he did it in the normal US government method, which was public domain.
Would you legally be able to use NASA papers or programs? These are also in the public domain.
A short essay on public domain, because I am feeling a bit philosophic this morning.
As far as I know, public domain is a concept that originated in the US when it was a new country (probably wrong and US-centric). The question was who owns works created by a government "of the people". The answer was "the people", so it was formalized that US government works were not protected by copyright[note 1] and were in the "public domain". At first this was just laws, and when government research organizations (NASA etc.) were founded, their works, the research they did, were also "public domain".
It really is an open question whether private citizens can also disclaim any copyright on a work, declare it "for the public good" and put it in the public domain. Are you allowed to give up your rights?
1. The US government is able to protect its works through mechanisms other than copyright; think classification levels, top secret, etc.
As for NASA papers and such... I have no idea. If I were running a company I'd be able to ask a lawyer; as it stands, I'm taking the risk that the SQLite authors don't sue.
Making your work public domain is not "giving up your rights". It would be the exact opposite: practicing your right to transfer your IP. Being able to transfer or sell your IP would be an important right.
Similar to physical property: being able to sell it or to give it to the public is an important right. (Okay, you probably can't give your American land to everyone in the world, but only for practical reasons, and not for some fundamental reason like that would be "giving away your rights". No, that would be the opposite: your land, you do whatever you want with it.)
Over in Australia, the question of whether private citizens can also disclaim any copyright on a work is resolved: we can't.
We're allowed to use public domain works, but we cannot disclaim the work. Your copyright has a fixed term, which outlives you. Fin. You cannot surrender a right, as that has caused abuse issues in the past.
No right can be surrendered, but some may be suspended temporarily by a narrow set of laws I wish was even narrower.
Unfortunately... This does mean we can't work on Public Domain projects (legally), as we keep our copyright to our changes. Though practically, that has only been contested once in the history of the nation. The intent of the individual tends to be respected by our Common Law system.
i.e. PD is a signal the creator wants a free and boundless reuse of the product, even if it isn't strictly in the public domain.
I think public domain is a much more universal concept, as the state to which works revert to after copyright expires. What's more US-centric is the idea of deliberately placing a work into public domain prematurely.
I don't see how it could possibly be illegal in any jurisdiction.
The only part of the US public domain that won't fly in many countries is the waiver of moral rights (mainly the right to attribution). Since my country does not allow this waiver, the authors might be able to sue me in my jurisdiction if I started claiming to be the author of the code, which would have been ok in the US. But the voluntary non-exclusive transfer of usage, modification and distribution rights is surely allowed in every jurisdiction, so there is no reason that that part of the dedication would not be valid.
And if there exists a jurisdiction where the transfer of material rights isn't allowed at all, selling a license also wouldn't fix anything. Unless there's a jurisdiction out there somewhere that allows the transfer only in exchange for money or in a one-on-one contract, but then open source licenses also wouldn't apply there and I feel like I would've heard about it as a fun fact somewhere on the Internet by now...
As far as I can tell, yes. It's a grey zone, because it'd fall under civil law, not criminal; so SQLite's copyright owners would need to bring suit against me for it to matter, which I don't believe they will.
However, on the face of it I have no right to use the software without buying a license.
Would it be possible for someone within a compatible jurisdiction to mirror SQLite3 and provide it under some license such that it could be used by anyone?
This blog feels like karma farming. Recycled, old points in listicle format on a popular topic with questionable accuracy.
This is on top of mixing in grievances the author's startup holds against Dr Richard Hipp, like "won't accept (our) outside contributions" and "not actually open source according to OSI".
As far as content goes, this listicle is probably great for view counts/engagement/flamewars. I'd personally prefer deeper thinking which this blog's previous posts-- high in hype and low in technical rigor-- have not yet provided.
> Recycled, old points in listicle format on a popular topic with questionable accuracy.
Could you please state which are inaccurate? I am happy to correct them.
As for the rest of the comment, well, I don't know what to say. I am a beginner in databases and I am journaling the things I'm learning. Some of my posts might not have depth because I don't know much myself.
>There are over one trillion (1000000000000 or a million million) SQLite databases in active use.
The original source[0] (which you don't link to):
>Since SQLite is used extensively in every smartphone, and there are more than 4.0 billion (4.0e9) smartphones in active use, each holding hundreds of SQLite database files, it is seems likely that there are over one trillion (1e12) SQLite databases in active use.
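For what it's worth, the arithmetic behind that estimate is easy to check. A minimal sketch in Python, using only the two figures from the quote above; the helper name is made up for illustration and is not from the SQLite docs:

```python
# Figures taken from the quoted sentence.
smartphones_in_use = 4.0e9   # "more than 4.0 billion (4.0e9) smartphones"
target_databases = 1e12      # "over one trillion (1e12) SQLite databases"


def databases_per_phone_needed(target, phones):
    """Hypothetical helper: database files per phone required to reach the target."""
    return target / phones


needed = databases_per_phone_needed(target_databases, smartphones_in_use)
print(needed)  # 250.0
```

So "hundreds" of database files per phone only has to mean 250 or more for the trillion figure to hold, which is why the source words it as "seems likely" rather than as a measured count.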
Sources at the end without any references to them in the text is not a good way to cite things.
There’s no good reason to use images in your post, other than the one graph. Images of text are a necessity on Twitter, but this is a real web page, where you can just have plaintext. Images of text are completely useless for readers with any accessibility needs.
I am fairly certain that Dr. Hipp has discussed changes in SQLite that were desired by Microsoft for integration into Windows, which came to pass via a foundation membership.
I believe that this was mentioned in the video below (I am not able to verify for now):
In the OP's defense, I see this usage increasingly, as in designating any combatant ship as a "battleship" (even being used for sailing ships). It's not correct, but then neither is "alright" in place of "all right": the language changes whether we want it to or not.
I don't know if this is an inaccuracy per se, but it seems like (7) and (8) are in contradiction. One says they don't accept outside contributions and the other says, "Contributing to SQLite is invite-only".
> I'd personally prefer deeper thinking which this blog's previous posts-- high in hype and low in technical rigor-- have not yet provided.
Really? Something like this[1] (literally the previous post made by the author!) is "high in hype and low in technical rigor"? Or do you just have an axe to grind?
This comment feels like "old man yells at clouds". Vague, non-specific points about the accuracy of the article. This is on top of mixing in grievances with the author non-judgmentally listing and citing facts about SQLite's contribution model and license.
As far as the content goes, this comment is probably great for view counts/engagement/flamewars. I'd personally prefer deeper thinking, which this author's previous comments-- usually fun and interesting-- have provided ;)
> SQLite does not have a Code of Conduct (CoC), rather Code of Ethics
There was confusion over this, because of different usage of words. Simplifying well beyond the point of strict accuracy, a CoC is a weapon to bind and control external contributors’ behaviour, the CoE is SQLite developers declaring their intended conduct towards others.
> SQLite is pronounced as “Ess-Cue-El-Lite”.
This doesn’t match the quote that follows, which says “S-Q-L-ite, like a mineral”. And that’s just how one guy chooses to pronounce it… I wonder how many others do; certainly I’ve never heard it.
I also never heard it. In Brazil I have only heard SQL pronounced as sequel in NoSQL. SQLite is mostly pronounced S-Q-Light here, with the S and Q in Portuguese (weird, but it happens often).
Guilty as charged. Pronouncing letters and numbers in English requires an active mental effort for me, so that happens if I'm not paying attention. I guess it's because they don't really register as another language, even with context. My inner monologue also does letters and numbers in Portuguese, which doesn't help either.
Some time around 2008-2012, I think, my habit shifted from saying S Q L to SEQUEL in most cases. Even my internal voice, when thinking/reading, still says SEQUEL now. Even things like My S Q L became My SEQUEL.
I feel like I made that shift due to how people around me, and public speakers, were aligning on it too. It felt to me that the nomenclature was unifying a bit, as language evolves and changes over the normal course of time. I remember the shift was not easy: I had to put intention into it, and it took a bit of time to make it feel natural. Although I’ve never thought about this too deeply, and as I write it I’m not sure how others would respond to my memory of it. I’m sure it’s very regional, and now age-dependent: I’m sure if I had been past mid-career at that time I wouldn’t have budged on my preferred terminology, but there are also many people who are maybe just too young to remember the time and trend that I do. Or it’s just my own experience and there was nothing really unifying at all.
I remember hearing an interview with Richard Hipp where he said he pronounced it as you do ("rhymes with kryptonite" I think he may have said, or perhaps "like a mineral").
I'd always pronounced it in my head S-Q-L-Lite up until that point, but I much prefer this other way that I'd never considered. It rolls off the tongue easier and adds a bit of fancy.
I have never heard anything else but S-Q-L, in multiple workplaces. When I was new to databases I started with the history lesson, but was quickly corrected to pronounce it S-Q-L.
I don't even know anymore, nor does anyone else I think
As an alternate datapoint, I've almost solely (by a 90% majority or more) heard "sequel" and "sequelite" (/'siː.kwəl/ and /'siː.kwəˌlaɪt/, respectively) across 12 years and 6 companies (from FAANG to YC startups in SF and NYC)
The OP failed to mention that SQLite does have opt-in strict tables that enforce types, you just need to do `CREATE TABLE name (stuff TEXT) STRICT`, see https://www.sqlite.org/stricttables.html
Like the journal mode defaulting to a rollback journal (DELETE) instead of WAL and foreign key constraints being off by default, tables being lax by default is part of SQLite’s dedication to backwards compatibility.
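To make the opt-ins concrete, here is a rough sketch using Python's built-in `sqlite3` module. The file name, the extra `n` column and the helper names are my own placeholders; the pragmas and the `STRICT` keyword are the real SQLite ones. STRICT needs SQLite 3.37.0 or newer, so the sketch skips that part on older builds.

```python
import os
import sqlite3
import tempfile


def open_with_opt_ins(path):
    """Open a database file and opt in to WAL and foreign key enforcement."""
    conn = sqlite3.connect(path)
    # Foreign key enforcement is a per-connection setting, so set it on every open.
    conn.execute("PRAGMA foreign_keys = ON")
    # The pragma returns the journal mode that is actually in effect afterwards.
    mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    return conn, mode


def strict_rejects_text_in_integer(conn):
    """Return True if a STRICT table refuses text in an INTEGER column.

    Returns None when the linked SQLite predates STRICT tables (3.37.0).
    """
    if sqlite3.sqlite_version_info < (3, 37, 0):
        return None
    conn.execute("CREATE TABLE name (stuff TEXT, n INTEGER) STRICT")
    try:
        conn.execute("INSERT INTO name (stuff, n) VALUES ('a', 'not a number')")
    except sqlite3.Error:
        return True
    return False


# WAL needs a real file, so use a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    conn, journal_mode = open_with_opt_ins(os.path.join(tmp, "demo.db"))
    foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    strict_result = strict_rejects_text_in_integer(conn)
    conn.close()

print(journal_mode, foreign_keys, strict_result)
```

Note that WAL mode is persistent in the database file once set, while `foreign_keys` has to be turned on again for each new connection.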
That sentence builds on the previously mentioned sentence about types.
> It is “weakly typed”. SQLite calls it “type affinity”. Meaning you can insert whatever in a column even though you have defined a type. Strong typed columns are opt-in
and then I call it
> I hate that it doesn’t have types. It’s totally YOLO
That's simply self-contradictory, as written. You say that it has (opt-in) types, then say that it doesn't have types. You could say "I hate that it doesn't have types by default", but it would be even more accurate to say "I hate that it doesn't enforce types by default", since it does have types, both strict and not.
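A small sketch of what "has types but doesn't enforce them by default" looks like in practice, using Python's built-in `sqlite3` module. The table and column names are placeholders of mine. In an ordinary (non-STRICT) table, the declared type only gives the column an affinity: values are converted when that is lossless, and stored as-is otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ordinary table: INTEGER is an affinity here, not an enforced type.
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany(
    "INSERT INTO t (n) VALUES (?)",
    [("123",), ("hello",), (4.5,)],
)

# typeof() reports the storage class each value actually ended up with.
rows = conn.execute("SELECT n, typeof(n) FROM t ORDER BY rowid").fetchall()
print(rows)  # [(123, 'integer'), ('hello', 'text'), (4.5, 'real')]
conn.close()
```

The text `'123'` gets converted to an integer, while `'hello'` and `4.5` are kept as text and real, with no error raised. That is the behavior the article calls "YOLO", and it is exactly what a STRICT table would reject.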
I worked with Richard Hipp on a project to integrate his query engine into a custom scripting language with persistent storage for embedded devices (think c#'s linq but backed by flash storage).
He was a pleasure to work with and it seemed he made a decent living just off of support contracts for projects like this. One of the few one-man-shops that really, really worked out.
I was still just a "kid" at the time (early 20s, I'm over 40 now for context) so I was just excited to be working with folks that seemed to know what they were doing (as I clearly didn't.) Every day I was excited to go to work and absorb as much as I could. Richard and the other seniors were never condescending and everyone was able to communicate the inner workings of their systems to a naive kid straight out of college.
The same project (nTAG Interactive LLC) also involved Brian Silverman, who famously made a Babbage-style mechanical computer capable of playing tic-tac-toe using only Tinkertoys for construction while he was still attending university. He also created the original Lego "computing brick" which ran Logo, a Lisp-adjacent language, via an interpreter/pcode VM he wrote using HC11 ASM (and later he ported it to PIC for the "Cricket" robotics controller). That brick project spun off into what is today Lego Mindstorms. So I've had the privilege of working with some very creative and intelligent folks over the years.
I had only stumbled into the opportunity by way of working as an assistant at my local university's robotics lab (to help pay for undergrad) when my professor thought I would be a good fit for the MIT Media Lab startup opening.
The job was a tremendously fun time and to this day I still think of it as my "favorite" work experience. Sadly, as with most startups, ours didn't make it when funding dried up, so the fun eventually ended. To be honest, I've been involved with a number of other early stage startups (and various larger size orgs) since and none have come close to the experience. There truly isn't anything better than being surrounded by folks who are masters at their craft and are willing to help you learn.
I am interested in a thing where the whole programming language / program stack and everything is stored in memory, so you can have a language that can resume from where it was paused, inspired by some comment on some other Hacker News thread. I spent some of my weekends trying it, but to no avail.
What you're describing is exactly how early versions of PalmOS worked, as they just kept parts of the OS in static RAM so you could always continue applications exactly where you left them.
Did they stop thinking of performance gains as a priority after 2017? The improvements have been very weak since then. Just wondering if they ran out of low hanging fruit or if they just didn't think it was important anymore.
> Just wondering if they ran out of low hanging fruit or if they just didn't think it was important anymore.
It's very much likely that the low hanging fruit's been picked clean, rather than the latter.
Take for example: You're given a bog standard codebase with no performance optimizations. It can be for whatever application, library, or service you could think of.
Running down the list of (increasingly not) obvious improvements:
- Removing duplicate work
- Multi-processing & multi-threading
- (if supported) Async I/O to remove I/O blocking
- Substituting data structures for more compact representations
- Understanding CPU caches & increasing cache hits
- (Very high effort - only as a last resort) Move to a compiled high perf language (C, Zig, Rust, etc.)
- (If applicable) Eliminating pointer use in code to prevent cache misses
- (If applicable) SIMD vectorization
- (If applicable) GPU processing
And each one of the above can only be done a certain number of times: Once the improvement's been made, you can't gain the same boost by implementing it again exactly as before.
This isn't even mentioning that there's a base amount of work that needs to be done for a given task: Adding 2 numbers together requires at least 1 add instruction in x86 assembly, and you can't have 0 instructions.
What we're seeing here is that SQLite's hitting the floor: They likely can't go lower than this without a breakthrough in algorithms.
My latest favorite fun fact about SQLite is that it is not only my favorite SQL database, but also my favorite NoSQL database.
Since 2024, all my new database tables have only one column. Everything just goes into one single JSON column, which I always call "data".
SELECT
cities.data->>'name' city_name,
countries.data->>'name' country_name
FROM cities
JOIN countries
ON countries.data->'id' = cities.data->'country_id'
I looked around a bit to see whether some NoSQL database would make this easier. It turned out that none did: even now that I use only JSON everywhere, SQLite is still the best tool.
I’m curious what the reasoning is behind having no id or key column. Do you use indexed expressions to index JSON fields? Do you use rowid for certain queries? Or do you not bother with indexes?
My understanding is that NoSQL databases still have indexes and it seems like using SQLite as you demonstrate could be worse in that regard.
The entries in the data column already have an id field. What would be an upside of having another id in a column? It seems that only complicates things and has the potential for invalid states (different id in the id column than in the json field).
So far I have not used indexes because things are fast enough as they are. I would expect that when I need more speed, I can easily add indexes on expressions.
Why do you expect SQLite indexes on expressions to be worse than what NoSQL databases do for indexing?
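For anyone wanting to try the "add indexes on expressions later" approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout follows the query above, but the sample rows and the index name `idx_cities_country` are assumptions for illustration. It uses `json_extract` rather than `->>` so that it also runs on SQLite builds older than 3.38, and it assumes the JSON functions are compiled in.

```python
import json
import sqlite3


def demo_expression_index():
    """Create a one-column JSON table, index one JSON field,
    and return (matching names, query plan details)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (data TEXT)")

    # Hypothetical sample rows, stored as JSON text in the single column.
    rows = [
        {"id": 1, "name": "Lyon", "country_id": 10},
        {"id": 2, "name": "Osaka", "country_id": 20},
        {"id": 3, "name": "Paris", "country_id": 10},
    ]
    conn.executemany(
        "INSERT INTO cities (data) VALUES (?)",
        [(json.dumps(r),) for r in rows],
    )

    # Index on an expression: the indexed expression must be written
    # the same way in the query for the planner to match it.
    conn.execute(
        "CREATE INDEX idx_cities_country "
        "ON cities (json_extract(data, '$.country_id'))"
    )

    query = (
        "SELECT json_extract(data, '$.name') FROM cities "
        "WHERE json_extract(data, '$.country_id') = ? ORDER BY 1"
    )
    names = [r[0] for r in conn.execute(query, (10,))]
    # The fourth column of EXPLAIN QUERY PLAN is the human-readable detail.
    plan = [r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + query, (10,))]
    conn.close()
    return names, plan


if __name__ == "__main__":
    names, plan = demo_expression_index()
    print(names)
    print(plan)
```

Printing the plan is the quickest way to check whether a given query actually uses the expression index instead of scanning the table.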
I wouldn’t suggest duplicating the id field, but rather moving it out of the data field into its own column.
> Why do you expect SQLite indexes on expressions to be worse than what NoSQL databases do for indexing?
I don’t have a concrete reason. It’s just that indexes on expressions are not intended as the main index for a SQLite database to my knowledge. Having thought about it more and read the page on expression indexes more thoroughly[0], I think it’s probably unlikely to be a noticeable downside.
The only remaining downside is that the item being indexed might be outright missing, but it sounds like that’s basically a feature in your case, as you’re opting for flexibility.
I mean, sometime along the way you’re going to have something that consumes the data (otherwise, why bother keeping it in the first place), and that something will have certain expectations about the way things are structured.
It will probably become a good idea then to have some clue as to what the structure was at any given point, and for that you’d want to keep track of what got added or removed, and when.
If it’s a very early stage in the development, and you don’t expect any of the current data to survive to the final version, I guess that’s fine. But when you have an actual running product that has to keep running, dealing with multiple versions of the data schema is a pain, and dealing with multiple untracked versions of the data schema is a pain squared.
I use a lot of JSON columns w/ SQLite and find a programming language type is a great way to specify schema for such a column without having to write any migrations. No one has written a sternly worded comment at me on the internet yet for adding or removing fields from my struct types without writing a separate migration file.
Lots of use-cases for SQLite are not like Big Iron SQL Database Of Record where every change must be tracked because it's a shared stateful single point of failure and there's hell to pay for mistakes or confusion.
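As a rough sketch of the "language type as schema" idea, here is what it might look like in Python with a dataclass. The `City` type, its fields, and the `city_from_json` helper are hypothetical, not taken from the comment above. A field added later gets a default so that older rows still load, and keys that have since been removed from the type are ignored on read.

```python
import json
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class City:
    id: int
    name: str
    population: int = 0  # added later; older rows fall back to the default


def city_from_json(text):
    """Build a City from stored JSON, ignoring keys the type no longer has."""
    raw = json.loads(text)
    known = {f.name for f in fields(City)}
    return City(**{k: v for k, v in raw.items() if k in known})


def load_cities(conn):
    """Read every row of the single-column table into City objects."""
    return [city_from_json(row[0]) for row in conn.execute("SELECT data FROM cities")]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (data TEXT)")
    # An "old" row without population, and a row with a since-removed key.
    conn.execute("INSERT INTO cities VALUES (?)", ('{"id": 1, "name": "Lyon"}',))
    conn.execute(
        "INSERT INTO cities VALUES (?)",
        ('{"id": 2, "name": "Osaka", "population": 5, "mayor": "x"}',),
    )
    print(load_cities(conn))
```

Note that this keeps no record of when a field was added or removed, which is exactly the trade-off the parent comments are arguing about.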
Not quite. We are saying that adding `not_available_before: 2026`, or similar, ‘right away’ is usually not a good thing; therefore anything that helps with doing that is not something we desire.
It should be a multistep process. Yes, it will waste time now, but it will save time in the long run.
> So DRH asked the question: what if the database just worked without any server? This was an innovative idea back then.
Strange, my recollection of the time is that file-based databases were much more popular. FoxPro, Access (Jet) and dBase were all in wide use in the '90s and early 2000s and ran a lot of business software using network file shares instead of a database server.
There were also minimalist SQL interfaces to say BerkeleyDB and DBM around.
It was a rival of MySQL for a while. The article says "mSQL was the first low-cost SQL-based database management system" but notes citation needed. Certainly matches my memory though.
Interbase is still around and it can store the entire database in a single file or across multiple files. There are also 3 versions of it available: IBLite (in process DB engine), IB-To-Go (in process DB engine), and Interbase (server based).
Yeah this one stuck out to me. I had thought that SQLite was probably inspired by BerkeleyDB which came out in 1994. That’s key value not relational, but the idea of a local, embedded database was not new.
I've been using SQLite for quick prototyping and as a log dump for later aggregation, but one thing I've found is that it's easy to accidentally want multi-writer in today's multi-service world. That's repeatedly been my biggest stumbling block, and I wouldn't be surprised if it's the foremost reason for others to "outgrow" sqlite.
Single-writer is orthogonal to single-threaded. Single threaded systems can and do allow multiple concurrent writes to proceed, e.g. via interleaving, write batching/combining, asynchronous I/O.
> D. Richard Hipp (DRH) was building software for the USS Oscar Austin, a Navy destroyer. The existing software would just stop working whenever the server went down (this was in the 2000s). For a battleship, this was unacceptable.
Maybe it's a nitpick, but a destroyer is not a battleship. The latter weren't even in service in the 2000's.
As a Navy veteran, I agree with you. As a descriptivist, I can accept someone referring to a destroyer, carrier, gator freighter, or anything else that carries lots of armament as a "battle ship".
> Since SQLite is used extensively in every smartphone, and there are more than 4.0 billion (4.0e9) smartphones in active use, each holding hundreds of SQLite database files, it is seems likely that there are over one trillion (1e12) SQLite databases in active use.
I find the scientific notation of the counts amusing given the sometimes different meanings of "billion" and "trillion" (especially with English as a second language):
As a Catholic, I appreciate how Christian it is of course but I would be surprised if other people, especially atheists felt the same. I’m wondering how it hasn’t come under scrutiny yet?
What's the reasoning behind the 10^12 instances claim? Is it based on something like 20×10^9 mobile devices, each with 50 apps, each with an SQLite database?
> Since SQLite is used extensively in every smartphone, and there are more than 4.0 billion (4.0e9) smartphones in active use, each holding hundreds of SQLite database files, it is seems likely that there are over one trillion (1e12) SQLite databases in active use.
> SQLite takes backward compatibility very seriously - All releases of SQLite version 3 can read and write database files created by the very first SQLite 3 release (version 3.0.0) going back to 2004-06-18.
This is so remarkable and reminds me of my troubles with MongoDB and especially InfluxDB.
My MongoDBs are mostly still on 4.4 because of the complicated upgrade path (mostly related to the Python drivers), and InfluxDB is now officially split into 1.x and 2.x for me, where I have no plans for upgrading. And I especially will keep my hands off 3.x because I've learned my lesson.
This has the nice effect that fossil, the source control system built on sqlite, will open repos one hasn't looked at for years without trouble. I suspect lots of software works very much better in practice because they chose sqlite as the storage layer somewhere in the distant past.
If you have constant, multiple, concurrent writes on a non-append-only database, it is bound to perform poorly no matter what database you pick. SQLite in this case nicely points out that you probably have a major architectural issue in your application.
On more productive notes:
* Are you using WAL mode?
* Are you using Batch inserts/updates/upserts?
* Are you using `BEGIN IMMEDIATE` when you need DML? Implicitly upgrading from autocommit mode or a `BEGIN DEFERRED` "DQL" transaction to a write ("DML") transaction, by starting DML partway through what used to be a sequence of DQL queries, is bad on any database, but worse on SQLite.
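The three points above could be combined roughly as follows, using Python's built-in `sqlite3` module. This is a sketch under assumptions: the `log` table, the `write_batch` name, and the 5-second busy timeout are illustrative choices, not recommendations from the comment.

```python
import sqlite3


def write_batch(path, messages):
    """Insert `messages` in one write transaction; return (journal mode, row count)."""
    # isolation_level=None stops the module from opening transactions
    # implicitly, so BEGIN/COMMIT below are fully under our control.
    conn = sqlite3.connect(path, timeout=5.0, isolation_level=None)
    try:
        # WAL mode needs a database file on disk; it does not apply to :memory:.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute(
            "CREATE TABLE IF NOT EXISTS log (id INTEGER PRIMARY KEY, msg TEXT)"
        )

        # Take the write lock up front instead of upgrading mid-transaction.
        conn.execute("BEGIN IMMEDIATE")
        try:
            # Batch insert: many rows, one transaction, one commit.
            conn.executemany(
                "INSERT INTO log (msg) VALUES (?)", [(m,) for m in messages]
            )
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise

        count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
        return mode, count
    finally:
        conn.close()


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        print(write_batch(os.path.join(tmp, "demo.db"), ["a", "b", "c"]))
```

With several processes writing to the same file, the `timeout` value decides how long `BEGIN IMMEDIATE` waits for the write lock before raising an error.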
> If you have constant, multiple, concurrent writes on a non-append-only database, it is bound to perform poorly no matter what database you pick.
This is obviously incorrect, since Postgres can handle more than one simultaneous write transaction just fine. The rest of your post is accurate, but this is an intentional design decision to simplify SQLite’s implementation, not some fundamental limitation.