Databricks acquired bit.io and subsequently shut it down quite fast. AFAIK bit.io had a very small team, and the founder was a serial entrepreneur who was never going to stick around, and he didn't. I'm not sure who from bit.io is still around at Databricks.
If I'm guessing right, MotherDuck will likely be acquired by GCP, since most of the founding team is ex-BigQuery. Snowflake already purchased Modin, and Polars is still too immature to be acquisition-ready. So what does that leave us with? There's also EDB, which competes in the enterprise Postgres space.
Folks I know in the industry are not very happy with Databricks. Databricks themselves were hinting to people that they might be acquired by Azure as Azure tries to compete in the data warehouse space. But then everyone became an AI company, which left Databricks in an awkward spot. Their bizdev team is not the best, from my limited interactions with them (lots of Starbucks drinkers and "let me get back to you after a 3-month PTO"), so they don't know who should lead them through an AI pivot, or how. With cash to burn from overinvestment and the Snowflake/Databricks conferences coming up fast, they needed a big announcement, and this is that big announcement.
Should have sobered up before writing this though. But who cares.
From context in the parent, I'm reading it as the sort of person who looks more competent than they are and skates from job to job quickly enough that no one notices.
BDev can be good or bad. Bad ones tend not to follow up, and Starbucks here represents poor decision-making skills (reinforced by going on PTO for three months and not following up on commitments).
Thought the same. I mean, I don't drink it because I can make my own far cheaper, but I don't look with scorn at those who do. It says a lot more about the person making the judgment than about those who drink the coffee.
> Folks I know in the industry are not very happy with databricks
Yeah, big companies gobbling up everything does not lead to a healthy ecosystem. Congrats to the founders on the acquisition, but everyone else loses with moves like this.
I'm still sour after their Redash purchase that instantly "killed" the open source version. The Tabular acquisition was also a bit controversial, since one of the founders is the PMC Chair for Iceberg, which "competes" directly with Databricks' own Delta Lake. The mere presence of these giants (mostly Databricks and Snowflake) makes the whole data ecosystem (both closed and open source) really hostile.
An OLTP solution fixes a lot of the headaches around the traditional extract-load-transform steps.
Most OLAP work starts once the data lands in Kafka logs or on a disk of some sort.
Then you schedule a task, or keep a task polling constantly, which is always prone to small failures and delays, or big failures when the schema changes.
The "data pipeline" team exists because the data doesn't move by itself from where it is first stored to where it is ready for deep analysis.
If you can push 1-row updates transactionally to a system and feed off the backend to write a more OLAP-friendly structure, then you can hook up things like a car rental service's operational logs to a system that can compute more complex things, like forecasting availability or applying discounts to give a customer a cheap upgrade.
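To make that concrete, here is a minimal sketch of the pattern in Python, assuming a hypothetical rental_events table and made-up connection strings; the OLTP side is plain Postgres, and the drain step uses DuckDB's Postgres extension to write a columnar file:

```python
# Hypothetical sketch: 1-row transactional writes on the OLTP side,
# periodically drained into an OLAP-friendly parquet file.
import duckdb
import psycopg2

# OLTP side: a single-row update, committed transactionally.
conn = psycopg2.connect("postgresql://app@oltp-host/rentals")  # made-up DSN
with conn, conn.cursor() as cur:
    cur.execute(
        "INSERT INTO rental_events (car_id, event, created_at) VALUES (%s, %s, now())",
        ("car-42", "returned"),
    )

# Feed off the backend: copy recent rows into a columnar layout that
# availability forecasts or discount rules can scan cheaply.
duck = duckdb.connect()
duck.execute("INSTALL postgres; LOAD postgres;")
duck.execute("ATTACH 'postgresql://app@oltp-host/rentals' AS oltp (TYPE postgres);")
duck.execute("""
    COPY (
        SELECT * FROM oltp.rental_events
        WHERE created_at > now() - INTERVAL 1 DAY
    ) TO 'rental_events.parquet' (FORMAT parquet)
""")
```

In a real setup the drain would be driven by logical replication or CDC rather than a polling query, but the shape is the same.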
Neon looks a lot better than Yugabyte technically (which also speaks the Postgres protocol), and a lot nicer in protocol compatibility than something like FoundationDB.
AlloyDB from Google feels somewhat similar, and Spanner has a Postgres interface too.
The postgres API is a great abstraction common point, even if the actual details of the implementations vary a lot.
Neon is a great product because they are run by Postgres enthusiasts. They have decent customer-friendly pricing, real serverless HTTP endpoints, and they're always on the latest version of Postgres as soon as it is stable. From what I can tell, no other provider has this positioning, driven by dedication.
I really hope they can maintain this dedication after acquisition, but Databricks will probably push them into enterprise and it will lose the spark. I wish Cloudflare bought them instead.
I've been bullish on neon for a while -- the idea hits exactly the right spot, IMO, and their execution looks good in my limited experience.
But I mean that from a technical perspective. I never have any real idea about the business -- do they have an edge that makes people want to start paying them money and keep paying them money? Heck if I know.
I guess that's going to be Databricks' problem now (maybe).
Neon goes further than just "managed Postgres". I would say one of their big features is just how fast and easily you can spin up new DBs/clusters. It's completely possible (encouraged, even) to spin up one DB per tenant, and potentially spin up and tear down thousands of databases.
It opens up some interesting ideas/concepts when creating an isolated DB is just as easy as creating a new db table.
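As a rough sketch of the provisioning pattern (in plain-Postgres terms, with made-up hostnames; on Neon you'd typically drive this through their management API instead, where creation is near-instant):

```python
# Hypothetical DB-per-tenant provisioning against plain Postgres.
import psycopg2

def create_tenant_db(tenant_id: str) -> str:
    """Provision an isolated database for one tenant; returns its DSN."""
    # tenant_id is assumed pre-validated, since it is interpolated into
    # DDL (CREATE DATABASE can't take bind parameters).
    admin = psycopg2.connect("postgresql://admin@db-host/postgres")
    admin.autocommit = True  # CREATE DATABASE cannot run inside a transaction
    with admin.cursor() as cur:
        cur.execute(f'CREATE DATABASE "tenant_{tenant_id}"')
    admin.close()
    return f"postgresql://app@db-host/tenant_{tenant_id}"

def drop_tenant_db(tenant_id: str) -> None:
    """Tear the tenant's database down when they churn."""
    admin = psycopg2.connect("postgresql://admin@db-host/postgres")
    admin.autocommit = True
    with admin.cursor() as cur:
        cur.execute(f'DROP DATABASE IF EXISTS "tenant_{tenant_id}"')
    admin.close()
```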
These serverless Postgres databases are all so overhyped. I have tried all of them, and they are all much slower than just deploying a managed database in the same datacenter as your application.
I have an application deployed on Railway with a Postgres database, and the user's latency is a consistent 150ms. The same application deployed on these serverless/edge providers is anywhere between 300-400ms, with random spikes to 800ms. The same application, same data, and same query.
The edge and serverless has to be the biggest scam in cloud industry right now.
They aren't faster, and they aren't cheaper. You could argue they are easier to scale, but that's not the case anymore since everyone provides autoscaling now.
Whatever. I was able to set up Neon Postgres in 5 mins. It's still crazy fast with my Fly services, and has replication and backups out of the box. Much easier than AWS and, from what I can tell, than getting something going with Railway. And I don't have to worry about operating it. My time is valuable.
All of that can be true. What I wonder is — if that all is true — how much of a moat is there around that? It seems like the secret sauce in that company isn’t some custom technology, it’s execution. Execution can be replicated by another competent team. Or is there some other secret sauce that I can’t see?
It's the team, they have a few Postgres committers and major contributors, and there are not that many of them. But that's a bit precarious, the team may leave after the acquisition for many reasons.
I completely agree... in my comment, the word "competent" was doing a lot of heavy lifting.
And it begs comparisons to comments about Dropbox/rsync, etc...
But, I personally think the Neon concept of branching databases with CoW storage is quite interesting. That, combined with cost management via autoscaling, does seem like at least a serviceable moat.
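For contrast, the nearest thing in plain Postgres is a template copy, which duplicates the data in full rather than sharing pages copy-on-write the way Neon branching does, so it gets slower and more expensive as the source grows; a quick sketch (names made up):

```python
# Plain-Postgres "branching": a full physical copy via TEMPLATE.
# Neon branches instead share CoW storage, so creation cost doesn't
# scale with database size.
import psycopg2

admin = psycopg2.connect("postgresql://admin@db-host/postgres")
admin.autocommit = True  # CREATE DATABASE cannot run inside a transaction
with admin.cursor() as cur:
    # The template database must have no other active connections.
    cur.execute("CREATE DATABASE feature_branch TEMPLATE main_db")
admin.close()
```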
These are features of any managed database service.
DigitalOcean, Railway, Render, and so on all offer the exact same feature except it's just pure Postgres and you can deploy them in the same data center as your application.
400ms added latency is really bad for user experience. Do a few queries and you’re going to need to add caching. Now you’re spending your precious developer time managing caching invalidation in lots of places instead of just setting up your database properly in the beginning.
I understand there are ways to deal with the problem of latency in serverless, but this is a problem I'd rather not deal with in the first place. The database IS the application, and I would not want to sacrifice speed of the database for anything. Serverless is totally not worth the trade-off for me: slightly more convenient deployments, for much higher latency to the database.
I'm a solo dev that has been installing and running my own database server with backups for decades and have never had a problem with it. It's so simple, and I have no idea why people are so allergic to managing their own server. 99% of apps can run very snappily on a single server, and the simplicity is a breath of fresh air.
That's why I'm working hard on bringing tightly integrated SQLite support to the Elixir ecosystem (via a Rust FFI bridge): because in my professional experience not many applications need something as hardcore and amazing as PostgreSQL; at least 80% of all apps I've ever witnessed would be just fine with an embedded database.
I share experiences similar to yours and others' in this thread, and to me all those operational concerns grow into unnecessary noise that distracts from the real problems we are paid to solve.
Not just cold start (another problem you have to worry about with serverless). There's the simple fact that network latency outside of the same datacenter is ALWAYS slow and randomly unpredictable, especially if you have to run multiple queries just to render a single page to your user. A database should always be over LAN in my opinion, if you need to access data over the internet, at that point it should be over an API/HTTP, not internal database access.
Neon's multi-region support isn't directly comparable to a single Postgres database in a single data center. You can set up Neon in a single data center, too, and I would expect the same performance in that case.
Meanwhile, if you tried to scale your single-Postgres to a multi-region setup, you'd expect higher latencies relative to the location of your data.
Even managed databases are a scam. You can easily get 10x cheaper pricing for the same workload, by, wait for it, installing Postgres yourself on a baremetal machine. Plus you get much better performance, no noisy neighbors, and ability to actually control and measure low level performance. I never got the hype for serverless. Why are people so allergic to setting up a server? It takes a few hours a year of investment, and the performance benefits are huge.
What is the lowdown on Databricks? Their bread and butter was hosted Spark and notebooks. As tasks done in Spark over a data lake began to be delegated wholesale to columnar-store ELT, they tried to pivot to "lakehouses", then I sort of lost track of them after I got out of Spark myself.
Did Delta Lake ever catch on? Where are they going now?
Capture enterprise AI enthusiasm by providing a 1-stop shop for data and AI, optionally hosted on your own cloud tenant. Keep deploying functionality so clients never need another supplier. Partner with SAP, OpenAI, anyone who holds market share. Buy anyone that either helps growth or might help a competitor grow.
Enterprise view: delegate AI environment to Databricks unless you’re a real player. Market is too chaotic, so rely on them to keep your innovation pipeline fed. Focus on building your own core data and AI within their environment. Nobody got fired for choosing Databricks.
You basically pay Databricks a "fee" to choose the more appropriate and modern stack for you to build on, and to keep it up to date. Never used it, but it handles lots of the administrative BS (compliance, SLAs, idk) for you so you can just ship.
That does sound, as you allude, like IBM on its long downward spiral of gobbling up products to stay relevant and touting them as an integral solution, while in-house development stuck to keeping legacy products alive for their enterprise contracts. I wonder if they'll be foolish enough to start doing consulting around them, obliterating their economies of scale in the process; so far they are going with the "consulting partners" approach.
Oh well. Databricks notebooks were hella cool back when companies were willing to spend lavishly on having engineers write cloud hosted Scala in the first place, and at premium prices to boot.
A nice UI for a data lakehouse is underrated. I use AWS Athena at work and it is just so bad, for no good reason. For example, big columns of text are expanded outwards, making the subsequent columns impossible to read.
Delta Lake is not catching on, but no worries, they bought Iceberg[0] (the competing standard).
I'm joking, but only a bit. Iceberg is open source (Apache), but a lot of the core team and the creator worked at Tabular and Databricks bought them for $1B.
It provides a central place to store and query data. A big org might have a few hundred databases for various purposes; Databricks lets data engineers set up pipelines to ETL that data into Databricks, and once the data is there it can be queried (using Spark, so there are some downsides, namely a more restrictive SQL variant, but also some advantages, like better performance across very large datasets).
Personally, I hated databricks, it caused endless pain. Our org has less than 10TB of data and so it's overkill. Good ol' Postgres or SQL Server does just fine on tables of a few hundred GB, and bigquery chomps up 1TB+ without breaking a sweat.
Everything in Databricks - everything - is clunky and slow. Booting up clusters can take 15 minutes, whereas something like BigQuery is essentially on-demand and instant. Data ETL'd into Databricks usually differs slightly from its original source in subtle but annoying ways. Your IDE (which looks like a Jupyter notebook, but is not) absolutely sucks (limited/unfamiliar keyboard shortcuts, flaky, can only be edited in the browser), and you're out of luck if you want to use your favorite IDE, vim, etc.
Almost every Databricks feature makes huge concessions on the functionality you'd get if you just used that feature outside of Databricks. For example, Databricks has its own git-like functionality (which covers the 5% of git that gets used most, with no way to do the less common git operations).
My personal take is Databricks is fine for users who'd otherwise use their laptop's compute/memory - this gets them an environment where they can access much more, at about 10x the cost of what you'd pay for the underlying infra if you just set it up yourself. Ironically, all the Databricks-specific cruft (config files, click-ops) that's required to get going will probably be difficult for that kind of user anyway, so it negates its value.
For more advanced users (i.e. those that know how to start an ec2 or anything more advanced), databricks will slow you down and be endlessly frustrating. It will basically 2-10x the time it takes to do anything, and sap the joy out of it. I almost quit my job of 12 years because the org moved to databricks. I got permission to use better, faster, cheaper, less clunky, open-source tooling, so I stayed.
My stack atm is neovim, Python/R, an EC2 and Postgres (sometimes SQL Server). Some use of Arrow and DuckDB. For queries on less than a few hundred GB this stack does great. Fast, familiar, the EC2 is running 24/7 so it's there when I need it and can easily schedule overnight jobs, with no time wasted waiting for it to boot.
You mentioned earlier how long it takes to acquire a new cluster in Databricks, but here you are comparing it to something that's always on. In a much larger environment, your setup is not really practical for a lot of people collaborating.
Note that Databricks SQL Serverless these days can be provisioned in a few seconds.
> you are comparing it here to something that's always on
That's the point. Our org was told Databricks would solve problems we just didn't have. Serverful has some wonderful advantages: simplicity, (ironically) cost (versus something running just 3-4 hours a day but costing 10x), familiarity, reliability. Serverless also has advantages, but only if it runs smoothly, doesn't take an eternity to boot, isn't prohibitively expensive, and has little friction before using it - Databricks meets 0/4 of those criteria, with the additional downside of restrictive SQL due to the Spark backend, adding unnecessary refactoring/complexity to queries.
> your setup is not really practical to have a lot of people collaborating
Hard disagree. Our methods are simple and time-tested. We use git to share code (100x improvement on databricks' version of git). We share data in a few ways, the most common are by creating a table in a database or in S3. It doesn't have to be a whole lot more complicated.
I totally understand if Databricks doesn't fit your use cases.
But you are doing a disingenuous comparison here because one can keep a "serverful" cluster up without shutting it down, and in that case, you'd never need to wait for anything to boot up. If you shut down your EC2 instances, it will also take time to boot up. Alternatively, you can use the (relatively new) serverless offering from them that gets you compute resources in seconds.
To ensure I'm not speaking incorrectly (as I was going from memory), I grep'ed my several years' of databricks notes. Oh boy.. the memories came flooding back!
We had 8 data engineers onboarding the org to Databricks, and it took 2 solid years before they got to working on serverless (it happened because users complained about the user-unfriendliness of 'nodes', and managers about cost). But then there were problems. A common pattern through my grep of Slack convos is "I'm having this esoteric error where X doesn't work on serverless Databricks, can you help".. a bunch of back and forth (sometimes over days) and screenshots, followed by "oh, unfortunately, serverless doesn't support X".
Another interesting note is someone compared serverless databricks to bigquery, and bigquery was 3x faster without the databricks-specific cruft (all bigquery needs is an authenticated user and a sql query).
Databricks isn't useless. It's just a swiss army knife that doesn't do anything well, except sales, and may improve the workflows for the least advanced data analysts/scientists at the expense of everyone else.
This matches my experiences as well. Databricks is great if 1. your data is actually big (processing 10s/100s of terabytes daily), and 2. you don't care about money.
They are competitors and are similar. Snowflake popularized the cloud data warehouse concept (after AWS fumbled it big with Redshift). Databricks is the hot new tool.
BigQuery ELT. The org I went to was rather immature in their data practice, and I sold them on getting some proper orchestration (Dataform, their preference over dbt, plus Airflow) and keeping the architecture coherent.
I'd have rather stuck with Spark just because I prefer Scala or Python to SQL (and that comes with e.g. being far easier to unit test), but life happened and that ecosystem was getting disrupted anyway.
Databricks is trying hard to get into serverless, but it seems like they refuse to allow it to actually be cheaper, which defeats the purpose of serverless.
You will all be forced to go serverless because new grads can't use the command line. Running a database is about the hardest thing you can do. If it's serverless, you don't need special skills; preventing employees from becoming valuable lowers costs across the board.
When running a service, databases are the hardest to run. K8S still doesn't handle them well (this is by design), so they are the first thing to get outsourced to a managed service.
This is me being less jaded. Support those little wins!
There are so many gotchas. I'm getting so tired of working around it, but my company is all in on serverless so the pain will continue. A lot of it is tied up with Unity Catalog shortcomings, but Serverless and UC are basically joined at the hip.
A few just off the top of my head:
* You can't .persist() DataFrames in serverless. Some of my work involves long pipelines that wind up with relatively small DFs at the end of them, but I need to do several things with that DF. Nowhere near as easy as just caching it (a rough workaround is sketched after this list).
* Handling object storage mounted to Unity Catalog can be a nightmare. If you want to support multiple types of Databricks platforms (AWS, Azure, Google, etc.), then you will have to deal with the fact that you can't mount one type's object storage with another. If you're on Azure Databricks, you can't access S3 via Unity Catalog.
* There's no API to get metrics like how much memory or CPU was consumed for a given job. If you want to handle monitoring and alerting on it yourself, you're out of luck.
* For some types of Serverless compute, startup times from cold can be 1 minute or more.
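The workaround I mean for the .persist() point, sketched with made-up table names (on Unity Catalog you'd use a three-part name): materialize the small result once, then re-read it for each consumer.

```python
# Hypothetical sketch: substitute a one-time materialization for .persist()
# on serverless compute, where caching isn't available.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Stand-in for the end of a long pipeline that yields a small DataFrame.
df = spark.range(1000).selectExpr("id % 10 AS region", "id AS amount")

# On classic clusters this would just be: df = df.persist()
df.write.mode("overwrite").saveAsTable("tmp_small_result")
small = spark.table("tmp_small_result")

small.groupBy("region").count().show()                       # consumer #1
small.write.mode("append").saveAsTable("summary_by_region")  # consumer #2
```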
They're getting better, but Databricks is an endless progression of unpleasant surprises and being told "oh no you can't do it that way", especially compared to Snowflake, whose business Databricks has been working to chew away at for a while. Their Variant type is a great example. It's so much more limited than Snowflake's that I'm still learning new and arbitrary ways in which it's incompatible with Snowflake's implementation.
I had an interview with a senior data engineering candidate and we were talking about how expensive Databricks can get. :D I set up specific budget alerts in Azure just for Databricks resources in DEV and PROD environments.
Basically, they separate compute and storage into different components, whereas traditional PG uses both compute and storage on the same server.
Because of this separation, the compute (e.g. SQL parsing) can be scaled independently, and the storage can do the same, using for example AWS S3.
So if your SQL query is CPU-heavy, Neon can just add more "compute" nodes while the "storage" cluster remains the same.
To me, this is similar to the usual microservice setup where you have an API service and a DB; the difference is that Neon is purposely running a DB on top of that structure.
So how is this distributed Postgres still an ACID-compliant database? If you allow multiple nodes to query the same data, isn't this likely just Trino/an OLAP tool using Postgres syntax? Or did they rebuild Postgres and not upstream anything?
It's only serverless in the way it commits transactions to cloud storage, making the server instance ephemeral; otherwise it has a server process with compute and in-memory buffer pool almost identical to pg, with the same overheads.
You shouldn't be getting downvoted. Serverless is nothing more than hype meant to overcharge you, versus running it on a server you own.
That's a reductionist view of a technical aspect based on the way it's sold. Serverless is VMs that launch and shut down extremely quickly, so much so that they open up new ways of using said compute.
You can deploy serverless technologies in a self hosted setup and not get "overcharged". Is a system thread bullshit marketing over a system process?
Okay now I am concerned. We're using Neon. We can move easily at this point, but I'm sure they have huge customers storing many terabytes of data where this may be genuinely hard to do.
I went to Archive.org and figured out that in 2023, they announced they were shutting down on May 30th, all databases shutdown on June 30th, only available for downloads after that, and deleted on July 30th.
Same boat here. Not really looking to have to move but I'm incredibly thankful that I never integrated with Neon more than using Postgres. I don't depend on/need their API or other branching features.
I hate that this is what I've become, I want to try some of the cool features "postgres++" providers offer but I actively avoid most features fearing the potential future migration. I got burned using the Data API on Aurora Serverless and then leaving them and having to rewrite a bunch of code.
They aren't exactly hiding it. I kept my eye on bit.io because they looked very promising. Next day, gone. Shut down immediately. Something is fucky with the investment pipeline, because it's not "worth" that much on its own; it's a market dominance play, bad for innovation.
I've been seriously considering neon for a new application. This definitely gives me pause... maybe plain ol' Postgres is going to be the winner for me again.
Can't speak for anyone but myself, and my experience anecdotally, having used Databricks: I consider them the Oracle of the modern era. Under no circumstances would I let them get their hooks into any company where I have the power to prevent it.
Why do you think so? The Databricks notebook product I have used at a couple of companies is pretty solid. I haven't done any Google research, but they are generally known to be a very high-talent-density kind of place to work.
Serverless in the context of Postgres means to decouple storage and compute, so you could scale compute "infinitely" without setting up replica servers. This is what Neon offers, where you can just keep hitting their endpoints with your pg client and it should just take whatever load (in principle) and bill you per request.
Supabase gives you a server that runs classic Postgres in a process. Scaling in this scenario means you increase your server's capacity, with a potential downtime while the upgrade is happening.
You are confusing _managed_ Postgres for _serverless_.
I haven't studied the CLA situation enough to know if a rug pull is on the table, but OpenTofu and Valkey have shown that where there's a will, there's a way.
The whole point to you, but the whole point to me was having scale-to-zero because Aurora Serverless hurp-durp-ed on that. And I deeply enjoy the ability to fix bugs instead of contacting AWS Support with my hat in my hand asking to be put on some corporate backlog for 2073
Thankfully, you can continue to pay Databricks whatever they ask for the privilege of them hosting it for you
With later-stage companies, the potential IPO is a benefit, not a deterrent. Recruiters and hiring managers will hint at a potential IPO being not far off as an incentive to join. It minimizes risk, and they do the same for a potential target's founders, like Neon's here.
This is better than earlier-stage startups: while there you get far better multiples, it is also quite possible that you are let go somewhere into the cycle without the money to exercise the options for tax reasons, and there is a short exercise window on exit.
For this reason, companies these days offer a 5/10-year post-departure exercise window as a more favorable offer.
——
For founders, it gives them a shorter window to an exit than going it alone, and in a revenue-light, tech-heavy startup like Neon (compared to Databricks) the value risk is reduced, because the stock they get in the acquisition is backed by real revenue and growth, not by early-stage product traction, which is all Neon would have today.
There is also usually a cash component, which is enough for the core things most founders look at: buying a house in the few-million range, closing mortgages, or investing in a few early-stage projects directly or through funds.
No, basically it is a buyback of employee options and stock.
Many companies raise money only to give liquidity to founders/employees and some early investors, even if they don't need money for operations at all.
While Databricks is large, there are much bigger companies that would have IPOed at smaller sizes in the past but are delaying today (and may never go public). Stripe and SpaceX are the biggest examples: both have healthy positive cash flows but don't see the value in going public. Buying back shares and options is the only route to keeping early-stage employees happy if you have no IPO plans.
Well this isn't great news. I quite enjoy using Neon but I doubt it's going to continue to cater to people like me if it's bought by Databricks (from the little I know about them and from looking at their website).
Thankfully, I just need "Postgres", I wasn't depending on any other features so I can migrate easily if things start going south.
Neon is an interesting product, and they've got some great Postgres engineers. Having said that, 1-second cold starts are still quite painful for a website/web app.
I hope the $19 plans are there to stay - but I somewhat doubt it.
Cold starts are 500ms on average, and that only applies to the first call that wakes the DB from hibernation. People still seem to think this latency happens on every call (see other threads here), but once the service has woken up (cold start over), you're back to regular (sub-10ms) latency timings, and the service continues to run that way. You'll only hit a cold start again if your service goes idle for > 5 min (and you have this option turned on). You can turn scale-to-zero off and you'll run 24/7, with zero cold starts.
$19 plan is going away, will launch a better $5 plan soon.
I use neon quite a bit, profiling seems to show ~600-980ms of extra latency. This is in the AWS London region, on postgres 15/16.
Regardless, if I've got a website that's used a couple of times an hour, every hour, then the practical reality is that almost all users see an extra second of latency or so.
I'm not complaining, it's a great product that I'll continue to use, but it's the biggest pain point.
Congrats to the Neon team - they make an awesome product. That’s about all the good I can say here. I don’t blame them for selling out. It’s always felt like a “when” not an “if”. I would be surprised if you can make money selling cloud databases - especially when funded by VCs.
What’s with all these Postgres hosting services being worth so much now?
Someone at AWS probably thought about this (easy-to-provision serverless Postgres), and they just didn't build it.
I’m still looking for something that can generate types and spit it out in a solid sdk.
It's amazing this isn't a solved problem. A long, long time ago, I was part of a team trying to sort this out. I'm tempted to hit up my old CEO and ask him what he thinks.
The company is long gone…
If anything we tried to do way too much with a fraction of the funding.
In a hypothetical almost movie like situation I wouldn’t hesitate to rejoin my old colleagues.
The issue then, as it is today, is that applications need backends. But building backends is boring, tedious, and difficult.
Maybe a NoSql DB that “understands” the Postgres API?
Building backends is easy. It is sort of weird: in 2003 no one would bat an eyelid at building an entire app and chucking it on a server. I guess front-end complexity has made that a specialism, so with all that dev energy drained, they have no time for the backend. The backend is substantially easier, though!
These high-value startups timed it well to capture vibe coding (previously known as building an MVP), front-end culture, and the sheer volume of internet use and developers.
Django on Render (and presumably Heroku) just works.
It's still much more work than just dropping in a Firebase URL. Firebase can lead to poor design choices and come back to bite you, but hopefully by then you've already raised a few VC rounds and you're rolling in dough.
DSQL is genuinely serverless (much more so than "Aurora Serverless"), but it's a very long way from vanilla Postgres. Think of it more like a SQL version of DynamoDB.
Supabase is not just hosted Postgres; it's a full(ish) backend stack built on open source components, comparable with something like Firebase. But being Postgres, it encourages sane data modeling (and provides an escape hatch). Their type generation and SDK are quite good, too. It's one of my favorite services and powers two projects of mine, soon to be three.
Firebase lets you write functions in normal Node.js and Python.
Supabase only supports Deno. The quirkiness is in my own server-side logic, tbf. I've tried to build this project at least 4 times and I might need to take a step back.
"Easy to provision" is mostly a strategic feature for acquiring new users/customers. The more difficult parts of building a database platform are reliability and performance, and it can take a long time to establish a reputation for having these qualities. There's a reason why most large enterprises stick to the hyperscalers for their mission-critical workloads.
That reason also includes SOC2, FedRAMP, data at rest jurisdiction, availability zones etc. And if large enough you can negotiate the standard pricing.
For sure. And oftentimes these less sexy features or certifications are much more cumbersome to implement/acquire than the flashy stuff these startups lead with
1. An acquihire (if you're a Neon customer this would probably be a bad outcome for you).
2. A growth play. Neon will be positioned as an 'application layer' product offered cheap to bring SaaS startups into the ecosystem. As those startups grow and need more services, sell them everything else.
I am fairly new to all these data pipeline services (Databricks, Snowflake, etc.).
Say right now I have an e-commerce site with 20K MAU. All metrics go to Amplitude, and we can use that to see DAU, retention, and purchase volume. At what point in my startup's lifecycle do we need to enlist these services?
A non-trivial portion of my consulting work over the past 10 years has been working on data pipelines at various big corporations that move absurdly small amounts of data around using big data tools like spark. I would not worry about purchasing services from Databricks, but I would definitely try to poach their sales people if you can.
Just curious: what would you consider "absurdly small amounts of data" for big data tools like Spark, and what do you recommend instead?
I recently worked on some data pipelines with Databricks notebooks a la Azure Fabric. I'm currently using ~30% of our capacity and starting to get pushback to run things less frequently to reduce the load.
I'm not convinced I actually need Fabric here, but the value for me has been that it's the first time the company has been able to provision a platform that can handle the data at all. I have a small portion of it running into a database as well, which has drawn constant complaints about volume.
At this point I can't tell if we just have unrealistic expectations about the costs of having this data that everyone wants, or if our data engineers are just completely out of touch with the current state of the industry, so Fabric is just the cost we have to pay to keep up.
One financial services company has hundreds of Glue jobs that are using pyspark to read and write less than 4GB of data per run. These jobs run every day.
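For jobs in that size range, a single process usually does fine; here is a minimal sketch with DuckDB (bucket paths and columns are made up), which covers a surprising share of these workloads:

```python
# Hypothetical sketch: a daily aggregate over a few GB of parquet on S3,
# done in one process instead of a Spark cluster.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")  # assumes S3 credentials are configured

con.execute("""
    COPY (
        SELECT customer_id, sum(amount) AS total
        FROM read_parquet('s3://example-bucket/trades/2025-05-*.parquet')
        GROUP BY customer_id
    ) TO 's3://example-bucket/aggregates/daily_totals.parquet' (FORMAT parquet)
""")
```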
Of all the billion-scale investment and acquisition news of the last 24 hours, this is the only one that makes sense. Especially after the record-breaking $15B round that Databricks closed last year.
More like "single process application's database".
There are interesting use cases for DB-per-user which can be server or client side, or litestream's continuous backup/sync that can extend it beyond this use case a bit too.
You _can_ use SQLite as your service's sole database, if you vertically scale it up and the load isn't too much. It'll handle a reasonable amount of traffic. Once you hit that ceiling though, you'll have to rethink your architecture, and undergo some kind of migration.
The common argument for SQLite is deferring complexity of hosting until you've actually reached the type of load you have to use a more complex stack for.
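In that spirit, a minimal sketch of the single-file setup (Python's stdlib here, but the same pragmas apply from any language): WAL mode lets readers proceed while a write is in flight, which is what carries most small apps a long way.

```python
# Minimal embedded-database setup: one SQLite file, WAL mode.
import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("PRAGMA journal_mode=WAL;")    # readers don't block the writer
conn.execute("PRAGMA synchronous=NORMAL;")  # common durability/speed trade-off
conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

with conn:  # one transaction
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

print(conn.execute("SELECT count(*) FROM users").fetchone()[0])
```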
Enterprises have lots of data. They store it somewhere, and there are multiple vendors that provide such "credible" infrastructure for this type of storage. Think of it like, your dad says he's willing to get a dog, but only trusts these-five-animal-shelters and nothing else. That doesn't mean that's correct (that those are the only places to get a dog), it just means that's what he trusts. Databricks is most likely a unicorn because they have successfully sold the idea that they are one of those trusted vendors, like Snowflake.
The truth of the 2010s up until now is that every startup was a massive sales con job. The wealth of this industry is not truly built on incredible tech, but on the audacity of salesmanship. It's a billion-dollar con job. That's one of the reasons I take every ridiculous startup that launches quite seriously, because you have no idea just how audacious their sales people are. They can sell anything.
Your question is very fundamental, and the answer is just as raw and fundamental too. I would love it if some of these sales people actually reform and write tell-alls about how they conned so many large companies in their years of working. This content has got to be out there somewhere.
So, I'm not sure if this is less cynical or more cynical, but.. have you ever talked to the decision-makers who buy something like databricks?
They can't build it themselves, and it's highly dubious that they'd be able to hire and supervise someone to build it. Databricks may be selling "nothing special", but it's needed, and the buyers can't build it themselves.
The thing is, it's actually a very difficult engineering/research/infra problem to run complicated queries on enormous data lakes. All the obvious ways to do it are prohibitively slow and expensive. Every bit of performance you can squeeze out of this, you unlock the ability for people to work with their data more easily. So there is huge value in having some centralized companies sink lots of R&D into trying to solve these problems well.
I can tell you the company I work at (4000 people, legacy banking IT) has 4 people running our Datalake. We likely have more people buying/"evaluating" Databricks currently (from overhearing calls in open-plan offices), so I guess they have a point. A very sad point...
My mental model is that there are a few big money-printing industries, and the major players in them will pay just about anything for a slight advantage. It's not really about additive revenue; it's about protecting market share.