Hacker News
Zero downtime migrations (kiranrao.ca)
187 points by KiranRao0 on May 5, 2022 | hide | past | favorite | 37 comments



If you are using MySQL I also recommend Percona's Online Schema Change [1], which is basically a robust implementation of this post's algorithm: it handles more edge cases, has existed for over 10 years, and works better out of the box with AWS RDS than gh-ost. (If you have production experience with gh-ost on RDS against tables with hundreds of millions of rows, please share your experience/tricks, I'd love to hear them!)

[1] https://www.percona.com/doc/percona-toolkit/3.0/pt-online-sc...


I have been using gh-ost with RDS for years now with no issues. What problems did you have?


Actually I never tried it, but I was scared off by the fine print about GH not using RDS themselves [1], and by gh-ost relying on lower-level features that might not be easily available on RDS. I also had the impression you have to set up a regular non-RDS replica attached to your RDS master?

[1] https://github.com/github/gh-ost/blob/master/doc/rds.md


At my previous employer we ran it just fine from an EC2 instance talking to a MySQL RDS database. At some point the triggers from pt-osc were really hurting when trying to migrate a big (several-billion-row) table with heavy traffic, and gh-ost really helped with that. That said, gh-ost is definitely a big-boy tool and it took us a few tries to get the settings just right. No downtime or anything, we just had to delete the new table and start over. Still annoying when it wastes several hours :)


GH no longer uses RDS. They had some services which relied on it maybe 3 years ago. Migrations were performed with gh-ost, but if replicas weren't available (on RDS they were not) it would fall back to plain MySQL binaries.

GH is migrating from gh-ost in general, as it’s not even used in the enterprise products (though GitHub.com relies on it, the next iteration will phase it out)


It works with a standard RDS (not aurora) replica. There are a few constraints when using gh-ost but it's all documented.


It also works fine with Aurora, as long as you have binary logging enabled.


That's true. However, if you don't already have any "traditional" (binlog-based) replicas on Aurora, you might not even need gh-ost or pt-osc in the first place -- at least for many common ALTER operations.

Aurora supports physical replication at the storage level, which means you can use MySQL online DDL (ALGORITHM=INPLACE, LOCK=NONE) without having to worry about replication lag. Recent blog post with findings from Percona: https://www.percona.com/blog/zero-impact-on-index-creation-w...

Caveats:

* This is safe on Aurora 3 due to MySQL 8's atomic DDL support. On previous versions, there may be crash-safety risks, which make external tools safer for altering very large tables.

* External OSC tools can provide other advantages: throttling, safer cancellation, the ability to time the final table switchover, the ability to roll back (pt-osc supports "reverse" triggers), etc.

* Some forms of ALTER TABLE don't support online DDL, and for those you'd still need an external OSC tool like gh-ost or pt-osc, but those are less common.


^ Yes. Percona's Online Schema Change is the best tool on the market. Replication checking, setup with triggers, online, etc.


Something I'm doing in my CV application is using 1 SQLite DB file per user.

When a user logs in, a check is done: "Does this user's DB file have the latest migration?" If not, the migrations are applied. That way, you only get a slight delay as a user when a migration is needed. No other user's DB file is affected. More technical details are in the FAQ: https://withoutdistractions.com/cv/faq
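A minimal sketch of that login-time check, assuming the per-user DB tracks its schema version with SQLite's `user_version` pragma (the migration list and table names here are invented for illustration):

```python
import sqlite3

# Hypothetical ordered migration list: entry N upgrades the schema to version N+1.
MIGRATIONS = [
    "CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE profile ADD COLUMN email TEXT",
]

def open_user_db(path):
    """Open one user's DB file and apply any pending migrations on login."""
    conn = sqlite3.connect(path)
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, stmt in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(stmt)
        conn.execute(f"PRAGMA user_version = {version}")  # record progress
    conn.commit()
    return conn
```

Only the logging-in user pays the migration cost; every other user's DB file is untouched.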

In terms of the article: I'm only changing 1 tire at a time at 100mph, not all of them.

PS: I recently did a "Show HN" about the app, it got some interesting feedback: https://news.ycombinator.com/item?id=31246696


> Something I'm doing in my CV application is using 1 SQLite DB file per user.

That's the "easy" case where most (all?) data is easily separable/shardable, in this case by user. Once you start having relationships between users (messages, likes, groups...) everything starts getting warty and you need approaches like that of TFA.


Correct! I didn't mean to say it's a good fit for every use case.

For me, it works. For someone else, it may be a terrible idea. Always good to think things through, and there's no shame in going for something more commonly used. Boring isn't bad, it's often a good choice.


This seems great for applications where you (as a dev) have absolutely no need to ever access any of your users' data. However, if you have a process that needs to grab some users' data, you'll have no choice but to make sure all users have the same DB schema (or centralize everything in one database).


Wait, are you running this site on a Raspberry Pi?

The biggest issue with SQLite is: what happens if your server goes down mid-write? A power surge, maybe. Do you lose data from that day back to the previous cronjob backup?


> The biggest issue with SQLite is what happens if your server serving stuff goes down mid write?

Why would that matter? SQLite is still fully ACID compliant like other DBs.
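A quick illustration of that atomicity using Python's sqlite3 module: an exception (standing in for a crash) between BEGIN and COMMIT rolls the whole transaction back, leaving no partial write behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()

try:
    with conn:  # opens a transaction; rolls back if an exception escapes
        conn.execute("INSERT INTO t VALUES (1)")
        raise RuntimeError("simulated crash mid-write")
except RuntimeError:
    pass

# The interrupted write left no partial state behind.
count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
assert count == 0
```

A real power loss is handled the same way by SQLite's journal/WAL: on the next open, any uncommitted transaction is rolled back.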


I imagine that GP means a hardware failure severe enough that it permanently borks the disk drive of the machine. In that case vanilla SQLite would indeed lose data written since the time of the last offsite backup. MySQL/Postgres/etc typically have streaming replication which means you can promote one of the replicas and not lose any (or at least much less) data. There are some new-ish projects like Litestream (https://litestream.io/) that provide a similar functionality for SQLite by streaming the WAL file changes to S3.


The "boots up our application within a second on a Raspberry Pi" comment in the FAQ[0] is actually about my development setup, as I don't own a fancy computer myself. It was mainly meant to illustrate the fast startup time of the Janet[1] language. In fact, that startup time is so fast, you can run your app as a CGI script. That has some more possible advantages, see the FAQ for more details.

[0] FAQ: https://withoutdistractions.com/cv/faq#technology

[1] Janet: https://janet-lang.org/


They do gain some data security as they've effectively sharded their database by user. So at the point of a power failure only the active users' data would be at risk of corruption at the file level.

Of course there's always the risk of damage to the media in general, but then you can never fully protect against hardware failures.


I dunno... people seem to like zero-downtime migrations, but really, for most companies a little downtime or degraded functionality doesn't matter. Most applications aren't even large enough for the downtime to be more than a couple of minutes.

Having some scheduled downtime saves you a lot of complexity of writing, monitoring, and finalizing these migrations. It also makes it a lot easier to have a consistent state in terms of code+data.

The article doesn't mention how to deal with different data models, constraints, etc.


Even if it takes a little more time to do the migration online, it's way less stressful. If something goes wrong during any stage, just roll back and figure it out tomorrow. If something goes wrong when you have turned everything off, you pretty much have to solve the issue now and keep going forward, or do the entire rollback to get to a working state.


Sure, but 99% of databases are small enough to tolerate some degradation/downtime/exceptions. If something goes wrong, just roll back.

There's no difference in that. "Zero downtime migrations" like this only really cover adding columns.

Let's say you change a relation from n:1 to n:m... this is not gonna save you. You need to deploy a new version of the code. If you want to roll back, you might lose data, or some code won't work. It's just a mess, takes more time, is more error-prone.

Most companies are not "big tech".


Author here, and a few things I'd like to clear up.

> 99% of the databases are small enough to have some degration/downtime/exceptions

I agree that most DBs are small enough to perform the migration operation in a single transaction. However the choice to have downtime isn't solely an engineering question. It's also a product/business consideration.

> Let's say you change a relation from n-1 to a n-m.. this is not gonna save you. You need to deploy a new version of the code. If you want to roll back, you might loose data, or some code doesn't work. It's just a mess, takes more time, is more error prone.

Agreed. This article isn't meant to cover every possible migration, but it's a good starting point for most of them. It gives a framework for thinking about how to implement an n:1 -> n:m change.

> Most companies are not "big tech".

I'm not working in big tech. I'd consider myself working firmly within small tech. And these techniques would apply exactly the same even if we had exactly 1 API server and 1 small database instance.


I prefer the online approach regardless of downtime. Breaking the migration into incremental and backwards compatible steps is less stressful with smaller tail risk exposure than trying to do it as a single state transition. I think the parallel change approach seems more complicated at first but ends up taking less work and less effort in the long run because of increased safety and incrementalism.


It's simpler than that: "zero downtime" isn't free, and has a direct cost to implement - one you potentially pay on every single migration, and one you don't necessarily make any money back on.

In particular, most SaaS providers with a subscription model would be hard-pressed to care: they're not selling ads per click, so provided you don't lose users over it, there's zero value. In fact it's probably more valuable to take the downtime and use the savings in dev time and effort to ensure you have an expedient rollback and recovery strategy - because losing data is what will cost you users.


At least for Rails there are several gems available (e.g. https://github.com/WeTransfer/ghost_adapter or https://github.com/departurerb/departure) that seamlessly hook into the existing migration system and will run all eligible migrations through gh-ost or pt-osc as needed. You're right that it's not free, but it isn't all that far off either.

That said, online schema migrations are a specialized tool designed for very big tables that take hours to run an ALTER TABLE on. If all your tables are small enough that alterations take less than a second or so, don't bother and just block the table for a bit. It's fine.


> "zero downtime" isn't free, and has a direct cost to implement

Thank you for writing this. I was implicitly trying to convey this in the article, but glad to have it be explicit.


I used to believe this too, but the more online migrations I have done, the more I think it's actually the same amount of work in most cases, less work in the worst cases, and almost never more work.

All the steps you have to take here, you have to do anyway: write to new, read from new, how to translate, deleting old code. It's all the same; the only difference is you do it in chunks vs. all at once. It's perception. But all the components and architecting happen anyway.

Sure, you pay the cost of watching more deployments, but you also gain every step being automated, whereas offline migrations are often run once and never committed. Offline migrations are, like online ones, not free. If you have to take downtime, it's usually not during business hours. You have probably quoted a downtime window to customers, so you have a fixed window of time. You will be stressed. You will "practice", you will write a "plan". Even if everything goes well, those aren't free.

But in the worst case, now you have more problems. Let's say something went wrong: some customer data wasn't as you expected, and you have to bring services back up in 20 more minutes. Do you scramble and try to fix it, late at night with limited staff? Or do you roll back? If you roll back, then you have to do this all over again, but you probably need to wait at least a week, because no one wants two planned outages back to back.

With online, every step is "safe". So if you have bugs, no worries, the old way is still working! Maybe roll back the code, but no need to roll back the migration, just leave it in its current state. Take your time, fix it, don't move on till it's working.

But even if that doesn't convince you, the number one reason to do online migrations: no more late-night planned outages. Do everything during business hours. My employer doesn't get to intentionally make me work when I should be sleeping. If that means fewer features shipped, then so be it.
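The chunks described above (expand, dual-write, backfill, contract) could be sketched roughly like this. SQLite and the table/column names are stand-ins, purely illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ada'), ('alan')")

# Step 1 (expand): add the new column; old code keeps working untouched.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

# Step 2 (dual-write): deploy code that writes both old and new columns.
def create_user(conn, name):
    conn.execute(
        "INSERT INTO users (name, full_name) VALUES (?, ?)", (name, name)
    )

# Step 3 (backfill): translate old rows in small chunks to limit lock time.
def backfill(conn, chunk=1000):
    while True:
        cur = conn.execute(
            "UPDATE users SET full_name = name "
            "WHERE id IN (SELECT id FROM users "
            "             WHERE full_name IS NULL LIMIT ?)",
            (chunk,),
        )
        if cur.rowcount == 0:
            break

# Step 4 (contract): once all reads use full_name, drop the old column
# and delete the dual-write code.
```

Each step is independently deployable and independently rollback-able, which is where the "safe" property comes from.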


MySQL has some robust tooling in this space. Some of the tools use triggers to copy to a new table. GitHub's gh-ost[1] is probably the state of the art, and uses the binary log stream to replicate the data.

[1] https://github.com/github/gh-ost


Postgres has some less robust tooling; at least, both of these carry the "experimental" tag:

https://news.ycombinator.com/item?id=29825520

https://github.com/shayonj/pg-osc

But, still, they do it for you.


Anybody interested in this subject might also be interested in a tool for Postgres I’ve been working on, Reshape: https://github.com/fabianlindfors/reshape. It aims to fully automate away all the pain and manual steps that zero-downtime migrations normally require :)


> Now the most challenging part: explaining to Product why their seemingly small request took 3x longer than expected. If it helps you can send them this article. Good luck!

We had to solve this problem in a way that also took our self-hosted users into account. Essentially "changing Tires at 100mph" in environments we don't control. Still polishing it but will plug a post about it here if anyone finds it relevant:

https://posthog.com/blog/async-migrations


I think the PlanetScale folks have blogged well about how schema migration tools work, and their traffic-splitting and rewinding abilities are very nice.

https://docs.planetscale.com/learn/how-online-schema-change-...

https://planetscale.com/blog/its-fine-rewind-revert-a-migrat...


When I was first starting out writing applications, it was very common for the DBA in charge of a database to require all writes to go through stored procedures and all reads through views. It was at times onerous, but it made migrations like these a snap. With most migrations/RDBMSs it could even be atomic.


I thought of another trick. Haven't tried it, but it should work: where applicable, put each tenant in its own schema. If the tables are now very small by comparison, then a write lock is not that important. Simply migrate, and do a blue-green deployment between code versions. Done.

One downside is the downgrade part. If you still want it you have to do it as in the article.


Side question: could anyone suggest a tool for DB cloning in the context of on-commit preview environments?

I am only aware of this tool: https://github.com/postgres-ai/database-lab-engine

But it looks like too much manual work to do.


Ah, it's about databases... the title here immediately made me think of a much more hardcore "migration" effort: https://www.amusingplanet.com/2019/10/an-incredible-move-ind...


Keeps happening: something that had a name (schema evolution) gets given another name by people who never heard the first name. Recurse.



