🚜 Refactoring and fiddling with Django migrations for pending pull requests 🐘

One of Django’s most powerful features is the ORM, which includes a robust migration framework. One of Django’s most misunderstood features is Django migrations because it just works 99% of the time.

Even when working solo, Django migrations are highly reliable, working 99.9% of the time and offering better uptime than most web services you may have used last week.

The most common stumbling block for developers of all skill levels is rolling back a Django migration and prepping a pull request for review.

I’m not picky about pull requests or git commit history because I default to using the “Squash and merge” feature to turn all pull request commits into one merge commit. The merge commit tells me when, what, and why something changed if I need extra context.

I am pickier about seeing >2 database migrations for any app unless a data migration is involved. It’s common to see 4 to 20 migrations when someone works on a database feature for a week. Most of the changes tend to be fiddly, where someone adds a field, renames the field, renames it again, and then starts using it, which prompts another null=True change followed by a blank=True migration.

For small databases, none of this matters.

For a database with 10s or 100s of millions of records, these small changes can cause minutes of downtime per migration, which amounts to a throwaway change. While there are ways to mitigate most migration downtime situations, that’s different from my point today.

I’m also guilty of being fiddly with my Django model changes because I know I can delete and refactor them before requesting approval. The process I use is probably worth sharing because once every new client comes up.

Let’s assume I am working on Django News Jobs, and I am looking over my pull request one last time before I ask someone to review it. That’s when I noticed four migrations that could quickly be rebuilt into one, starting with my 0020* migration in my jobs app.

The rough steps that I would do are:

# step 1: see the state of our migrations
$ python -m manage showmigrations jobs
jobs
 [X] 0001_initial
 ...
 [X] 0019_alter_iowa_versus_unconn
 [X] 0020_alter_something_i_should_delete
 [X] 0021_alter_uconn_didnt_foul
 [X] 0022_alter_nevermind_uconn_cant_rebound
 [X] 0023_alter_iowa_beats_uconn
 [X] 0024_alter_south_carolina_sunday_by_four

# step 2: rollback migrations to our last "good" state
$ python -m manage migrate jobs 0019

# step 3: delete our new migrations
$ rm jobs/migrations/002*

# step 4: rebuild migrations 
python -m manage makemigrations jobs 

# step 5: profit 
python -m manage migrate jobs

95% of the time, this is all I ever need to do.

Occasionally, I check out another branch with conflicting migrations, and I’ll get my local database in a weird state.

In those cases, check out the --fake (“Mark migrations as run without actually running them.") and --prune (“Delete nonexistent migrations from the django_migrations table.") options. The fake and prune operations saved me several times when my django_migrations table was out of sync, and I knew that SQL tables were already altered.

What not `squashmigrations`?

Excellent question. Squashing migrations is wonderful if you care about keeping every or most of the operations each migration is doing. Most of the time, I do not, so I overlook it.

Jeff Triplett's Micro.blog

Django

Python

🚜 Refactoring and fiddling with Django migrations for pending pull requests 🐘

What not `squashmigrations`?

What not squashmigrations?

What not `squashmigrations`?