OverIQ.com

Data migrations in Django

Last updated on July 27, 2020


At this point, our SQLite database consists of a small number of languages. However, we started with the goal to allow users to create snippets for a wide variety of languages. As things stand, other than manually creating Language objects one by one, we have no better way of populating these records.

Furthermore, if we deploy with a more robust database such as PostgreSQL, we will have to enter the languages one by one all over again.

What we need is a way to automatically load "system data" so that our application can run successfully no matter whether we are in development or production.

Data Migrations

A data migration is similar to the ordinary migrations we learned about in the lesson Basics of Migrations, but instead of altering the database schema, it changes the data in the database.

The following are two common use cases for data migrations:

  1. Load essential data so that your application can operate correctly (this is what we need).
  2. Update existing data after the models change.

If you try to create a new migration via the makemigrations command, you will get "No changes detected".

$ ./manage.py makemigrations
No changes detected

This is because we haven't changed any models since the last time we ran the makemigrations command.

So how do we create a data migration?

We can force Django to create an empty migration file using the --empty option followed by the name of the app.

$ ./manage.py makemigrations --empty djangobin
Migrations for 'djangobin':
  djangobin/migrations/0017_auto_20180430_1637.py

This will create a time-stamped empty migration in the migrations subdirectory of the djangobin app.

The migration file we just created is of special significance to us. However, the name of the migration file gives a false impression that it is an ordinary migration. Let's rename the file to reflect what it does.

$ cd djangobin/migrations
$ mv 0017_auto_20180430_1637.py language_data.py

Open the migration file and it should look like this:

djangobin/django_project/djangobin/migrations/language_data.py

# -*- coding: utf-8 -*-
# Generated by Django 1.11 on 2018-04-30 16:37
from __future__ import unicode_literals

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('djangobin', '0016_auto_20180430_1618'),
    ]

    operations = [
    ]

As we discussed in the Migrations in Django chapter, the operations list is where the meat of the migration lies, but currently it is empty. Django comes with many built-in operations, some of which we have already seen in that chapter. The operation we are interested in is called RunPython.

The RunPython operation allows us to run arbitrary Python code from the migration file. It takes a forward function and, optionally, a backward function. The forward function is executed when the migration is applied and the backward function is executed when the migration is rolled back. If the backward function is omitted, the migration cannot be rolled back.
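The forward/backward contract can be pictured with a toy stand-in. This is a plain-Python sketch, not Django's implementation: the class name RunPythonSketch, its apply() and unapply() methods, and the set used as a data store are all made up for illustration. Only the parameter names code and reverse_code mirror the real RunPython signature.

```python
class RunPythonSketch:
    """Toy stand-in for migrations.RunPython (illustration only)."""

    def __init__(self, code, reverse_code=None):
        self.code = code                  # forward function
        self.reverse_code = reverse_code  # backward function (optional)

    def apply(self, store):
        # Applying the migration runs the forward function.
        self.code(store)

    def unapply(self, store):
        # Rolling back runs the backward function, if there is one.
        if self.reverse_code is None:
            raise RuntimeError("operation has no backward function")
        self.reverse_code(store)


def add_bash(store):
    store.add("bash")


def remove_bash(store):
    store.discard("bash")


languages = set()  # hypothetical stand-in for the Language table

reversible = RunPythonSketch(add_bash, remove_bash)
reversible.apply(languages)
state_after_apply = set(languages)       # {'bash'}
reversible.unapply(languages)
state_after_rollback = set(languages)    # set()

# Without a backward function, the rollback is refused.
one_way = RunPythonSketch(add_bash)
try:
    one_way.unapply(languages)
    rollback_error = None
except RuntimeError as exc:
    rollback_error = str(exc)
```

A well-behaved backward function undoes exactly what the forward function did, so applying and then rolling back leaves the data where it started.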

Let's define a RunPython operation. Open language_data.py and modify it as follows:

djangobin/django_project/djangobin/migrations/language_data.py

from django.db import migrations


LANGUAGES = [
    {
        'name': 'Bash',
        'lang_code': 'bash',
        'slug': 'bash',
        'mime': 'application/x-sh',
        'file_extension': '.sh',
    },
    {
        'name': 'C',
        'lang_code': 'c',
        'slug': 'c',
        'mime': 'text/x-chdr',
        'file_extension': '.c',
    },
    {
        'name': 'C#',
        'lang_code': 'c#',
        'slug': 'c-sharp',
        'mime': 'text/plain',
        'file_extension': '.aspx',
    },
    {
        'name': 'C++',
        'lang_code': 'c++',
        'slug': 'cpp',
        'mime': 'text/x-c++hdr',
        'file_extension': '.cpp',
    },
    #...
]


# Forward function: runs when the migration is applied.
def add_languages(apps, schema_editor):
    # Use the historical model, not a direct import from models.py.
    Language = apps.get_model('djangobin', 'Language')

    for lang in LANGUAGES:
        # get_or_create() returns an (object, created) tuple.
        result = Language.objects.get_or_create(
            name=lang['name'],
            lang_code=lang['lang_code'],
            slug=lang['slug'],
            mime=lang['mime'],
            file_extension=lang['file_extension'],
        )

        print(result)


# Backward function: runs when the migration is rolled back.
def remove_languages(apps, schema_editor):
    Language = apps.get_model('djangobin', 'Language')

    for lang in LANGUAGES:
        # filter().delete() does not raise if the row is already gone.
        Language.objects.filter(
            lang_code=lang['lang_code'],
        ).delete()


class Migration(migrations.Migration):

    # adjust the dependencies list to refer to the correct migration file

    dependencies = [
        ('djangobin', '0016_auto_20180430_1618'),
    ]

    operations = [
        migrations.RunPython(
            add_languages,
            remove_languages
        )
    ]

Note: The code is truncated to save space. Remember, you can always view the full source code in the GitHub repo.

The forward and backward functions take two arguments: the app registry (an instance of django.apps.registry.Apps) and a SchemaEditor.

The app registry holds the historical versions of all your models, matching the point in your migration history where this migration sits. The SchemaEditor is what Django uses to apply schema changes to the database.

Inside a data migration, you should always use the historical version of the model because the current version might have changed in the interim. Django builds the historical model from the migration files. To load it, we use the get_model() method, which takes the app label and the model name as arguments.
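To make the idea of a historical model more concrete, here is a simplified plain-Python sketch of rebuilding a model's field list by replaying earlier migrations up to a given point. The migration names, the HISTORY structure and the fields_at() helper are all hypothetical; Django's real machinery is considerably more involved.

```python
# Hypothetical, simplified record of schema migrations for one model:
# (migration name, (action, field name))
HISTORY = [
    ('0001_initial', ('add', 'name')),
    ('0002_add_lang_code', ('add', 'lang_code')),
    ('0003_add_mime', ('add', 'mime')),
    ('0004_remove_mime', ('remove', 'mime')),
]


def fields_at(history, target):
    """Replay migrations up to and including `target`; return the fields."""
    fields = []
    for name, (action, field) in history:
        if action == 'add':
            fields.append(field)
        elif action == 'remove':
            fields.remove(field)
        if name == target:
            return fields
    raise LookupError(f"unknown migration: {target}")


# A data migration that sits right after 0003 sees the 'mime' field,
# even though the latest version of the model no longer has it.
print(fields_at(HISTORY, '0003_add_mime'))     # ['name', 'lang_code', 'mime']
print(fields_at(HISTORY, '0004_remove_mime'))  # ['name', 'lang_code']
```

This is why importing the model directly from models.py inside a migration is fragile: the imported class always reflects the latest code, not the schema that exists when the migration runs.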

Django uses historical models all the time but this is the first time we need to understand how it works.

Whenever you run the makemigrations command, Django compares the current version of the models with the historical version reconstructed from the migration files to figure out what needs to be added, updated or removed in the database, and then creates a migration file based on the changes it finds.
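The comparison can be sketched roughly as a diff between two field lists. This is only an illustration under heavy simplification: the detect_changes() function is hypothetical and looks at field names alone, whereas Django also compares field types, options, model metadata and more. AddField and RemoveField are names of real Django operations, used here only as labels.

```python
def detect_changes(historical, current):
    """Return the operations needed to go from `historical` to `current`."""
    added = sorted(set(current) - set(historical))
    removed = sorted(set(historical) - set(current))
    return (
        [('AddField', field) for field in added]
        + [('RemoveField', field) for field in removed]
    )


# Nothing changed since the last migration: an empty list corresponds
# to the "No changes detected" message.
print(detect_changes(['name', 'lang_code'], ['name', 'lang_code']))
# []

# The model gained 'slug' and lost 'mime'.
print(detect_changes(['name', 'mime'], ['name', 'slug']))
# [('AddField', 'slug'), ('RemoveField', 'mime')]
```

Because a data migration changes rows rather than model definitions, a diff like this comes back empty, which is why the empty migration file had to be requested explicitly.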

Our data migration is ready. To apply it, execute the following command:

$ ./manage.py migrate

The output will be something like this:

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, djangobin, sessions
Running migrations:
  Applying djangobin.0008_language_data...(<Language: Language object>, False)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, False)
(<Language: Language object>, False)
(<Language: Language object>, True)
(<Language: Language object>, False)
(<Language: Language object>, False)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
(<Language: Language object>, True)
 OK
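Each line printed during the migration is the tuple returned by get_or_create(): the object, followed by a flag that is True when a new row was created and False when a matching row already existed. The following is a rough plain-Python sketch of that behaviour, assuming a list of dicts in place of a database table. The function and variable names are illustrative; only the defaults parameter name is borrowed from Django's get_or_create().

```python
def get_or_create(table, defaults=None, **lookup):
    """Plain-Python sketch of get_or_create semantics over a list of dicts."""
    for row in table:
        if all(row.get(key) == value for key, value in lookup.items()):
            return row, False          # found: nothing created
    row = {**lookup, **(defaults or {})}
    table.append(row)
    return row, True                   # not found: new row created


# One row already exists, as if it had been entered by hand earlier.
table = [{'name': 'Bash', 'lang_code': 'bash'}]

first = get_or_create(table, lang_code='bash', defaults={'name': 'Bash'})
second = get_or_create(table, lang_code='c', defaults={'name': 'C'})
third = get_or_create(table, lang_code='c', defaults={'name': 'C'})

print(first[1], second[1], third[1])  # False True False
```

Because existing rows are returned rather than duplicated, running the forward function against a database that already contains some of the languages is safe, which matches the mix of True and False flags in the output above.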