Defining Schema in SQLAlchemy Core

Last updated on July 27, 2020

Creating tables #

Tables in SQLAlchemy are represented as an instance of the Table class. The Table constructor function accepts table name, metadata and one or more columns as arguments. Here is an example:

from sqlalchemy import MetaData, Table, String, Column, Text, DateTime, Boolean
from datetime import datetime

metadata = MetaData()

blog = Table('blog', metadata, 
    Column('id', Integer(), primary_key=True),
    Column('post_title', String(200), nullable=False),
    Column('post_slug', String(200),  nullable=False),
    Column('content', Text(),  nullable=False),
    Column('published', Boolean(),  default=False),
    Column('created_on', DateTime(), default=datetime.now)
    Column('updated_on', DateTime(), default=datetime.now, onupdate=datetime.now)
)

Let's step through the code line by line:

In line 1, we are importing several classes from the sqlalchemy package, which we will use to define the table.
In line 2, we are importing datetime class from the datetime package.
In line 4, we are creating a MetaData object. The MetaData object holds all the information about the database and the tables it contains. We use MetaData instance to create or drop tables in the database.
In lines 6-14, we are defining the table schema. The columns of a table are created using Column instance. The Column constructor function accepts name and type. We can also pass additional arguments to it to define constraints and SQL constructs. The following table lists some commonly used SQL constraints.

Constraint	Description
`primary_key`	A boolean argument, when set to `True`, marks the column as the primary of the table. To create a composite primary key, set `primary_key` to `True` on each column involved in the key.
`nullable`	A boolean argument when set to `False` adds `NOT NULL` constraint while creating a column. Its default value is `True`.
`default`	It specifies the default value to use in case the value for the column is not specified while inserting a new row. It can take a scalar value or a Python callable.
`onupdate`	It specifies the default value for the column if no value is provided to the column while updating the row. Just like the `default` keyword argument, it can take a scalar value or a Python callable.
`unique`	A boolean argument if set to `True` ensures that that values in the column must be unique.
`index`	If set to `True` creates an indexed column. Its default value is `False`.
`autoincrement`	It adds the `auto increment` option to the column. Its default value is `auto` which means that it will automatically increment primary key every time a new record is added. If you want to auto increment all the columns of involved in the composite primary key, set this argument to `True` on each of the columns. To disable the auto-increment counter set it to `False`.

Column Types #

The type dictates what kind of data a column can take. SQLAlchemy provides an abstraction for a wide range of types. Broadly speaking there are three categories of types.

Generic Types
SQL Standard Types
Vendor Specific Types

Generic Types #

The Generic types refer to the common types which major database backends support. When we use generic types, SQLAlchemy uses the best available type on the database backend while creating tables. For example, in the preceding snippet, we have defined published column as a Boolean. The Boolean is a generic type. If we run the code against PostgreSQL database then SQLAlchemy will use the boolean type provided by PostgreSQL. On the other hand, if we run the code against MySQL, then SQLAlchemy will use SMALLINT type because MySQL doesn't support Boolean type. In the Python code, however, Boolean generic type is represented using bool type (True or False).

The following table lists some generic column types provided by SQLAlchemy and its associated type in Python and SQL.

SQLAlchemy	Python	SQL
`BigInteger`	`int`	`BIGINT`
`Boolean`	`bool`	`BOOLEAN` or `SMALLINT`
`Date`	`datetime.date`	`DATE`
`DateTime`	`datetime.datetime`	`DATETIME`
`Integer`	`int`	`INTEGER`
`Float`	`float`	`FLOAT` or `REAL`
`Numeric`	`decimal.Decimal`	`NUMERIC`
`Text`	`str`	`TEXT`

We can access generic types either from sqlalchemy.types or sqlalchemy package.

SQL Standard Types #

The types defined in this category comes directly from the SQL Standard. Very small number of database backends support these types. Unlike generic types, SQL Standard Types are not guaranteed to work in all database backends.

Just like generic types, you can access these types either from sqlalchemy.types or sqlalchemy package. However, to differentiate them from generic types, names of Standard types are written in all uppercase letters. For example, SQL Standard defines a column of type Array. But currently, only PostgreSQL support this type.

from sqlalchemy import ARRAY

employee = Table('employees', metadata,
    Column('id', Integer(), primary_key=True),
    Column('workday', ARRAY(Integer)),
)

Vendor Specific Types #

This category defines types that are specific to the database backend. We can access vendor-specific types from sqlalchemy.dialects package. For example, PostgreSQL provides an INET type to store network addresses. To use it we first have to import it from sqlalchemy.dialects package.

from sqlalchemy.dialects import postgresql

comments = Table('comments', metadata,
    Column('id', Integer(), primary_key=True),
    Column('ipaddress', postgresql.INET),
)

Defining Relationships #

Database tables rarely exist on their own. Most of the time they are connected with one or more tables through various relationships. Primarily, there are three types of relationship that can exist between tables:

One-to-One relationship
One-to-Many relationship
Many-to-Many relationship

Let's see how we can define these relationships in SQLAlchemy.

One-to-Many Relationship #

Two tables are related via a one-to-many relationship if a row in the first table is related to one or more rows in the second table. In the figure below, a one-to-many relationship exists between the users table and the posts table.

To create a one-to-many relationship pass a ForeignKey object containing the name of the referenced column, to the Column constructor function.

user = Table('users', metadata,
    Column('id', Integer(), primary_key=True),
    Column('user', String(200), nullable=False),
)

posts = Table('posts', metadata,
    Column('id', Integer(), primary_key=True),
    Column('post_title', String(200), nullable=False),
    Column('post_slug', String(200),  nullable=False),
    Column('content', Text(),  nullable=False),
    Column('user_id', ForeignKey("users.id")),
)

In the above code, we are defining a foreign key on the user_id column of the posts table. It means that the user_id column can only contain values from the id column of the users table.

Instead of passing column name as string we could also pass the Column object directly to the ForeignKey constructor. For example:

user = Table('users', metadata,
    Column('id', Integer(), primary_key=True),
    Column('user', String(200), nullable=False),
)

posts = Table('posts', metadata,
    Column('id', Integer(), primary_key=True),
    Column('post_title', String(200), nullable=False),
    Column('post_slug', String(200),  nullable=False),
    Column('content', Text(),  nullable=False),
    Column('user_id', Integer(), ForeignKey(user.c.id)),
)

The user.c.id refers to the id column of the users table. In doing so, remember that the definition of the referenced column (user.c.id) must come before the referencing column (posts.c.user_id).

One-to-One Relationship #

Two tables are related via one-to-one relationship if a row in the first table is related to only one row in the second table. In the figure below, a one-to-one relationship exists between the employees table and the employee_details table. The employees table contains employees records which are public whereas the employee_details contains records which are private.

employees = Table('employees', metadata,
    Column('employee_id', Integer(), primary_key=True),
    Column('first_name', String(200), nullable=False),
    Column('last_name', String(200), nullable=False),
    Column('dob', DateTime(), nullable=False),
    Column('designation', String(200), nullable=False),
)

employee_details = Table('employee_details', metadata,
    Column('employee_id', ForeignKey('employees.employee_id'), primary_key=True, ),
    Column('ssn', String(200), nullable=False),
    Column('salary', String(200), nullable=False),
    Column('blood_group', String(200), nullable=False),
    Column('residential_address', String(200), nullable=False),    
)

To establish a one-to-one relationship, we define a primary key and foreign key on the same column in the employee_details table.

Many-to-Many Relationship #

Two tables are related via a many-to-many relationship if a row in the first table is related to one or more row in the second table. In addition to that, a row in the second table can be related to one or more table in the first table. To define a many-to-many relationship we use an association table. In the figure below, a many-to-many relationship exists between the posts and the tags table.

posts = Table('posts', metadata,
    Column('id', Integer(), primary_key=True),
    Column('post_title', String(200), nullable=False),
    Column('post_slug', String(200),  nullable=False),
    Column('content', Text(),  nullable=False),    
)

tags = Table('tags', metadata,
    Column('id', Integer(), primary_key=True),
    Column('tag', String(200), nullable=False),
    Column('tag_slug', String(200),  nullable=False),    
)

post_tags = Table('post_tags', metadata,
    Column('post_id', ForeignKey('posts.id')),
    Column('tag_id', ForeignKey('tags.id'))
)

You might have noticed that the process of defining the relationship is almost identical to the way we define a relationship in SQL. This is because we using SQLAlchemy Core and the Core lets you do things in the same way as you would do in SQL.

Defining Constraint at the Table level #

In the preceding sections, we have seen how to add constraints and indexes to a column by passing additional arguments to the Column constructor function. It turns out that just as in SQL, we can define constraints and index at the table level. The following table lists some common constraints and the name of the class to create them:

Constraint/Indexes	Class Name
Primary Key Constraint	`PrimaryKeyConstraint`
Foreign Key Constraint	`ForeignKeyConstraint`
Unique Constraint	`UniqueConstraint`
Check Constraint	`CheckConstraint`
Index	`Index`

We can access these classes either from sqlalchemy.schema or sqlalchemy package. Here are some examples of how to use them:

Adding Primary Key Constraint using PrimaryKeyConstraint #

parent = Table('parent', metadata,
    Column('acc_no', Integer()),
    Column('acc_type', Integer(), nullable=False),
    Column('name', String(16), nullable=False),
    PrimaryKeyConstraint('acc_no', name='acc_no_pk')
)

Here we are creating a primary key on the acc_no column. This above code is equivalent to the following:

parent = Table('parent', metadata,
    Column('acc_no', Integer(), primary=True),
    Column('acc_type', Integer(), nullable=False),
    Column('name', String(16), nullable=False),   
)

Mainly, the PrimaryKeyConstraint is used to define a composite primary key (a primary key spanning more than one columns). For example:

parent = Table('parent', metadata,
    Column('acc_no', Integer, nullable=False),
    Column('acc_type', Integer, nullable=False),
    Column('name', String(16), nullable=False),
    PrimaryKeyConstraint('acc_no', 'acc_type', name='uniq_1')
)

This code is equivalent to the following:

parent = Table('parent', metadata,
    Column('acc_no', Integer, nullable=False, primary_key=True),
    Column('acc_type', Integer, nullable=False, primary_key=True),
    Column('name', String(16), nullable=False),   
)

Creating Foreign Key using ForeignKeyConstraint #

parent = Table('parent', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(16), nullable=False)
)

child = Table('child', metadata,
    Column('id', Integer, primary_key=True),
    Column('parent_id', Integer, nullable=False),
    Column('name', String(40), nullable=False),
    ForeignKeyConstraint(['parent_id'],['parent.id'])
)

Here we are creating a Foreign Key on the parent_id column which references id column of the parent table. The above code is equivalent to the following:

parent = Table('parent', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(16), nullable=False)
)

child = Table('child', metadata,
    Column('id', Integer, primary_key=True),
    Column('parent_id', ForeignKey('parent.id'), nullable=False),
    Column('name', String(40), nullable=False),   
)

The real utility of ForeignKeyConstraint comes into play when you want to define composite foreign key (a foreign key which consists of more than one columns). For example:

parent = Table('parent', metadata,
    Column('id', Integer, nullable=False),
    Column('ssn', Integer, nullable=False),
    Column('name', String(16), nullable=False),    
    PrimaryKeyConstraint('id', 'ssn', name='uniq_1')
)

child = Table('child', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(40), nullable=False),
    Column('parent_id', Integer, nullable=False),
    Column('parent_ssn', Integer, nullable=False),
    ForeignKeyConstraint(['parent_id','parent_ssn'],['parent.id', 'parent.ssn'])
)

Note that passing ForeignKey object to individual columns would not create a composite foreign key, instead, it would create separate foreign keys.

Creating Unique Constraint using UniqueConstraint #

parent = Table('parent', metadata,
    Column('id', Integer, primary_key=True),
    Column('ssn', Integer, nullable=False),
    Column('name', String(16), nullable=False),
    UniqueConstraint('ssn', name='unique_ssn')
)

Here we are defining unique constraint on the ssn column. The optional name keyword argument is used to provide a name to the unique constraint. The above code is equivalent to the following:

parent = Table('parent', metadata,
    Column('id', Integer, primary_key=True),
    Column('ssn', Integer, unique=True, nullable=False),
    Column('name', String(16), nullable=False),    
)

The UniqueConstraint is commonly used to define unique constraint containing multiple columns. For example:

parent = Table('parent', metadata,
    Column('acc_no', Integer, primary_key=True),
    Column('acc_type', Integer, nullable=False),
    Column('name', String(16), nullable=False),
    UniqueConstraint('acc_no', 'acc_type', name='uniq_1')
)

Here, I define a unique constraint on acc_no and acc_type, as a result, acc_no and acc_type, when taken together must be unique.

Creating Check Constraint with CheckConstraint #

A CHECK constraint allows us to define a condition that will be evaluated while inserting or updating data. If the condition evaluates to true then the data is saved into the database. Otherwise, an error is raised.

We can add CHECK constraint using the CheckConstraint construct.

employee = Table('employee', metadata,
    Column('id', Integer(), primary_key=True),
    Column('name', String(100), nullable=False),
    Column('salary', Integer(), nullable=False),
    CheckConstraint('salary < 100000', name='salary_check')
)

Creating indexes using Index #

The index keyword argument which we learned in earlier in this lesson allows us to add index on per column basis. Another way to define index is to use the Index construct. For example:

a_table = Table('a_table', metadata,
    Column('id', Integer(), primary_key=True),
    Column('first_name', String(100), nullable=False),
    Column('middle_name', String(100)),
    Column('last_name', String(100), nullable=False),
    Index('idx_col1', 'first_name')  
)

Here we are creating an index on the first_name column. This code is equivalent to the following:

a_table = Table('a_table', metadata,
    Column('id', Integer(), primary_key=True),
    Column('first_name', String(100), nullable=False, index=True),
    Column('middle_name', String(100)),
    Column('last_name', String(100), nullable=False),    
)

If your queries involved searching through a particular set of fields then you might get a performance boost by creating a composite index (i.e. an index on multiple columns), which is the primary purpose of Index. Here is an example:

a_table = Table('a_table', metadata,
    Column('id', Integer(), primary_key=True),
    Column('first_name', String(100), nullable=False),
    Column('middle_name', String(100)),
    Column('last_name', String(100), nullable=False),
    Index('idx_col1', 'first_name', 'last_name')  
)

Accessing Tables and Columns from MetaData #

Recall that the MetaData object holds all the information about the database and the tables it contains. We can access the table objects, using the following two attributes.

Attribute	Description
`tables`	returns a dictionary-type object called `immutabledict` with table name as key and the corresponding `Table` object as value.
`sorted_tables`	returns a list of `Table` object sorted in order of foreign key dependency. In other words, the tables which have dependencies is placed first before the actual dependencies. For example, if `posts` table has a foreign key which refers to the `id` column of the `users` table, then the `users` table is placed first followed by the `posts` table.

The following script shows these two attributes in action:

from sqlalchemy import create_engine, MetaData, Table, Integer, String, Column, Text, DateTime, Boolean, ForeignKey

metadata = MetaData()

user = Table('users', metadata,
    Column('id', Integer(), primary_key=True),
    Column('user', String(200), nullable=False),
)

posts = Table('posts', metadata,
    Column('id', Integer(), primary_key=True),
    Column('post_title', String(200), nullable=False),
    Column('post_slug', String(200),  nullable=False),
    Column('content', Text(),  nullable=False),
    Column('user_id', Integer(), ForeignKey("users.id")),
)

for t in metadata.tables:
    print(metadata.tables[t])

print('-------------')  

for t in metadata.sorted_tables:
    print(t.name) # print table name

Expected Output:

posts
users
-------------
users
posts

Once we have access to the Table instance we can can easily access any relevant detail about the columns as follows:

print(posts.columns)         # return a list of columns
print(posts.c)               # same as posts.columns
print(posts.foreign_keys)    # returns a set containing foreign keys on the table
print(posts.primary_key)     # returns the primary key of the table
print(posts.metadata)        # get the MetaData object from the table
print(posts.columns.post_title.name)     # returns the name of the column
print(posts.columns.post_title.type)     # returns the type of the column

Expected Output:

['posts.id', 'posts.post_title', 'posts.post_slug', 'posts.content', 'posts.user_id']
['posts.id', 'posts.post_title', 'posts.post_slug', 'posts.content', 'posts.user_id']
{ForeignKey('users.id')}
PrimaryKeyConstraint(Column('id', Integer(), table=<posts>, primary_key=True, nullable=False))
MetaData(bind=None)
post_title
VARCHAR(200)

Creating Tables #

To create the tables stored in the MetaData instance call MetaData.create_all() method with the Engine object.

metadata.create_all(engine)

The create_all() method only creates table if it doesn't already exist in the database. That means you can call create_all() safely multiple times. Note that calling create_all() method after changing the table definition will not alter the table schema. To do that, we can use a database migration tool called Alembic. We will learn more about it in the future lesson.

We can also drop all the tables in the database using MetaData.drop_all() method.

In this tutorial we will be working with the database of an e-commerce application. The database consists of following 4 tables:

customers table stores all the information about the customer. It consists of following columns:
- id - primary key
- first_name - first name of customer
- last_name - last name of customer
- username - a unique username
- email - a unique email
- address - customer address
- town - customer town name
- created_on - date and time of account creation
- updated_on - date and time the account was last updated
items table stores information about products. It consists of following columns:
- id - primary key
- name - item name
- cost_price - cost price of item
- selling_price - selling price of item
- quantity - quantity of item in the stock
orders stores information about the orders made by the customers. It consists of following columns:
- id - primary key
- customer_id - foreign key to id column in the customers table
- date_placed - date and time the order was placed
- date_shipped - date and time the order was shipped
order_lines stores details of items in the each order. It consists of following columns:
- id - primary key
- order_id - foreign key to id column in the orders table
- item_id - foreign key to id column in the items table
- quantity - quantity of item ordered

The following is an ER diagram of the database.

Here is the complete code to create these tables.

from sqlalchemy import create_engine, MetaData, Table, Integer, String, \
    Column, DateTime, ForeignKey, Numeric, CheckConstraint

from datetime import datetime

metadata = MetaData()

engine = create_engine("postgres+psycopg2://postgres:pass@localhost/sqlalchemy_tuts")

customers = Table('customers', metadata,
    Column('id', Integer(), primary_key=True),
    Column('first_name', String(100), nullable=False),
    Column('last_name', String(100), nullable=False),
    Column('username', String(50), nullable=False),
    Column('email', String(200), nullable=False),
    Column('address', String(200), nullable=False),
    Column('town', String(50), nullable=False),
    Column('created_on', DateTime(), default=datetime.now),
    Column('updated_on', DateTime(), default=datetime.now, onupdate=datetime.now)
)


items = Table('items', metadata,
    Column('id', Integer(), primary_key=True),
    Column('name', String(200), nullable=False),
    Column('cost_price', Numeric(10, 2), nullable=False),
    Column('selling_price', Numeric(10, 2),  nullable=False),
    Column('quantity', Integer(), nullable=False),
    CheckConstraint('quantity > 0', name='quantity_check')
)


orders = Table('orders', metadata,
    Column('id', Integer(), primary_key=True),
    Column('customer_id', ForeignKey('customers.id')),
    Column('date_placed', DateTime(), default=datetime.now),
    Column('date_shipped', DateTime())
)


order_lines = Table('order_lines', metadata,
    Column('id', Integer(), primary_key=True),
    Column('order_id', ForeignKey('orders.id')),
    Column('item_id', ForeignKey('items.id')),
    Column('quantity', Integer())
)


metadata.create_all(engine)

In the next lesson, we will learn how to perform CRUD operations on the database using SQL Expression Language.

Load Comments