Database Migration: Schema Changes Without Downtime

Database migration is the process of evolving a database schema from one version to another. It involves applying incremental changes using migration scripts that can be rolled forward or backward.

Database Migration: Schema Changes Without Downtime

Database migration is the process of evolving a database schema from one version to another in a controlled, repeatable, and version-controlled manner. As applications grow and change, database schemas must evolve too. New features require new tables, columns, indexes, or constraints. Bug fixes may require data backfills or schema corrections. Database migration tools automate these changes, ensuring that all environments (development, staging, production) have consistent schemas and that changes are applied safely without data loss.

Without proper migration practices, teams resort to manual SQL scripts applied by hand, leading to inconsistent environments, failed deployments, and production outages. Modern migration tools treat database schema changes as code, stored in version control, and applied as part of the deployment pipeline. To understand migrations properly, it is helpful to be familiar with SQL basics, relational database design, and CI/CD pipelines.

Database migration overview:

┌─────────────────────────────────────────────────────────────┐
│                   Database Migration                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Version Control → Migration Scripts → Apply → Database     │
│                                                              │
│  Migration Types:                                            │
│  • Schema changes (CREATE, ALTER, DROP)                      │
│  • Data migrations (backfills, transformations)              │
│  • Reference data (seed data, lookup tables)                 │
│                                                              │
│  Migration Strategies:                                       │
│  • Forward-only (migrations only go up)                      │
│  • Reversible (migrations can go up and down)                │
│  • Zero-downtime (schema changes without locking)            │
│                                                              │
└─────────────────────────────────────────────────────────────┘

What Is Database Migration

Database migration is the controlled process of applying incremental changes to a database schema. Each migration represents a single, atomic change that transitions the database from one version to the next. Migrations are stored as scripts in version control, applied in order, and tracked in a special schema versions table that records which migrations have been applied.

Migration Script: A SQL file containing the changes to apply (e.g., CREATE TABLE, ALTER TABLE, INSERT).
Schema Versions Table: A table (usually `schema_migrations`) that tracks which migrations have been applied.
Forward Migration (Up): Applies a change to move the schema forward (e.g., add a column).
Rollback Migration (Down): Reverts a change to move the schema backward (e.g., drop a column).
Seed Data: Initial data inserted into tables after schema creation.

Why Database Migration Matters

Database migrations are essential for modern software development practices, enabling teams to evolve databases safely and consistently.

Consistency Across Environments: All environments (dev, staging, production) have the same schema version.
Version Control: Schema changes are stored in Git alongside application code, enabling code review and audit trails.
Automated Deployments: Migrations run automatically as part of CI/CD pipelines, eliminating manual steps.
Rollback Capability: Reversible migrations allow safe rollbacks if a deployment fails.
Team Collaboration: Multiple developers can work on schema changes without conflicts.
Zero-Downtime Deployments: Careful migration design allows schema changes without application downtime.

Migration File Naming Conventions

Common naming patterns:

# Timestamp-based (most common)
20240115103000_create_users_table.sql
20240115120000_add_email_to_users.sql
20240115150000_create_orders_table.sql

# Sequential numbers
001_create_users_table.sql
002_add_email_to_users.sql
003_create_orders_table.sql

# Semantic versioning
v1.0.0__create_users_table.sql
v1.1.0__add_email_to_users.sql
v2.0.0__create_orders_table.sql

Popular Migration Tools

Tool	Language	Features	Best For
Flyway	Java (but language-agnostic)	SQL-based migrations, repeatable migrations, callbacks, Java-based migrations	Enterprise, Java shops, any language
Liquibase	Java	XML/YAML/JSON/SQL formats, context-aware, rollback support	Complex migration needs, enterprise
Alembic (Python)	Python	SQLAlchemy integration, autogenerate from models, branch support	Python applications (Flask, Django)
goose (Go)	Go	Simple, SQL-based, embeddable	Go applications
Knex.js (Node.js)	JavaScript	Schema builder, seed files, CLI	Node.js applications
Doctrine Migrations (PHP)	PHP	Diff generation, version control, namespace support	PHP applications (Symfony, Laravel)

Migration Examples by Tool

Flyway (SQL-based)

-- V1__create_users_table.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- V2__add_status_to_users.sql
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- V3__create_orders_table.sql
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL REFERENCES users(id),
    total DECIMAL(10,2) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

# Apply migrations
flyway migrate

# Check status
flyway info

Liquibase (XML format)

<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                   http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.0.xsd">

    <changeSet id="1" author="dev">
        <createTable tableName="users">
            <column name="id" type="int" autoIncrement="true">
                <constraints primaryKey="true"/>
            </column>
            <column name="name" type="varchar(100)">
                <constraints nullable="false"/>
            </column>
            <column name="email" type="varchar(255)">
                <constraints unique="true" nullable="false"/>
            </column>
        </createTable>
    </changeSet>

    <changeSet id="2" author="dev">
        <addColumn tableName="users">
            <column name="status" type="varchar(20)" defaultValue="active"/>
        </addColumn>
    </changeSet>

</databaseChangeLog>

Alembic (Python)

# alembic/versions/001_create_users_table.py
from alembic import op
import sqlalchemy as sa

def upgrade():
    op.create_table(
        'users',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('name', sa.String(100), nullable=False),
        sa.Column('email', sa.String(255), nullable=False, unique=True),
        sa.Column('created_at', sa.DateTime(), server_default=sa.text('CURRENT_TIMESTAMP')),
        sa.PrimaryKeyConstraint('id')
    )

def downgrade():
    op.drop_table('users')

Zero-Downtime Migration Strategies

Zero-downtime migrations allow schema changes without locking tables or causing application errors. They require careful planning and often involve multiple deployment phases.

1. Add Column (with Default)

-- Safe: Add column with default value (not nullable)
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- Risky: Add NOT NULL without default (fails if rows exist)
ALTER TABLE users ADD COLUMN status VARCHAR(20) NOT NULL;  -- BAD

2. Add Column (Nullable First, Then Backfill)

-- Step 1: Add nullable column
ALTER TABLE users ADD COLUMN timezone VARCHAR(50);

-- Step 2: Backfill data in batches (application or migration script)
UPDATE users SET timezone = 'UTC' WHERE timezone IS NULL LIMIT 1000;

-- Step 3: Make column NOT NULL after backfill
ALTER TABLE users ALTER COLUMN timezone SET NOT NULL;

3. Rename Column (Phases)

-- Phase 1: Add new column, keep old
ALTER TABLE users ADD COLUMN full_name VARCHAR(200);
UPDATE users SET full_name = name;

-- Phase 2: Deploy application that writes to both columns
-- Application code writes to both name and full_name

-- Phase 3: Stop writing to old column, deploy read from new only

-- Phase 4: Drop old column
ALTER TABLE users DROP COLUMN name;

4. Create Index Without Locking

-- PostgreSQL: CONCURRENTLY (does not block writes)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- MySQL: ALGORITHM=INPLACE, LOCK=NONE
ALTER TABLE users ADD INDEX idx_email (email), ALGORITHM=INPLACE, LOCK=NONE;

-- SQL Server: ONLINE = ON
CREATE INDEX idx_users_email ON users(email) WITH (ONLINE = ON);

5. Drop Column (Safe)

-- Step 1: Stop writing to column (application change)
-- Step 2: Deploy application that ignores the column
-- Step 3: Drop column after confirming no reads/writes
ALTER TABLE users DROP COLUMN deprecated_field;

Data Migration (Backfills)

Data migrations transform existing data when schemas change. They often involve processing large tables in batches to avoid long-running transactions.

Batch backfill example (PostgreSQL):

-- Add column first
ALTER TABLE orders ADD COLUMN total_with_tax DECIMAL(10,2);

-- Batch backfill (run multiple times)
UPDATE orders
SET total_with_tax = total * 1.10
WHERE total_with_tax IS NULL
LIMIT 1000;

-- After backfill complete, make NOT NULL
ALTER TABLE orders ALTER COLUMN total_with_tax SET NOT NULL;

Complex data transformation:

-- Split name column into first_name and last_name
ALTER TABLE users ADD COLUMN first_name VARCHAR(50);
ALTER TABLE users ADD COLUMN last_name VARCHAR(50);

-- Data transformation
UPDATE users
SET
    first_name = SPLIT_PART(name, ' ', 1),
    last_name = SUBSTRING(name FROM POSITION(' ' IN name) + 1)
WHERE name LIKE '% %';

-- After validation, drop old column
ALTER TABLE users DROP COLUMN name;

Migration Best Practices

Keep Migrations Small: Each migration should do one thing. Small migrations are easier to review, test, and rollback.
Make Migrations Reversible: Always write a down migration (rollback) for each up migration.
Test Migrations: Test migrations on a copy of production data before applying to production.
Use Idempotent Migrations: Migrations should be safe to run multiple times (using IF EXISTS, IF NOT EXISTS).
Never Edit Committed Migrations: Once a migration is committed, create a new migration for changes.
Backup Before Migrations: Always have a recent backup before running migrations on production.
Monitor Migration Duration: Long-running migrations can cause downtime. Use batched operations.
Run Migrations During Low Traffic: Schedule migrations during off-peak hours when possible.

Idempotent migration example:

-- Safe: Can be run multiple times
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- Safe: Add column if not exists
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'users' AND column_name = 'email') THEN
        ALTER TABLE users ADD COLUMN email VARCHAR(255);
    END IF;
END $$;

-- Safe: Create index if not exists
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users(email);

Common Migration Mistakes to Avoid

Locking Large Tables: Adding columns with defaults or creating indexes without CONCURRENTLY locks tables.
Non-Reversible Migrations: Migrations without rollback scripts make it impossible to revert failed deployments.
Editing Committed Migrations: Changing a migration after it has been applied causes inconsistency between environments.
Long-Running Transactions: Large data migrations in single transactions cause locks and replication lag.
No Backup Before Migration: Migration failures can corrupt data. Always have a recent backup.
Deploying App and Schema Simultaneously: Schema changes must be backward compatible with the previous application version.
Not Testing Rollbacks: Test that rollback scripts work before relying on them in production.

Integrating Migrations with CI/CD

GitHub Actions example (Flyway):

name: Database Migration

on:
  push:
    branches: [ main ]
    paths:
      - 'migrations/**'

jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Run Flyway migrations
        uses: flyway/flyway-action@v1
        with:
          url: ${{ secrets.DB_URL }}
          user: ${{ secrets.DB_USER }}
          password: ${{ secrets.DB_PASSWORD }}
          locations: filesystem:./migrations

GitLab CI example:

# .gitlab-ci.yml
migrate-database:
  stage: deploy
  script:
    - flyway -url=$DB_URL -user=$DB_USER -password=$DB_PASSWORD migrate
  only:
    - main
  environment: production

Frequently Asked Questions

What is the difference between migration and seeding?
Migrations change database schema (structure). Seeding populates tables with initial data (content). Both are often used together.
Can I roll back a migration?
Yes, if you have a down migration script. Most migration tools support rollback commands (flyway undo, liquibase rollback).
What happens if a migration fails?
Most migration tools track which migrations succeeded. Failed migrations are not marked as applied. Fix the issue and rerun.
How do I handle migrations in a team?
Use version control, sequential naming, and never edit committed migrations. If two developers create migrations, resolve conflicts like code conflicts.
Should I run migrations before or after deploying application code?
Run migrations before deploying new code. The new code expects the new schema. Ensure migrations are backward compatible with the old code.
What should I learn next after database migrations?
After mastering migrations, explore zero-downtime deployment strategies, database testing, CI/CD pipelines, and database version control for complete database lifecycle management.

Database Migration: Schema Changes Without Downtime