Database Backup and Recovery: Strategies for Data Protection
Database backup and recovery involves creating copies of database data and logs to enable restoration after data loss. Strategies include full backups, differential backups, transaction log backups, and point-in-time recovery.
Database Backup and Recovery: Strategies for Data Protection
Database backup and recovery is the process of creating copies of database data and transaction logs to enable restoration after data loss, corruption, or disasters. Backups are the last line of defense against hardware failure, human error, software bugs, ransomware attacks, and natural disasters. Without reliable backups, data loss can be permanent and catastrophic for any organization.
A backup strategy defines what data to back up, how often, where to store backups, how long to retain them, and how to restore when needed. Recovery is the process of restoring data from backups. The two are inseparable. To understand backup and recovery properly, it is helpful to be familiar with database transactions, database replication, and disaster recovery.
┌─────────────────────────────────────────────────────────────┐
│ Backup and Recovery │
├─────────────────────────────────────────────────────────────┤
│ │
│ Backup Types: Recovery Metrics: │
│ • Full Backup • RPO (Recovery Point Objective) │
│ • Differential Backup • RTO (Recovery Time Objective) │
│ • Incremental Backup │
│ • Transaction Log Backup │
│ │
│ Storage Locations: Recovery Methods: │
│ • Local Disk • Full restore │
│ • External Drive • Point-in-time recovery │
│ • Cloud Storage • Page-level restore │
│ • Tape (Archive) • Piecemeal restore │
│ │
└─────────────────────────────────────────────────────────────┘
What Is Database Backup and Recovery
Database backup is the process of copying data from a database to a separate storage location to protect against data loss. Recovery is the process of restoring that data back to the database when needed. Together, they form a critical component of any data management strategy.
- Backup: A copy of database data and/or transaction logs at a specific point in time.
- Restore: The process of copying backup data back to the database.
- Recovery: The process of bringing a database to a consistent state after a failure.
- RPO (Recovery Point Objective): Maximum acceptable data loss measured in time. Determines backup frequency.
- RTO (Recovery Time Objective): Maximum acceptable downtime after a failure. Determines restore speed requirements.
- Point-in-Time Recovery (PITR): Restoring to a specific moment before a failure occurred.
Why Backup and Recovery Matters
Data is the most valuable asset for most organizations. Without proper backup and recovery, data loss can lead to financial ruin, legal liability, and permanent reputational damage.
- Hardware Failure: Disks crash, servers fail. Backups provide recovery when hardware fails.
- Human Error: Accidental deletions, incorrect updates, dropped tables. Backups undo mistakes.
- Ransomware and Malware: Malicious encryption of data. Clean backups are the only reliable recovery method.
- Software Bugs: Application bugs can corrupt data. Restore to a point before the corruption occurred.
- Natural Disasters: Fire, flood, earthquake. Offsite backups protect against site-wide disasters.
- Compliance Requirements: Regulations like GDPR, HIPAA, and PCI-DSS require data retention and recoverability.
- Audit and Legal Discovery: Historical data may be needed for investigations or legal proceedings.
Types of Backups
1. Full Backup
A full backup copies all data from the database. It is the most comprehensive but takes the most time and storage space. Restores are fastest because only one backup is needed.
# Using pg_dump (logical backup)
pg_dump mydb > mydb_full_backup.sql
# Using pg_basebackup (physical backup)
pg_basebackup -D /backup/full_backup -Ft -z
# MySQL full backup
mysqldump --all-databases > all_databases.sql
mysqldump mydb > mydb_full_backup.sql
# SQL Server full backup
BACKUP DATABASE mydb TO DISK = 'D:\backup\mydb_full.bak'
2. Differential Backup
A differential backup copies all data that has changed since the last full backup. It is smaller and faster than a full backup. Restore requires the last full backup plus the latest differential backup.
# SQL Server differential backup
BACKUP DATABASE mydb TO DISK = 'D:\backup\mydb_diff.bak' WITH DIFFERENTIAL
# PostgreSQL (using pg_basebackup with WAL archiving)
# Differential backups are typically managed via WAL archiving
3. Incremental Backup
An incremental backup copies only data that has changed since the last backup (full or incremental). It is the smallest and fastest backup type. Restore requires the full backup plus all subsequent incremental backups.
Backup Type | Size | Speed | Restore Complexity
─────────────────────────────────────────────────────
Full | Large| Slow | One file
Differential | Med | Med | Full + latest differential
Incremental | Small| Fast | Full + all incrementals
4. Transaction Log Backup
Transaction log backups capture all committed transactions since the last log backup. They enable point-in-time recovery (PITR), allowing restoration to any specific moment.
# SQL Server transaction log backup
BACKUP LOG mydb TO DISK = 'D:\backup\mydb_log.trn'
# PostgreSQL WAL archiving
archive_mode = on
archive_command = 'cp %p /backup/wal/%f'
# MySQL binary log (enables PITR)
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 7
Recovery Models
| Recovery Model | Description | Data Loss Risk | Storage Size |
|---|---|---|---|
| Simple | No transaction log backups. Only full/differential backups. | High (loss since last backup) | Low |
| Full | Full backups + transaction log backups. Supports point-in-time recovery. | Low (loss only of unlogged transactions) | High |
| Bulk-Logged (SQL Server) | Minimally logs bulk operations. Supports point-in-time recovery except bulk operations. | Medium | Medium |
-- Set to Full recovery model
ALTER DATABASE mydb SET RECOVERY FULL;
-- Set to Simple recovery model
ALTER DATABASE mydb SET RECOVERY SIMPLE;
-- Check current recovery model
SELECT name, recovery_model_desc FROM sys.databases;
Backup Strategies
Full + Differential + Transaction Log (Recommended for Production)
Schedule:
- Full backup: Weekly (Sunday 1 AM)
- Differential backup: Daily (Monday-Saturday 1 AM)
- Transaction log backup: Every 15-60 minutes
Restore scenario:
1. Restore full backup
2. Restore latest differential backup
3. Restore all transaction log backups after the differential
4. Recover to point of failure (point-in-time recovery)
RPO: 15-60 minutes (log backup frequency)
RTO: Hours (depending on data size)
Full + Transaction Log (For Critical Systems)
Schedule:
- Full backup: Daily (off-peak hours)
- Transaction log backup: Every 5-15 minutes
RPO: 5-15 minutes
RTO: Hours
Best for: Financial systems, e-commerce, healthcare
Full Only (For Small or Non-Critical Systems)
Schedule:
- Full backup: Daily
RPO: Up to 24 hours
RTO: Hours
Best for: Development databases, reporting databases, low-priority systems
Point-in-Time Recovery (PITR)
Point-in-time recovery allows restoring a database to a specific moment before a failure occurred. This is essential for recovering from accidental data deletion or corruption.
# PostgreSQL (using WAL archiving)
# Restore to specific time
pg_restore -Fc mydb_backup.dump | psql
# Apply WAL files up to desired time
# MySQL point-in-time recovery
mysqlbinlog --stop-datetime="2024-01-15 10:30:00" mysql-bin.000123 | mysql -u root -p
# SQL Server point-in-time recovery
RESTORE DATABASE mydb FROM DISK = 'D:\backup\mydb_full.bak' WITH NORECOVERY
RESTORE LOG mydb FROM DISK = 'D:\backup\mydb_log.trn' WITH STOPAT = '2024-01-15 10:30:00', RECOVERY
Backup Storage Locations
| Location | Pros | Cons | Best For |
|---|---|---|---|
| Same Server | Fastest restore, simple | Vulnerable to same failures (disk crash) | Temporary backups, not recommended |
| External Drive / NAS | Separate device, moderately fast | Vulnerable to same physical disaster | Small deployments, local redundancy |
| Cloud Storage (S3, GCS, Azure) | Durable, geographically redundant, scalable | Network latency for restore, ongoing cost | Most modern applications |
| Offsite Physical Media | Air-gapped, ransomware-resistant | Slow, manual process | Long-term archives, compliance |
| Backup as a Service (BaaS) | Managed, automated, easy setup | Cost, vendor lock-in | Teams without dedicated DBAs |
Backup Automation
#!/bin/bash
# /etc/cron.daily/postgres-backup
BACKUP_DIR="/backup/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
DB_NAME="mydb"
S3_BUCKET="s3://my-backups/postgres"
# Create backup
pg_dump $DB_NAME | gzip > $BACKUP_DIR/${DB_NAME}_${DATE}.sql.gz
# Upload to S3
aws s3 cp $BACKUP_DIR/${DB_NAME}_${DATE}.sql.gz $S3_BUCKET/
# Delete backups older than 30 days
find $BACKUP_DIR -name "*.sql.gz" -mtime +30 -delete
# Keep last 7 daily backups in S3 (lifecycle policy)
#!/bin/bash
# Daily full backup
BACKUP_DIR="/backup/mysql"
DATE=$(date +%Y%m%d)
mysqldump --all-databases --single-transaction --routines --triggers \
> $BACKUP_DIR/full_backup_$DATE.sql
# Compress
gzip $BACKUP_DIR/full_backup_$DATE.sql
# Binary log backup (for PITR)
mysqladmin flush-logs
cp /var/log/mysql/mysql-bin.* $BACKUP_DIR/binlogs/
Backup Validation and Testing
An untested backup is not a backup. Regular validation ensures backups are usable when needed.
□ Verify backup file integrity (checksums)
□ Test restore on a separate server
□ Verify data consistency after restore
□ Test application connectivity to restored database
□ Document restore time (RTO validation)
□ Test point-in-time recovery
□ Verify that all required objects are restored
#!/bin/bash
# Weekly restore test
TEST_DB="restore_test_$(date +%Y%m%d)"
# Restore backup to test database
pg_restore -C -d postgres -O -x backup.dump
# Run consistency checks
psql -d $TEST_DB -c "SELECT COUNT(*) FROM users;"
psql -d $TEST_DB -c "ANALYZE;"
# Run application smoke tests
./test_connection.sh $TEST_DB
# Drop test database
psql -c "DROP DATABASE $TEST_DB;"
Restore Procedures
Full Restore
# PostgreSQL
pg_restore -d mydb -C mydb_backup.dump
# MySQL
mysql -u root -p < mydb_backup.sql
# SQL Server
RESTORE DATABASE mydb FROM DISK = 'D:\backup\mydb_full.bak' WITH REPLACE
Point-in-Time Restore
# PostgreSQL (using WAL)
1. Restore base backup
2. Configure recovery.conf
3. Set recovery_target_time = '2024-01-15 10:30:00'
4. Start PostgreSQL
# MySQL
mysqlbinlog --stop-datetime="2024-01-15 10:30:00" binlog.000123 | mysql
Restore to Different Server
# PostgreSQL
pg_restore -h new-server -p 5432 -d mydb mydb_backup.dump
# MySQL
mysql -h new-server -u root -p < mydb_backup.sql
Common Backup Mistakes to Avoid
- No Offsite Backups: Local backups only are destroyed in site-wide disasters.
- Untested Backups: Backups that are never tested often fail when needed most.
- No Backup Monitoring: Failed backups go unnoticed without monitoring and alerts.
- Single Backup Copy: One backup is not enough. Follow 3-2-1 rule.
- No Transaction Log Backups: Without log backups, point-in-time recovery is impossible.
- Backup Encryption Missing: Unencrypted backups expose sensitive data if stolen.
- Too Long RPO: Infrequent backups cause unacceptable data loss.
- No Retention Policy: Backups kept forever consume storage; deleted too soon lose history.
Backup Best Practices
- Follow 3-2-1 Backup Rule: 3 copies of data, 2 different media types, 1 copy offsite.
- Automate Backups: Manual backups are forgotten. Automate with scripts and schedulers.
- Monitor Backup Success/Failure: Set up alerts for backup failures.
- Test Restores Regularly: Restore to a test environment monthly.
- Encrypt Backups: Use encryption at rest and in transit.
- Define Retention Policy: Daily for 7 days, weekly for 4 weeks, monthly for 12 months, yearly for 7 years.
- Document Restore Procedures: Write step-by-step restore documentation and test it.
- Use Compression: Compress backups to save storage and transfer time.
Copy 1: Primary database (original)
Copy 2: Local backup (external drive or NAS)
Copy 3: Offsite backup (cloud storage)
Media types:
- Primary: SSD/HDD
- Local backup: External HDD (different device)
- Offsite: Cloud storage (different media and location)
Frequently Asked Questions
- What is the difference between backup and replication?
Backup creates point-in-time copies for recovery. Replication continuously copies data for high availability and read scaling. Backups protect against logical corruption and human error; replication does not. - How often should I back up my database?
Frequency depends on RPO. Mission-critical data: log backups every 5-15 minutes. Business-critical: daily full + hourly logs. Non-critical: daily full only. - How long should I keep backups?
Typical retention: daily for 7 days, weekly for 4 weeks, monthly for 12 months, yearly for 7 years. Follow regulatory requirements for your industry. - What is the difference between logical and physical backups?
Logical backups (pg_dump, mysqldump) export SQL statements. Physical backups copy database files directly. Physical backups are faster for large databases; logical backups are more portable. - Can I back up a live database?
Yes, using hot backups. PostgreSQL uses pg_dump with --single-transaction. MySQL uses --single-transaction. SQL Server uses VSS. Always test that backups are consistent. - What should I learn next after backup and recovery?
After mastering backup and recovery, explore disaster recovery planning, database replication for HA, advanced PITR, and backup encryption for complete data protection.
