Primary Key and Foreign Key: Understanding Database Keys
A primary key uniquely identifies each row in a table. A foreign key is a column that references the primary key of another table, creating a relationship between tables and enforcing referential integrity.
Primary Key and Foreign Key: Understanding Database Keys
Primary keys and foreign keys are the fundamental building blocks of relational database design. A primary key uniquely identifies each row in a table, ensuring that every record can be referenced individually. A foreign key creates a link between tables, referencing the primary key of another table to establish relationships and enforce data integrity. Without keys, relational databases would simply be collections of unrelated tables.
Understanding keys is essential for anyone working with databases. They determine how tables relate to each other, how data can be joined, and what constraints the database can enforce automatically. To understand keys properly, it is helpful to be familiar with database basics, relational database design, and entity relationship diagrams.
┌─────────────────────────────────────────────────────────────────┐
│ PRIMARY KEY │
│ • Uniquely identifies each row │
│ • Cannot be NULL │
│ • Only one per table │
│ • Creates clustered index by default (in most databases) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ FOREIGN KEY │
│ • References primary key of another table │
│ • Can be NULL (optional relationship) │
│ • Multiple per table allowed │
│ • Enforces referential integrity │
└─────────────────────────────────────────────────────────────────┘
Example:
customers (customer_id PRIMARY KEY)
│
│ FOREIGN KEY (customer_id)
▼
orders (order_id PRIMARY KEY, customer_id)
What Is a Primary Key
A primary key is a column or set of columns that uniquely identifies each row in a table. Every table should have a primary key. The primary key guarantees that rows are unique and provides a stable way to reference specific records.
- Uniqueness: No two rows can have the same primary key value.
- Non-NULL: Primary key columns cannot contain NULL values.
- Immutable: Primary key values should never change after they are assigned.
- Single per table: A table can have only one primary key (though it can consist of multiple columns).
- Indexed: Primary keys are automatically indexed for fast lookups.
-- Single column primary key (most common)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255)
);
-- Composite primary key (multiple columns)
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- Natural primary key (using business value)
CREATE TABLE countries (
country_code CHAR(2) PRIMARY KEY, -- ISO country code
country_name VARCHAR(100)
);
Types of Primary Keys
| Type | Description | Example | Pros | Cons |
|---|---|---|---|---|
| Surrogate Key | Artificial key with no business meaning | Auto-incrementing integer, UUID | Never changes, simple, fast | No business meaning, extra column |
| Natural Key | Key derived from business data | Email address, product code, ISBN | Has meaning, no extra column | Can change over time, may not be unique |
| Composite Key | Key made of multiple columns | (order_id, product_id) in order_items | Natural for junction tables | More complex joins, larger indexes |
-- Surrogate key (recommended for most tables)
CREATE TABLE users (
user_id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) UNIQUE,
name VARCHAR(100)
);
-- Natural key (avoid when values can change)
CREATE TABLE users (
email VARCHAR(255) PRIMARY KEY, -- Email can change!
name VARCHAR(100)
);
-- Problem: If a user changes email, all foreign keys referencing it must update
What Is a Foreign Key
A foreign key is a column or set of columns that references the primary key of another table. Foreign keys create relationships between tables and enforce referential integrity, ensuring that a value in one table exists in another.
- Referential Integrity: Ensures that a foreign key value always matches an existing primary key value in the referenced table.
- Can be NULL: Foreign keys can be NULL to represent optional relationships.
- Multiple per table: A table can have many foreign keys, each referencing a different table.
- Not automatically indexed: Foreign keys should be indexed manually for join performance.
- Self-referencing: A foreign key can reference the same table (hierarchical data).
-- Basic foreign key
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
-- Foreign key with ON DELETE action
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE CASCADE
);
-- Composite foreign key
CREATE TABLE order_items (
order_id INT,
product_id INT,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id),
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Referential Integrity Actions
When a referenced row is updated or deleted, foreign key constraints define what happens to the referencing rows.
| Action | Description | Use Case |
|---|---|---|
| CASCADE | Delete or update child rows when parent changes | OrderItems should be deleted when Order is deleted |
| SET NULL | Set foreign key to NULL when parent is deleted | Optional relationship where child can exist without parent |
| RESTRICT / NO ACTION | Prevent deletion if child rows exist | Default behavior, prevents orphaned records |
| SET DEFAULT | Set foreign key to default value when parent is deleted | Less common, used with specific default values |
-- CASCADE: Delete order items when order is deleted
FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE
-- SET NULL: Set customer_id to NULL if customer is deleted
FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE SET NULL
-- RESTRICT: Prevent deletion of customer who has orders (default)
FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE RESTRICT
Primary Key vs Foreign Key
| Feature | Primary Key | Foreign Key |
|---|---|---|
| Purpose | Uniquely identifies each row | Links tables and enforces relationships |
| Uniqueness | Must be unique | Can have duplicate values |
| NULL values | Not allowed | Allowed (optional relationships) |
| Number per table | Only one | Many |
| Index | Automatically indexed | Not automatically indexed (index manually) |
| Can change | Should never change | Can change (with CASCADE) |
| References | References no other table | References primary key of another table |
Self-Referencing Foreign Keys
A self-referencing foreign key references the primary key of the same table. This is useful for hierarchical data like organizational charts or category trees.
-- Employee table with manager relationship (self-referencing)
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT,
FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
);
-- Data example:
-- employee_id: 1, name: 'CEO', manager_id: NULL
-- employee_id: 2, name: 'VP Eng', manager_id: 1
-- employee_id: 3, name: 'Developer', manager_id: 2
-- Query to get employee hierarchy
SELECT e1.name AS employee, e2.name AS manager
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.employee_id;
Composite Keys (Primary and Foreign)
A composite key uses multiple columns to uniquely identify a row. Composite foreign keys must reference composite primary keys with matching column order and data types.
-- Composite primary key
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
-- Composite foreign key (must match column order)
CREATE TABLE shipment_items (
shipment_id INT,
order_id INT,
product_id INT,
quantity_shipped INT,
PRIMARY KEY (shipment_id, order_id, product_id),
FOREIGN KEY (order_id, product_id)
REFERENCES order_items(order_id, product_id)
);
Choosing a Primary Key Strategy
- Use Surrogate Keys by Default: Auto-incrementing integers or UUIDs are recommended for most tables. They never change and are simple to work with.
- Avoid Natural Keys That Can Change: Emails, usernames, and addresses change over time. Using them as primary keys causes cascade updates.
- Use Natural Keys Only for Stable, Unique Values: ISO country codes, product SKUs that never change, or ISBN numbers are good candidates.
- Use Composite Keys for Junction Tables: Many-to-many relationship tables naturally use composite keys of the two foreign keys.
- Consider UUID for Distributed Systems: When multiple databases generate IDs independently, UUIDs avoid collisions.
-- PostgreSQL
CREATE TABLE users (
user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(100)
);
-- MySQL
CREATE TABLE users (
user_id CHAR(36) PRIMARY KEY DEFAULT (UUID()),
name VARCHAR(100)
);
-- Insert without specifying ID (auto-generated)
INSERT INTO users (name) VALUES ('John Doe');
Common Key-Related Mistakes to Avoid
- No Primary Key: Tables without primary keys cannot be reliably referenced and may contain duplicate rows.
- Using Natural Keys That Change: Email addresses as primary keys cause problems when users change emails.
- Missing Foreign Key Indexes: Foreign key columns used in JOINs need indexes for performance.
- Incorrect ON DELETE Action: Using CASCADE when you meant RESTRICT can accidentally delete important data.
- Using VARCHAR for Primary Keys: String primary keys are larger and slower than integers.
- Composite Key Column Order: Order matters for both indexing and foreign key references.
Best Practices for Keys
- Every Table Should Have a Primary Key: No exceptions. Even junction tables need composite primary keys.
- Use Surrogate Keys for Most Tables: Auto-increment integers are simple, fast, and never change.
- Index Foreign Keys: Foreign key columns used in JOINs should have indexes.
- Use Meaningful Constraint Names: Name your foreign key constraints so they are identifiable in error messages.
- Be Consistent: Use the same primary key strategy across your database (e.g., all INT or all UUID).
- Document Key Decisions: Document why you chose certain natural keys or composite keys.
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
CONSTRAINT fk_orders_customers
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE RESTRICT
);
-- Error message will reference 'fk_orders_customers'
Frequently Asked Questions
- Can a table have multiple primary keys?
No. A table can have only one primary key. However, that primary key can consist of multiple columns (composite primary key). - Can a foreign key reference a non-primary key?
Yes, but only if that column has a UNIQUE constraint. Foreign keys typically reference primary keys, but they can reference any unique column. - What is the difference between PRIMARY KEY and UNIQUE?
Both enforce uniqueness, but a table can have only one PRIMARY KEY. UNIQUE constraints allow NULL values (depending on database), while PRIMARY KEY does not. - Should I use INT or UUID for primary keys?
INT is smaller and faster, suitable for most applications. UUID is better for distributed systems where multiple databases generate IDs independently or when you need to prevent ID guessing. - What happens if I delete a row that is referenced by a foreign key?
It depends on the ON DELETE action. CASCADE deletes child rows. SET NULL sets foreign keys to NULL. RESTRICT prevents deletion. - What should I learn next after primary and foreign keys?
After mastering keys, explore relational database design, entity relationship diagrams, database normalization, and SQL JOINs for complete database mastery.
