Types of Database Indexes: B-Tree, Hash, Bitmap, and Full-Text

Different database index types serve different purposes. B-Tree indexes are general-purpose, Hash indexes are optimal for equality searches, Bitmap indexes work well for low-cardinality columns, and Full-Text indexes enable text search.

Types of Database Indexes: B-Tree, Hash, Bitmap, and Full-Text

Not all database indexes are created equal. Different index types are optimized for different query patterns and data characteristics. Choosing the right index type for your specific workload can dramatically improve query performance, while choosing the wrong type can waste storage and provide no benefit. Understanding the strengths and weaknesses of each index type is essential for any database performance optimization.

The most common index types are B-Tree (the default in most databases), Hash (optimal for equality lookups), Bitmap (efficient for low-cardinality columns), and Full-Text (designed for text search). Each serves a distinct purpose. To understand index types properly, it is helpful to be familiar with database indexing basics, SQL query optimization, and relational database design.

Index types overview:
Index Type    | Best For                          | Not Good For
─────────────────────────────────────────────────────────────────────
B-Tree        | Range queries, sorting, equality  | None (most versatile)
Hash          | Exact equality (=) only           | Range (>), sorting, LIKE
Bitmap        | Low cardinality, multiple AND/OR  | High cardinality, writes
Full-Text     | Text search, keywords, relevance  | Exact match, numbers
GIN (PG)      | Array, JSON, full-text            | Simple equality
BRIN (PG)     | Very large tables, natural order  | Random order, small tables

B-Tree Indexes

B-Tree (Balanced Tree) indexes are the default and most common index type in virtually all relational databases. They maintain sorted data structure that allows efficient equality searches, range queries, and sorted retrieval. B-Tree indexes work well for most query patterns and are the best choice when you are unsure which index type to use.

B-Tree index structure:
                    [50]
                   /    \
               [20,30]   [70,80]
              /   |   \   /   |   \
            [10] [25] [35] [60] [75] [90]

Properties:
- Balanced tree (all leaves at same depth)
- O(log n) search time
- Maintains sorted order
- Supports: =, >, <, >=, <=, BETWEEN, LIKE 'prefix%'

When to Use B-Tree Indexes

  • Primary key and unique constraints: B-Tree is the default index type for primary keys.
  • Range queries: WHERE salary > 50000, WHERE date BETWEEN '2024-01-01' AND '2024-12-31'.
  • Sorting: ORDER BY column. The index returns data in sorted order, eliminating the need for a separate sort operation.
  • Equality searches: WHERE email = 'user@example.com'.
  • Prefix searches: WHERE name LIKE 'John%' (trailing wildcard).
  • Most general-purpose indexing needs.
B-Tree index creation examples:
-- Default B-Tree (most databases)
CREATE INDEX idx_users_email ON users(email);

-- PostgreSQL explicit B-Tree
CREATE INDEX idx_users_email ON users USING BTREE(email);

-- Composite B-Tree (order matters!)
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Supports these queries efficiently:
SELECT * FROM users WHERE email = 'john@example.com';           -- Equality
SELECT * FROM orders WHERE created_at > '2024-01-01';           -- Range
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at; -- Sorting

Hash Indexes

Hash indexes use a hash table structure that maps keys directly to row locations. They are extremely fast for exact equality comparisons (=) but cannot support range queries, sorting, or pattern matching. Hash indexes are ideal for lookup tables and high-cardinality columns where only equality searches are needed.

Hash index structure:
Hash function: key → bucket number

key "john@example.com" → hash(12345) → bucket 5 → row pointer
key "jane@example.com" → hash(67890) → bucket 2 → row pointer

Properties:
- O(1) lookup time (constant, very fast)
- Cannot be used for range queries (> , < , BETWEEN)
- Cannot be used for sorting (ORDER BY)
- Cannot be used for LIKE patterns
- Smaller than B-Tree (no sorted structure)

When to Use Hash Indexes

  • Lookup tables: Dictionary tables, status codes, type lookups.
  • High-cardinality equality searches: WHERE user_id = 12345, WHERE session_id = 'abc...'.
  • In-memory databases: Redis, Memcached use hash structures.
  • When range queries and sorting are never needed.
  • PostgreSQL specifically: Hash indexes are now write-ahead logged and crash-safe (since PostgreSQL 10).
Hash index creation examples:
-- PostgreSQL hash index
CREATE INDEX idx_users_email_hash ON users USING HASH(email);

-- MySQL (MEMORY tables only, automatically uses hash)
CREATE TABLE lookup (
    id INT,
    value VARCHAR(100),
    INDEX USING HASH (id)
) ENGINE=MEMORY;

-- B-Tree vs Hash performance comparison
-- Equality search: Hash is faster (O(1) vs O(log n))
-- Range search: B-Tree works, Hash does not
Hash index limitations example:
-- These queries CAN use hash index
SELECT * FROM users WHERE email = 'john@example.com';
SELECT * FROM users WHERE user_id = 12345;

-- These queries CANNOT use hash index
SELECT * FROM users WHERE email > 'john@example.com';  -- Range
SELECT * FROM users WHERE email LIKE 'john%';         -- Pattern
SELECT * FROM users ORDER BY email;                    -- Sorting

Bitmap Indexes

Bitmap indexes store a bitmap (string of bits) for each distinct value in a column, with one bit per row indicating whether that row contains the value. They are extremely efficient for low-cardinality columns (few distinct values) and complex combinations of AND/OR conditions. Bitmap indexes are commonly used in data warehouses and reporting systems.

Bitmap index structure:
Column: status (values: active, inactive, pending)

status = 'active'   →  bitmap: 1 0 1 1 0  (row1=active, row2=not, row3=active...)
status = 'inactive' →  bitmap: 0 1 0 0 1
status = 'pending'  →  bitmap: 0 0 0 0 0

Query: WHERE status = 'active' AND region = 'US'
Result: Bitwise AND of both bitmaps

Properties:
- Extremely space-efficient for low cardinality
- Fast bitwise operations (AND, OR, NOT)
- Ideal for data warehousing (read-heavy)
- Poor for high-cardinality columns
- Slower for writes (must update bitmaps)

When to Use Bitmap Indexes

  • Low-cardinality columns: status (active/inactive), gender (M/F), region (US/EU/ASIA).
  • Data warehouses: Star schemas with dimension tables.
  • Complex WHERE clauses: Multiple conditions combined with AND/OR.
  • Read-only or read-heavy tables: Bitmap indexes have write overhead.
  • Oracle, PostgreSQL (with extension), and some data warehouses.
Bitmap index creation examples:
-- Oracle bitmap index
CREATE BITMAP INDEX idx_customers_status ON customers(status);

-- PostgreSQL (requires extension)
CREATE EXTENSION pg_bitmap;
CREATE INDEX idx_customers_status ON customers USING bitmap(status);

-- Bitmap index excels at multiple conditions
SELECT COUNT(*) FROM sales
WHERE region = 'US' 
  AND product_category = 'Electronics'
  AND sale_date BETWEEN '2024-01-01' AND '2024-12-31';
Bitmap index cardinality comparison:
Good for Bitmap (low cardinality):
- status (3 values: active, inactive, pending)
- gender (2 values: M, F)
- region (5 values: US, EU, ASIA, SA, AFRICA)

Bad for Bitmap (high cardinality):
- email (millions of unique values)
- user_id (unique per row)
- timestamp (unique per second)

Full-Text Indexes

Full-Text indexes are designed for searching large amounts of text data. They break text into tokens (words), normalize them (lowercase, remove stop words, apply stemming), and build an inverted index mapping words to document locations. Full-Text indexes enable fast keyword search, relevance ranking, and linguistic features like stemming and thesaurus.

Full-Text index structure:
Documents:
Doc1: "The quick brown fox"
Doc2: "The lazy dog"
Doc3: "A quick rabbit"

Inverted Index:
"quick"  → Doc1, Doc3
"brown"  → Doc1
"fox"    → Doc1
"lazy"   → Doc2
"dog"    → Doc2
"rabbit" → Doc3

Query: "quick"
Result: Doc1, Doc3 (relevance ranked)

When to Use Full-Text Indexes

  • Search functionality: Product search, blog search, document search.
  • Keyword matching: Finding documents containing specific words.
  • Relevance ranking: Ordering results by how well they match the search terms.
  • Natural language queries: Human-readable search phrases.
  • Large text columns: VARCHAR, TEXT, CLOB columns with substantial content.
Full-Text index creation examples:
-- MySQL
CREATE TABLE articles (
    id INT PRIMARY KEY,
    title VARCHAR(200),
    content TEXT,
    FULLTEXT INDEX ft_articles (title, content)
) ENGINE=InnoDB;

SELECT * FROM articles 
WHERE MATCH(title, content) AGAINST('database indexing' IN NATURAL LANGUAGE MODE);

-- PostgreSQL
CREATE TABLE articles (
    id INT PRIMARY KEY,
    title VARCHAR(200),
    content TEXT
);

CREATE INDEX idx_articles_content ON articles USING GIN(to_tsvector('english', content));

SELECT * FROM articles 
WHERE to_tsvector('english', content) @@ to_tsquery('database & indexing');

-- SQL Server
CREATE FULLTEXT CATALOG ft_catalog AS DEFAULT;
CREATE FULLTEXT INDEX ON articles(title, content) KEY INDEX pk_articles;
LIKE vs Full-Text performance comparison:
-- Slow: LIKE with leading wildcard (cannot use B-Tree)
SELECT * FROM articles WHERE content LIKE '%database%';  -- Full table scan

-- Fast: Full-Text search (uses inverted index)
SELECT * FROM articles 
WHERE MATCH(content) AGAINST('database');  -- Uses Full-Text index

-- Full-Text with ranking
SELECT *, MATCH(content) AGAINST('database indexing') AS relevance
FROM articles
WHERE MATCH(content) AGAINST('database indexing')
ORDER BY relevance DESC;

Specialized Index Types

GIN (Generalized Inverted Index) - PostgreSQL

GIN indexes are designed for composite data types like arrays, JSON, and full-text search. They store mappings from element values to rows containing them.

-- GIN index on JSONB
CREATE INDEX idx_products_metadata ON products USING GIN(metadata);

-- Query JSON efficiently
SELECT * FROM products WHERE metadata @> '{"color": "red"}';

-- GIN index on array
CREATE INDEX idx_users_tags ON users USING GIN(tags);

SELECT * FROM users WHERE tags @> ARRAY['developer'];

BRIN (Block Range Index) - PostgreSQL

BRIN indexes store summary information about blocks of rows, making them extremely compact for very large tables with naturally ordered data (like timestamps).

-- BRIN index on timestamp (very small index)
CREATE INDEX idx_events_created_at ON events USING BRIN(created_at);

-- Great for billions of rows where data is inserted in time order
-- Index size: kilobytes vs megabytes for B-Tree
SELECT * FROM events WHERE created_at BETWEEN '2024-01-01' AND '2024-01-31';

Index Type Comparison Summary

Index Type Query Types Supported Best For Storage Size Write Overhead
B-Tree =, >, <, >=, <=, BETWEEN, LIKE 'prefix%', ORDER BY General purpose, most queries Medium Medium
Hash = only High-cardinality equality, lookup tables Small Low
Bitmap =, AND, OR, NOT (multiple conditions) Low-cardinality, data warehouses Very Small High
Full-Text Keyword search, relevance ranking, natural language Text search, document retrieval Large High
GIN (PG) JSON, array, full-text, many-to-many Composite types, document stores Large High
BRIN (PG) Range queries on naturally ordered data Very large tables, time-series data Very Small Low

Choosing the Right Index Type

  • Start with B-Tree: It is the default for good reason. Use B-Tree unless you have a specific reason to choose another type.
  • Use Hash for high-cardinality equality: When you only need exact matches and never range queries, hash indexes can be faster.
  • Use Bitmap for low-cardinality columns: Especially in data warehouses with complex AND/OR conditions.
  • Use Full-Text for text search: Never use LIKE '%term%' on large text columns. Full-Text indexes are purpose-built for search.
  • Use GIN for JSON/array: When querying inside JSON documents or arrays, GIN is the right choice.
  • Use BRIN for time-series: Billions of rows with timestamp data? BRIN can save enormous storage.

Common Index Type Mistakes to Avoid

  • Using Hash for range queries: Hash indexes cannot handle >, <, or BETWEEN. Use B-Tree.
  • Using Bitmap on high-cardinality columns: Bitmap on unique columns is as large as the table itself. Use B-Tree.
  • Using Full-Text for exact matches: Full-Text is for keyword search. Use B-Tree for exact string matches.
  • Not matching index type to query pattern: Analyze your queries before choosing index types.
  • Over-indexing: Each index adds write overhead. Choose only the types you need.

Frequently Asked Questions

  1. Which index type is fastest for equality searches?
    Hash indexes are theoretically fastest for equality (O(1)), but B-Tree (O(log n)) is usually fast enough. Hash indexes are only beneficial when you have millions of lookups per second.
  2. Can I have multiple index types on the same table?
    Yes. A table can have B-Tree indexes on some columns, Full-Text indexes on text columns, and Bitmap indexes on low-cardinality columns.
  3. Does my database support all index types?
    Not all databases support all index types. PostgreSQL supports the widest variety. MySQL supports B-Tree, Hash (MEMORY tables), and Full-Text. SQL Server supports B-Tree (clustered/non-clustered), Full-Text, and Columnstore.
  4. When should I use a covering index?
    A covering index (not a separate type, but a strategy) includes all columns needed by a query. This allows index-only scans, which are the fastest possible.
  5. What is the difference between clustered and non-clustered B-Tree indexes?
    Clustered index determines the physical order of data (one per table). Non-clustered indexes are separate structures pointing to the data (many per table). Clustered is faster for range scans.
  6. What should I learn next after index types?
    After mastering index types, explore database indexing strategies, query execution plans, SQL optimization, and database performance tuning for complete mastery.