Backend

Database Indexing Explained: The 20-Minute Guide That Saves Hours of Debugging

Your queries are slow because you're missing indexes. Here's everything you need to know about database indexing — with real EXPLAIN ANALYZE outputs and before/after benchmarks.

PMML Engineering · Studio 9 June 2026 9 min read 0 views

Database Indexing Explained: The 20-Minute Guide That Saves Hours of Debugging

If your app is slow, it's probably your database. And if your database is slow, it's probably missing indexes. This guide will make you dangerous with database performance in 20 minutes.

The Analogy That Makes It Click

A database without indexes is like a library where books are stacked randomly on the floor. Want to find "The Great Gatsby"? Walk through every single book until you find it.

An index is the card catalog. It tells you exactly where to look.

Database architecture

Your First Index: Before and After

-- A table with 1 million rows, no index on email
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';

-- Result: Seq Scan on users
-- Planning time: 0.1ms
-- Execution time: 342ms  ← Scanned all 1M rows! 😱

Now add an index:

CREATE INDEX idx_users_email ON users (email);

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';

-- Result: Index Scan using idx_users_email
-- Planning time: 0.1ms
-- Execution time: 0.05ms  ← 6,840x faster! ⚡

342ms → 0.05ms. One line of SQL. That's the power of indexing.

The 5 Types of Indexes You Need to Know

1. B-Tree Index (Default — Use This Most)

-- Good for: equality (=) and range (<, >, BETWEEN) queries
CREATE INDEX idx_orders_date ON orders (created_at);

-- Now these are fast:
SELECT * FROM orders WHERE created_at > '2026-01-01';
SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31';
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20;

2. Composite Index (Multi-Column)

-- Index column ORDER matters! (leftmost prefix rule)
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);

-- ✅ Uses index (matches left prefix)
SELECT * FROM orders WHERE user_id = 'abc123';
SELECT * FROM orders WHERE user_id = 'abc123' AND created_at > '2026-01-01';

-- ❌ Cannot use index (skips first column)
SELECT * FROM orders WHERE created_at > '2026-01-01';

Rule of thumb: Put the most selective (most unique) column first.

3. Partial Index (Only Index Relevant Rows)

-- Only 2% of orders are "pending" — why index all of them?
CREATE INDEX idx_orders_pending ON orders (created_at)
  WHERE status = 'pending';

-- This is tiny and lightning fast for the queries that matter
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at;

Performance dashboard

4. GIN Index (Full-Text Search & JSON)

-- For JSONB queries
CREATE INDEX idx_users_metadata ON users USING GIN (metadata);

-- Now JSONB queries are fast
SELECT * FROM users WHERE metadata @> '{"plan": "pro"}';

-- For full-text search
CREATE INDEX idx_posts_search ON posts USING GIN (to_tsvector('english', title || ' ' || body));

SELECT * FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ plainto_tsquery('typescript generics');

5. Unique Index (Data Integrity + Speed)

-- Enforces uniqueness AND provides an index for lookups
CREATE UNIQUE INDEX idx_users_email_unique ON users (email);

-- Also works for "soft unique" with partial indexes
CREATE UNIQUE INDEX idx_active_subscription
  ON subscriptions (user_id)
  WHERE status = 'active';
-- Each user can only have ONE active subscription

How to Find Missing Indexes

-- PostgreSQL: Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;

-- Find tables with lots of sequential scans (missing indexes)
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > 100
ORDER BY seq_tup_read DESC;

-- Find unused indexes (wasting write performance)
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE 'pg_%';

The Indexing Checklist

Before deploying any new feature, check:

[ ] Every WHERE clause column has an index
[ ] Every JOIN column has an index on both sides
[ ] Every ORDER BY column used with LIMIT has an index
[ ] Foreign keys have indexes (Postgres doesn't auto-create them!)
[ ] No unused indexes wasting write performance
[ ] Composite indexes match your query patterns

Optimized infrastructure

Common Mistakes

Indexing everything — each index slows down writes. Only index what you query.
Wrong column order in composite indexes — leftmost prefix rule matters.
Missing foreign key indexes — Postgres does NOT auto-index FKs.
Not using EXPLAIN ANALYZE — always verify the index is actually used.
Indexing low-cardinality columns — an index on a boolean column rarely helps.

The One Rule

If a query is slow, run EXPLAIN ANALYZE. The answer is almost always: add an index.

Bookmark this guide. The next time your app is slow, come back here before doing anything else.

#database#postgres#performance#backend