Database Indexing Strategies
Learn how to design indexes that dramatically improve query performance without bloating storage.
Indexing is often the difference between a responsive application and one that crawls. Let me share the strategies I've used to optimize database performance.
Why Indexes Matter
Without indexes, every query scans the entire table. With 1 million rows, that's expensive. With proper indexes, you can drop query time from seconds to milliseconds.
B-Tree Indexes
The most common index type. Perfect for range queries and sorting:
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
Good for:
- Equality lookups:
WHERE email = 'user@example.com' - Range queries:
WHERE created_at > '2024-01-01' - Sorting:
ORDER BY created_at
Bad for:
- Full-text search
- Complex pattern matching
Composite Indexes
Create indexes on multiple columns when you have multi-column WHERE clauses:
-- Good for: WHERE user_id = ? AND status = ?
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Not good for: WHERE status = ? AND user_id = ?
-- (Different column order matters!)
Column order matters: Put the most selective column first.
Covering Indexes
Include extra columns in the index so the database never needs to look at the table:
-- Without covering index:
-- 1. Find user_id in index
-- 2. Fetch row from table to get email, name
-- With covering index:
CREATE INDEX idx_users_active_email
ON users(is_active, email)
INCLUDE (name, phone);
-- Now all data is in the index
SELECT email, name, phone FROM users WHERE is_active = true;
When NOT to Index
Common mistakes:
- Over-indexing: More indexes = slower writes
- Unused indexes: Monitor and remove them
- Redundant indexes:
(a, b)makes(a)unnecessary - Indexing low-cardinality columns:
CREATE INDEX idx_genderusually useless
Monitoring and Analysis
-- Find slow queries (MySQL)
SELECT * FROM mysql.slow_log;
-- Analyze query plans
EXPLAIN SELECT * FROM orders WHERE user_id = 100 AND created_at > '2024-01-01';
-- Find unused indexes
SELECT object_schema, object_name, COUNT(*)
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
GROUP BY object_schema, object_name
HAVING COUNT(*) = 0;
Practical Example
Product search was slow:
-- Before
SELECT * FROM products
WHERE category_id = ? AND price > ? AND rating > ?
ORDER BY created_at DESC;
-- EXPLAIN shows: full table scan
-- After
CREATE INDEX idx_products_search
ON products(category_id, price, rating, created_at DESC);
-- EXPLAIN now shows: index range scan
-- Query time: 5s → 50ms
Conclusion
Good indexing isn't rocket science, but it requires understanding your access patterns. Profile your queries, measure before and after, and don't over-index.
Always remember: The fastest query is the one that doesn't run.