Database Indexing Strategies

Indexing is often the difference between a responsive application and one that crawls. Let me share the strategies I've used to optimize database performance.

Why Indexes Matter

Without indexes, every query scans the entire table. With 1 million rows, that's expensive. With proper indexes, you can drop query time from seconds to milliseconds.

B-Tree Indexes

The most common index type. Perfect for range queries and sorting:

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

Good for:

Equality lookups: WHERE email = 'user@example.com'
Range queries: WHERE created_at > '2024-01-01'
Sorting: ORDER BY created_at

Bad for:

Full-text search
Complex pattern matching

Composite Indexes

Create indexes on multiple columns when you have multi-column WHERE clauses:

-- Good for: WHERE user_id = ? AND status = ?
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Not good for: WHERE status = ? AND user_id = ?
-- (Different column order matters!)

Column order matters: Put the most selective column first.

Covering Indexes

Include extra columns in the index so the database never needs to look at the table:

-- Without covering index: 
-- 1. Find user_id in index
-- 2. Fetch row from table to get email, name

-- With covering index:
CREATE INDEX idx_users_active_email 
  ON users(is_active, email) 
  INCLUDE (name, phone);

-- Now all data is in the index
SELECT email, name, phone FROM users WHERE is_active = true;

When NOT to Index

Common mistakes:

Over-indexing: More indexes = slower writes
Unused indexes: Monitor and remove them
Redundant indexes: (a, b) makes (a) unnecessary
Indexing low-cardinality columns: CREATE INDEX idx_gender usually useless

Monitoring and Analysis

-- Find slow queries (MySQL)
SELECT * FROM mysql.slow_log;

-- Analyze query plans
EXPLAIN SELECT * FROM orders WHERE user_id = 100 AND created_at > '2024-01-01';

-- Find unused indexes
SELECT object_schema, object_name, COUNT(*)
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
GROUP BY object_schema, object_name
HAVING COUNT(*) = 0;

Practical Example

Product search was slow:

-- Before
SELECT * FROM products 
WHERE category_id = ? AND price > ? AND rating > ?
ORDER BY created_at DESC;

-- EXPLAIN shows: full table scan

-- After
CREATE INDEX idx_products_search 
  ON products(category_id, price, rating, created_at DESC);

-- EXPLAIN now shows: index range scan
-- Query time: 5s → 50ms

Conclusion

Good indexing isn't rocket science, but it requires understanding your access patterns. Profile your queries, measure before and after, and don't over-index.

Always remember: The fastest query is the one that doesn't run.