Skip to main content
Back to Articles

SQL Query Optimization: Indexes, Execution Plans & Performance (Complete Guide)

Master SQL optimization, use indexes correctly, read execution plans, speed up queries 4000x on 1M rows.

March 7, 20269 min readBy Mathematicon

SQL Query Optimization: Indexes, Execution Plans & Performance (Complete Guide)

The Problem You're Solving

Your query takes 8 seconds to run on 1M rows:

-- āŒ Slow: Full table scan
SELECT * FROM users WHERE email = 'alice@example.com';
-- Scans all 1,000,000 rows āœ—

-- āœ… Fast: Uses index
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE email = 'alice@example.com';
-- Scans only 1 row āœ“

-- Time: 8 seconds → 2 milliseconds (4000x faster!)

That difference = timeouts and frustrated users vs instant responses.

SQL optimization appears in 25% of database interviews and directly impacts application scalability.

How Databases Work

B-Tree Index

Without index (full scan):
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”
│ Row 1: alice@...        │ ← Check
│ Row 2: bob@...          │ ← Check
│ Row 3: charlie@...      │ ← Check
│ ... (1M rows) ...       │ ← Check each
ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜
Total: 1M comparisons

With index (B-Tree):
         email index
             │
        ā”Œā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”
        │          │
      a-m        n-z
       │          │
    ā”Œā”€ā”€ā”“ā”€ā”€ā”    ā”Œā”€ā”€ā”“ā”€ā”€ā”
   a-c  d-m   n-s  t-z
    │         │
alice@...  (not here)
           └─ Found in 3 lookups!

Total: 3-4 lookups (vs 1M scans)

Basic Indexes

Single Column Index

-- Create index
CREATE INDEX idx_users_email ON users(email);

-- Check it exists
SHOW INDEX FROM users;
-- idx_users_email on email column

-- Query now fast
SELECT * FROM users WHERE email = 'alice@example.com';
-- Uses index: index seek (2-3ms)

Composite Index (Multiple Columns)

-- Create on multiple columns
CREATE INDEX idx_users_name_email ON users(name, email);

-- Good for:
SELECT * FROM users WHERE name = 'Alice' AND email = 'alice@example.com';
-- Uses index

-- Less good for:
SELECT * FROM users WHERE email = 'alice@example.com';
-- Index scan (worse than specific email index)

-- Bad for:
SELECT * FROM users WHERE name = 'Alice';
-- Uses index (name is first)

-- Bad for:
SELECT * FROM users WHERE email = 'alice@example.com' AND name = 'Alice';
-- Uses index but less efficient (column order matters)

Column Order Matters (Leftmost Prefix Rule)

-- Index on (category, price, date)
CREATE INDEX idx_products ON products(category, price, date);

-- āœ… GOOD - Uses index
SELECT * FROM products WHERE category = 'Electronics';
SELECT * FROM products WHERE category = 'Electronics' AND price > 100;
SELECT * FROM products WHERE category = 'Electronics' AND price > 100 AND date > '2024-01-01';

-- āš ļø SLOWER - Doesn't use index efficiently
SELECT * FROM products WHERE price > 100 AND date > '2024-01-01';
-- (missing category in WHERE)

-- āŒ BAD - Doesn't use index
SELECT * FROM products WHERE date > '2024-01-01';
-- (missing category and price)

Execution Plans: Read What Database Does

MySQL EXPLAIN

-- See how query executes
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';

-- Output:
-- id | select_type | table | type   | possible_keys      | key                | rows | Extra
-- 1  | SIMPLE      | users | ref    | idx_users_email    | idx_users_email    | 1    | NULL

-- Key columns to understand:
-- type: ref (good), index, range (ok), ALL (bad)
-- rows: 1 (good), 1000000 (bad)
-- Extra: NULL (good), using filesort (bad), using temporary (bad)

EXPLAIN Output Types (Best to Worst)

system    - 1 row (system table)
const     - 1 row (index on primary key)
eq_ref    - Joins using primary key
ref       - Index lookup returning multiple rows (GOOD)
range     - Index range query (GOOD)
index     - Full index scan
ALL       - Full table scan (WORST)

Detect Bad Queries

-- āŒ ALL - Full table scan (slow on large tables)
EXPLAIN SELECT * FROM users;
-- type: ALL (scans all rows)

-- āœ… FIX - Add WHERE clause or index
EXPLAIN SELECT * FROM users WHERE status = 'active';
-- type: ref (uses index on status)

Real-World Optimization Examples

Example 1: Slow Report Query

-- āŒ SLOW - 45 seconds
SELECT
  users.name,
  COUNT(orders.id) as order_count,
  SUM(orders.total) as total_spent
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE orders.created_at > '2024-01-01'
GROUP BY users.id;
-- Uses full table scans

-- āœ… OPTIMIZED - 200ms
-- Add indexes
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Better query (filter before join)
SELECT
  u.name,
  COUNT(o.id) as order_count,
  SUM(o.total) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id AND o.created_at > '2024-01-01'
GROUP BY u.id;
-- 45s → 200ms (225x faster!)

Example 2: N+1 Query Problem

-- āŒ N+1 PROBLEM - 1001 queries
SELECT id, name FROM users LIMIT 100;
-- Returns 100 users

-- Then in application code:
for user in users:
    orders = SELECT * FROM orders WHERE user_id = user.id;
    # Now 100 more queries! Total: 101 queries

-- āœ… FIX - Single JOIN
SELECT
  u.id,
  u.name,
  o.order_id,
  o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id IN (1, 2, 3, ..., 100);
-- 1 query instead of 101

Example 3: Avoid Functions on Indexed Columns

-- āŒ SLOW - Index not used
SELECT * FROM users
WHERE YEAR(created_at) = 2024;
-- Function YEAR() prevents index use

-- āœ… FAST - Index used
SELECT * FROM users
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- Index can be used for range query

Example 4: Use Covering Index

-- āŒ Requires full row lookup
SELECT name, email FROM users WHERE email = 'alice@example.com';
-- Index idx_email lookup, then fetch full row

-- āœ… Covering index (has all needed columns)
CREATE INDEX idx_users_email_name_email ON users(email, name);
SELECT name, email FROM users WHERE email = 'alice@example.com';
-- Index contains all data needed (no table lookup)

Query Optimization Techniques

1. Use Prepared Statements

-- āŒ Inefficient - Query compiled each time
SELECT * FROM users WHERE id = 123;
SELECT * FROM users WHERE id = 456;
SELECT * FROM users WHERE id = 789;
-- 3 separate compilations

-- āœ… Efficient - Compiled once, reused
PREPARE stmt FROM 'SELECT * FROM users WHERE id = ?';
EXECUTE stmt USING 123;
EXECUTE stmt USING 456;
EXECUTE stmt USING 789;
-- Compiled once, executed 3x

2. Batch Queries

-- āŒ Slow - 1000 individual INSERTs
INSERT INTO logs (user_id, action) VALUES (1, 'login');
INSERT INTO logs (user_id, action) VALUES (2, 'logout');
-- ... 1000 times

-- āœ… Fast - Batch INSERT
INSERT INTO logs (user_id, action) VALUES
  (1, 'login'),
  (2, 'logout'),
  (3, 'click'),
  -- ... all 1000 rows
  (1000, 'logout');
-- 1000x faster

3. Denormalization (Strategic)

-- Normalized (requires join):
SELECT user.name, SUM(order.total)
FROM users
JOIN orders ON user.id = order.user_id
GROUP BY user.id;

-- Denormalized (cached calculation):
ALTER TABLE users ADD COLUMN total_spent DECIMAL(10,2);
-- Update total_spent when new order created
-- SELECT user.name, total_spent FROM users;
-- Faster (no join) but need to maintain total_spent

Common Mistakes

āŒ Mistake 1: Indexing Everything

-- WRONG - Index creates overhead (storage, writes slower)
CREATE INDEX idx_col1 ON table(col1);
CREATE INDEX idx_col2 ON table(col2);
Create INDEX idx_col3 ON table(col3);
-- Too many indexes slow down inserts/updates

-- CORRECT - Index only frequently searched columns
CREATE INDEX idx_email ON users(email);  -- 100 searches/day
-- Don't index rarely-searched columns

āŒ Mistake 2: Not Using EXPLAIN

-- WRONG - Assume query is optimized
SELECT * FROM huge_table WHERE name LIKE '%alice%';
-- Might be slow (full scan) but don't know

-- CORRECT - Always check
EXPLAIN SELECT * FROM huge_table WHERE name LIKE '%alice%';
-- Output shows it's full scan → add index
CREATE INDEX idx_name ON huge_table(name);

āŒ Mistake 3: Using SELECT * Instead of Specific Columns

-- WRONG - Gets all columns even if don't need them
SELECT * FROM huge_table;
-- Transfers unnecessary data, slower network

-- CORRECT - Select only needed columns
SELECT id, name FROM huge_table;
-- Faster network transfer, smaller result set

FAQ: SQL Optimization

Q1: When should I add an index?

A: When query filtering on column is slow.

-- Slow query?
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
-- Output: type ALL, rows 1000000

-- Add index
CREATE INDEX idx_users_email ON users(email);

-- Now fast
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';
-- Output: type ref, rows 1

Q2: What's the impact of indexes on writes?

A: Inserts/updates slower (must update index too).

No index:
  INSERT: 1ms

1 index:
  INSERT: 1.5ms (update index too)

10 indexes:
  INSERT: 5ms (update all 10 indexes)

Balance: Fast reads vs fast writes.


Q3: Interview Question: Optimize this query.

A: Here's methodology:

-- SLOW: 30 seconds
SELECT u.name, COUNT(o.id) as orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

-- Analysis:
-- 1. Check EXPLAIN (it shows ALL scan on orders)
-- 2. Add index on orders.user_id
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- 3. Add index on users.created_at
CREATE INDEX idx_users_created_at ON users(created_at);

-- 4. Test again: now 500ms
-- 5. If still slow, add composite index
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Final: 500ms → 50ms (10x improvement)

Interview insight: "I'd start with EXPLAIN, identify bottlenecks, add indexes to slow columns, test, then denormalize if needed."


Q4: What's a good query?

A: Less rows examined than returned.

-- BAD
EXPLAIN SELECT * FROM users WHERE active = 1;
-- rows examined: 1000000, rows returned: 100
-- (scanning 10,000x more rows than needed)

-- GOOD
CREATE INDEX idx_users_active ON users(active);
-- rows examined: 100, rows returned: 100
-- (scanning exactly what's needed)

Conclusion

SQL optimization fundamentals:

  1. Understand EXPLAIN - Read execution plans
  2. Index correctly - Leftmost prefix rule
  3. Avoid pitfalls - Functions, N+1 queries, SELECT *
  4. Test changes - Benchmark before/after
  5. Balance trade-offs - Read speed vs write speed

Master these and you'll write queries that scale to billions of rows.


Learn More

Share this article

Related Articles