Skip to content
Published on

MySQL Query Optimization Practical Guide: From EXPLAIN Analysis to Index Design and Slow Query Tuning

Authors
  • Name
    Twitter
MySQL Query Optimization

Introduction

MySQL is one of the most widely used relational databases in the world. However, once data grows beyond tens of millions of rows and concurrent connections increase, a single inefficient query can bring down an entire service. This guide covers everything you need for MySQL query optimization, from analyzing query execution plans with EXPLAIN, to index design strategies, query rewriting patterns, optimizer hints, slow query log analysis, and InnoDB buffer pool tuning -- all with practical examples.

This guide targets developers and DBAs already familiar with basic MySQL syntax, and is based on MySQL 8.0 and above (with a focus on 8.4 improvements).

Complete Guide to EXPLAIN Analysis

Basic EXPLAIN Usage

EXPLAIN is an essential tool for previewing how the MySQL optimizer will execute a query. Basic usage is straightforward.

EXPLAIN SELECT u.name, o.total_amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
  AND o.created_at >= '2026-01-01';

The output displays one row for each table access method in the query.

Interpreting the type Column

The type column indicates the table access method and has the most direct impact on performance. Listed from best (fastest) to worst:

type ValueDescriptionPerformance
constSingle row lookup via primary key or unique indexBest
eq_refPrimary key/unique index match in JOINVery Good
refNon-unique index lookup for equal valuesGood
rangeIndex range scan (BETWEEN, IN, etc.)Average
indexFull index scan (reads entire index tree)Poor
ALLFull table scanWorst
-- const: Single row lookup by primary key
EXPLAIN SELECT * FROM users WHERE id = 1;
-- type: const, rows: 1

-- ref: Non-unique index lookup
EXPLAIN SELECT * FROM orders WHERE user_id = 100;
-- type: ref (when user_id has an index)

-- ALL: Full scan - must be optimized
EXPLAIN SELECT * FROM orders WHERE YEAR(created_at) = 2026;
-- type: ALL (index unusable due to function application)

Interpreting the Extra Column

The Extra column displays additional execution strategies used by the optimizer. Key values to watch for:

Extra ValueMeaningAction Required
Using indexResolved via covering indexGood (maintain)
Using whereWHERE clause filtering performedNormal
Using filesortAdditional sort operation neededNeeds improvement
Using temporaryTemporary table creation neededNeeds improvement
Using index conditionIndex condition pushdown appliedGood

Verifying Actual Execution Time with EXPLAIN ANALYZE

Starting from MySQL 8.0.18, EXPLAIN ANALYZE is available. Unlike EXPLAIN, it actually executes the query and shows real execution times and actual row counts.

EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at >= '2026-01-01'
GROUP BY u.id
HAVING order_count > 5
ORDER BY order_count DESC
LIMIT 10;

The output includes estimated rows, actual rows, and actual time, allowing you to identify discrepancies between optimizer estimates and reality. If the difference between estimated and actual row counts is 10x or more, statistics need updating.

-- Manually refresh statistics
ANALYZE TABLE users;
ANALYZE TABLE orders;

Index Types and Characteristics

B-Tree Index

This is the default index type for MySQL InnoDB. It is optimized for equality comparisons, range searches, sorting, and leftmost prefix matching.

-- Creating a B-Tree index
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);

-- Queries where this index is effective
SELECT * FROM orders WHERE user_id = 100 AND created_at >= '2026-01-01';
SELECT * FROM orders WHERE user_id = 100 ORDER BY created_at DESC;
SELECT * FROM orders WHERE user_id IN (100, 200, 300);

Hash Index

Can only be used explicitly with the MEMORY engine. In InnoDB, it is automatically managed as an Adaptive Hash Index. It can only be used for equality (=) searches and cannot support range queries or sorting.

-- Using Hash index on a MEMORY engine table
CREATE TABLE session_cache (
    session_id VARCHAR(64) NOT NULL,
    user_id INT NOT NULL,
    data JSON,
    INDEX USING HASH (session_id)
) ENGINE = MEMORY;

-- Check InnoDB Adaptive Hash Index status
SHOW ENGINE INNODB STATUS\G
-- Check hit rate in the Adaptive Hash Index section

Full-text Index

A specialized index for natural language text search. Supported in InnoDB since MySQL 5.6, used with the MATCH ... AGAINST syntax.

-- Creating a Full-text index
ALTER TABLE articles ADD FULLTEXT INDEX ft_idx_content (title, body);

-- Natural language mode search
SELECT * FROM articles
WHERE MATCH(title, body) AGAINST('MySQL performance optimization' IN NATURAL LANGUAGE MODE);

-- Boolean mode search (AND, OR, NOT, etc.)
SELECT * FROM articles
WHERE MATCH(title, body) AGAINST('+MySQL +performance -PostgreSQL' IN BOOLEAN MODE);

Spatial Index

An R-Tree based index for spatial data (location information, etc.). Used with GEOMETRY, POINT, and similar column types.

-- Creating a Spatial index
CREATE TABLE locations (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    position POINT NOT NULL SRID 4326,
    SPATIAL INDEX (position)
);

-- Search within radius
SELECT name, ST_Distance_Sphere(position, ST_SRID(POINT(127.0276, 37.4979), 4326)) AS distance
FROM locations
WHERE ST_Distance_Sphere(position, ST_SRID(POINT(127.0276, 37.4979), 4326)) < 1000;

Composite Index Design Strategies

Leftmost Prefix Rule

A composite index can only be utilized from the leftmost columns in a continuous prefix order. This is the most important principle in MySQL index design.

-- Composite index: (a, b, c)
CREATE INDEX idx_abc ON orders (status, user_id, created_at);

-- Queries that CAN use the index
SELECT * FROM orders WHERE status = 'completed';                          -- (a) used
SELECT * FROM orders WHERE status = 'completed' AND user_id = 100;        -- (a, b) used
SELECT * FROM orders WHERE status = 'completed' AND user_id = 100
  AND created_at >= '2026-01-01';                                         -- (a, b, c) all used

-- Queries that CANNOT use the index
SELECT * FROM orders WHERE user_id = 100;                                 -- only (b) - index not used
SELECT * FROM orders WHERE user_id = 100 AND created_at >= '2026-01-01';  -- (b, c) - index not used
SELECT * FROM orders WHERE created_at >= '2026-01-01';                    -- only (c) - index not used

Covering Index

When all columns needed by a query are included in the index, InnoDB can return results from the secondary index alone without accessing the table data (clustered index). EXPLAIN shows "Using index" in the Extra column.

-- Covering index design
CREATE INDEX idx_covering ON orders (user_id, status, total_amount);

-- This query can return results from index only (Using index)
EXPLAIN SELECT user_id, status, total_amount
FROM orders
WHERE user_id = 100 AND status = 'completed';

-- This query does NOT use covering index (SELECT * requires table data access)
EXPLAIN SELECT * FROM orders
WHERE user_id = 100 AND status = 'completed';

Practical Index Design Patterns

Follow these principles when determining composite index column order:

  1. Place equality (=) condition columns first
  2. Place range condition columns last (columns after a range condition cannot use the index)
  3. Prioritize high-cardinality columns (general recommendation, but query patterns matter more)
  4. Consider ORDER BY / GROUP BY columns
-- Bad design: Index on low-cardinality column only
CREATE INDEX idx_bad ON orders (status);  -- status has only 5-10 distinct values

-- Good design: Composite index matching query patterns
CREATE INDEX idx_good ON orders (user_id, status, created_at);

-- Design considering sorting
-- When WHERE user_id = ? ORDER BY created_at DESC is frequent
CREATE INDEX idx_sort ON orders (user_id, created_at DESC);

Query Rewriting Patterns

Converting Subqueries to JOINs

The MySQL optimizer often handles correlated subqueries inefficiently. Rewriting them as JOINs can significantly improve performance.

-- Bad: Correlated subquery (subquery executed for each row)
SELECT u.name, u.email,
  (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) AS order_count
FROM users u
WHERE u.status = 'active';

-- Good: LEFT JOIN + GROUP BY
SELECT u.name, u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
GROUP BY u.id, u.name, u.email;

-- Alternative: Using a Derived Table
SELECT u.name, u.email, COALESCE(oc.cnt, 0) AS order_count
FROM users u
LEFT JOIN (
    SELECT user_id, COUNT(*) AS cnt
    FROM orders
    GROUP BY user_id
) oc ON u.id = oc.user_id
WHERE u.status = 'active';

Converting OR to UNION

OR conditions can cause inefficient index usage or trigger full table scans.

-- Bad: Index underutilization due to OR condition
SELECT * FROM products
WHERE category_id = 10 OR brand_id = 20;

-- Good: Split using UNION ALL (each condition can use its own index)
SELECT * FROM products WHERE category_id = 10
UNION ALL
SELECT * FROM products WHERE brand_id = 20 AND category_id != 10;

Avoiding Functions on Indexed Columns

Applying functions to indexed columns prevents index usage. Since MySQL 8.0, you can create Expression Indexes, but rewriting the query is preferred when possible.

-- Bad: Function applied to indexed column
SELECT * FROM orders WHERE YEAR(created_at) = 2026;
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- Good: Rewrite as range condition
SELECT * FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01';

-- MySQL 8.0+: Expression Index (Functional Index)
CREATE INDEX idx_email_lower ON users ((LOWER(email)));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

Preventing Implicit Type Conversion

When the column type and comparison value type differ, MySQL performs implicit type conversion, which causes indexes to be ignored.

-- Bad: phone_number is VARCHAR but compared with a number
SELECT * FROM users WHERE phone_number = 01012345678;
-- MySQL converts phone_number to numeric -> index not used

-- Good: Type matching
SELECT * FROM users WHERE phone_number = '01012345678';

Optimizer Hints

Index Hints

-- USE INDEX: Suggest a specific index (optimizer may ignore)
SELECT * FROM orders USE INDEX (idx_user_created)
WHERE user_id = 100 AND created_at >= '2026-01-01';

-- FORCE INDEX: Force a specific index (must use index instead of full scan)
SELECT * FROM orders FORCE INDEX (idx_user_created)
WHERE user_id = 100 AND created_at >= '2026-01-01';

-- IGNORE INDEX: Exclude a specific index
SELECT * FROM orders IGNORE INDEX (idx_status)
WHERE status = 'completed' AND user_id = 100;

MySQL 8.0+ Optimizer Hint Syntax

The new hint syntax introduced in MySQL 8.0 is written in comment-style format.

-- Fix JOIN order
SELECT /*+ JOIN_ORDER(u, o) */
  u.name, o.total_amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active';

-- Specify index for a specific table
SELECT /*+ INDEX(o idx_user_created) */
  o.*
FROM orders o
WHERE o.user_id = 100;

-- Force Hash Join
SELECT /*+ HASH_JOIN(u, o) */
  u.name, o.total_amount
FROM users u
JOIN orders o ON u.id = o.user_id;

-- Parallel query execution (MySQL 8.0.14+)
SELECT /*+ SET_VAR(innodb_parallel_read_threads=4) */
  COUNT(*) FROM large_table;

optimizer_switch System Variable

-- Check current settings
SHOW VARIABLES LIKE 'optimizer_switch';

-- Disable specific optimizations at session level
SET SESSION optimizer_switch = 'index_merge_intersection=off';
SET SESSION optimizer_switch = 'derived_merge=off';
SET SESSION optimizer_switch = 'batched_key_access=on';

-- MySQL 8.0.31+: Histogram-based optimization
ANALYZE TABLE orders UPDATE HISTOGRAM ON user_id, status WITH 100 BUCKETS;

Slow Query Log Configuration and Analysis

Slow Query Log Configuration

-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- Record queries taking more than 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON';  -- Also record queries not using indexes
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow-query.log';

-- Permanent configuration in my.cnf
-- [mysqld]
-- slow_query_log = 1
-- long_query_time = 1
-- log_queries_not_using_indexes = 1
-- slow_query_log_file = /var/log/mysql/slow-query.log
-- min_examined_row_limit = 1000

Analysis with mysqldumpslow

# Top 10 slowest queries
mysqldumpslow -s t -t 10 /var/log/mysql/slow-query.log

# Top 10 most frequent slow queries
mysqldumpslow -s c -t 10 /var/log/mysql/slow-query.log

# Filter by specific pattern
mysqldumpslow -s t -t 10 -g "orders" /var/log/mysql/slow-query.log

Detailed Analysis with pt-query-digest

Percona Toolkit's pt-query-digest is a more powerful analysis tool.

# Analyze slow log
pt-query-digest /var/log/mysql/slow-query.log

# Analyze specific time range
pt-query-digest --since="2026-03-10 00:00:00" --until="2026-03-11 00:00:00" \
  /var/log/mysql/slow-query.log

# Save results to file
pt-query-digest /var/log/mysql/slow-query.log > /tmp/slow-query-report.txt

The output includes per-query execution count, average/maximum execution time, rows examined, and rows sent. If the ratio of rows examined to rows sent exceeds 100:1, index improvements are needed.

InnoDB Buffer Pool Tuning

Buffer Pool Size Configuration

The InnoDB buffer pool is the core component that caches data and indexes in memory. Typically, 70-80% of total memory is allocated.

-- Check current buffer pool status
SHOW VARIABLES LIKE 'innodb_buffer_pool%';
SHOW STATUS LIKE 'Innodb_buffer_pool%';

-- Calculate buffer pool hit rate
SELECT
  (1 - (
    (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads') /
    (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests')
  )) * 100 AS buffer_pool_hit_rate;
-- 99% or higher is desirable

Buffer Pool Instance Separation

In high-concurrency environments, splitting the buffer pool into multiple instances reduces mutex contention.

# my.cnf configuration example
[mysqld]
innodb_buffer_pool_size = 32G
innodb_buffer_pool_instances = 8    # Buffer pool size / instances = 4GB each
innodb_buffer_pool_chunk_size = 1G  # Online resizing unit
innodb_log_file_size = 4G
innodb_log_buffer_size = 64M
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000

Warm-up (Buffer Pool Dump/Load)

Prevent cold start performance degradation when the buffer pool is empty after MySQL restart.

-- Dump buffer pool at shutdown, load at startup
SET GLOBAL innodb_buffer_pool_dump_at_shutdown = ON;
SET GLOBAL innodb_buffer_pool_load_at_startup = ON;

-- Manual dump/load
SET GLOBAL innodb_buffer_pool_dump_now = ON;
SET GLOBAL innodb_buffer_pool_load_now = ON;

-- Check load progress
SHOW STATUS LIKE 'Innodb_buffer_pool_load_status';

MySQL 8.0 vs 8.4 Optimizer Improvement Comparison

FeatureMySQL 8.0MySQL 8.4
EXPLAIN FORMATTRADITIONAL, JSON, TREETRADITIONAL, JSON, TREE (improved)
Hash JoinSupported from 8.0.18Performance and memory management improvements
Window FunctionsBasic supportExecution plan optimization improvements
Derived Table MergePartial supportBroader scope support
Invisible IndexSupportedSupported (management improvements)
Functional IndexSupportedSupported (Expression Index improvements)
Histogram StatisticsSupportedEnhanced auto-refresh capability
Parallel QueryLimitedExpanded InnoDB parallel reads
Cost ModelDefault cost modelCost constants recalibrated, SSD-aware
Subquery OptimizationBasic semi-joinExpanded anti-join and semi-join strategies
EXPLAIN ANALYZEIntroduced in 8.0.18Improved output format, memory usage display
-- MySQL 8.4: EXPLAIN improvement example
EXPLAIN FORMAT=TREE
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.id;

-- Using Invisible Index (test before dropping an index)
ALTER TABLE orders ALTER INDEX idx_status INVISIBLE;
-- After performance testing, if no issues, drop it
-- DROP INDEX idx_status ON orders;
-- If issues arise, reactivate
ALTER TABLE orders ALTER INDEX idx_status VISIBLE;

Failure Cases and Recovery Procedures

Case 1: Index Not Used Due to Implicit Type Conversion

-- Problem: Numeric comparison on a VARCHAR column
-- account_no column is VARCHAR(20) but application passes a number
EXPLAIN SELECT * FROM accounts WHERE account_no = 123456;
-- type: ALL (full table scan!)

-- Cause: MySQL converts account_no column to numeric, making index unusable
-- Fix: Match the types
EXPLAIN SELECT * FROM accounts WHERE account_no = '123456';
-- type: ref (index used)

Case 2: Index Skipped Due to Low Cardinality

-- Problem: Index exists but optimizer chooses full scan
-- status column has an index but 'active' accounts for 90% of all rows
EXPLAIN SELECT * FROM users WHERE status = 'active';
-- type: ALL (optimizer determines full scan is more efficient)

-- Cause: Low cardinality makes the index inefficient
-- Fix 1: Create a composite index with other conditions
CREATE INDEX idx_status_created ON users (status, created_at);

-- Fix 2: When the index is only useful for specific values
-- MySQL does not directly support Partial Indexes, so consider changing query patterns

Case 3: Wrong Composite Index Column Order

-- Problem: Index created as (created_at, user_id)
-- Frequent query: WHERE user_id = ? AND created_at >= ?
EXPLAIN SELECT * FROM orders
WHERE user_id = 100 AND created_at >= '2026-01-01';
-- Inefficient index usage (range scan on created_at first)

-- Fix: Place equality condition columns first
DROP INDEX idx_created_user ON orders;
CREATE INDEX idx_user_created ON orders (user_id, created_at);
-- type: range (efficient range scan)

Case 4: Index Fragmentation After Mass DELETE

-- Problem: Query performance degradation after deleting 5 million rows
-- Cause: Index pages have many empty spaces (fragmentation)

-- Diagnosis
SELECT
  TABLE_NAME,
  INDEX_LENGTH,
  DATA_LENGTH,
  DATA_FREE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'orders';

-- Fix: Rebuild indexes
ALTER TABLE orders ENGINE=InnoDB;  -- Rebuild table + indexes
-- Or
OPTIMIZE TABLE orders;

Production Optimization Checklist

Index Review

  • Verify all foreign key columns have indexes
  • Confirm EXPLAIN results for major queries show no type ALL
  • Identify and remove unused indexes
  • Remove duplicate indexes
  • Verify composite index column order matches query patterns
-- Find unused indexes
SELECT
  s.TABLE_SCHEMA,
  s.TABLE_NAME,
  s.INDEX_NAME,
  s.COLUMN_NAME
FROM INFORMATION_SCHEMA.STATISTICS s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage p
  ON s.TABLE_SCHEMA = p.OBJECT_SCHEMA
  AND s.TABLE_NAME = p.OBJECT_NAME
  AND s.INDEX_NAME = p.INDEX_NAME
WHERE p.COUNT_STAR = 0
  AND s.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND s.INDEX_NAME != 'PRIMARY'
ORDER BY s.TABLE_SCHEMA, s.TABLE_NAME;

-- Find duplicate indexes (using sys schema)
SELECT * FROM sys.schema_redundant_indexes;

Query Review

  • Enable slow query log and analyze periodically
  • Verify no queries apply functions to indexed columns
  • Verify no queries cause implicit type conversion
  • Use specific column names instead of SELECT *
  • Eliminate N+1 query patterns

Server Configuration Review

  • Confirm innodb_buffer_pool_size is 70-80% of total memory
  • Confirm innodb_buffer_pool_instances is appropriate (minimum 8, 1 instance per 1GB buffer pool)
  • Confirm innodb_log_file_size is sufficient (1-4GB recommended)
  • Check innodb_flush_log_at_trx_commit value (1: safe, 2: compromise)
  • Enable innodb_buffer_pool_dump_at_shutdown/load_at_startup

Monitoring

  • Maintain buffer pool hit rate above 99%
  • Monitor slow query occurrence frequency
  • Periodically check index usage rates
  • Verify automatic table statistics refresh
  • Monitor lock waits and deadlocks

Operational Notes

Cautions When Modifying Indexes on Large Tables

In MySQL 8.0, most ALTER TABLE ... ADD INDEX operations are processed online (Instant or In-place), but pre-testing is essential in production environments.

-- Check lock mode when adding an index
ALTER TABLE orders ADD INDEX idx_new (col1, col2), ALGORITHM=INPLACE, LOCK=NONE;

-- Use pt-online-schema-change (recommended for large tables)
-- Synchronizes the original table via triggers while making schema changes
pt-online-schema-change \
  --alter "ADD INDEX idx_new (col1, col2)" \
  --execute \
  D=mydb,t=orders

Note on Query Cache

Query Cache has been completely removed since MySQL 8.0. It should be replaced with application-level caching (Redis, Memcached, etc.).

Using Histogram Statistics

Histograms introduced in MySQL 8.0 provide column value distribution information to the optimizer, helping it choose better execution plans.

-- Create histogram
ANALYZE TABLE orders UPDATE HISTOGRAM ON status WITH 100 BUCKETS;
ANALYZE TABLE orders UPDATE HISTOGRAM ON total_amount WITH 254 BUCKETS;

-- Check histogram
SELECT
  SCHEMA_NAME,
  TABLE_NAME,
  COLUMN_NAME,
  JSON_EXTRACT(HISTOGRAM, '$.histogram-type') AS histogram_type,
  JSON_EXTRACT(HISTOGRAM, '$.number-of-buckets-specified') AS buckets
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS;

-- Drop histogram
ANALYZE TABLE orders DROP HISTOGRAM ON status;

References