MySQL Range Optimization – Where Clause
In this post we will see how can we optimize where clause. Where clause restrict the result set to a range. Basically this is used to access the range of a table/view. The range access method uses a single index to retrieve a subset of table rows that are contained within one or several index value intervals. It can be used for a single-part or multiple-part index.
The Range Access Method for Single-Part Indexes
For a single-part index, index value intervals can be conveniently represented by corresponding conditions in the WHERE clause, so we speak of range conditions rather than “intervals.” The definition of a range condition for a single-part index is as follows:
- For both BTREE and HASH indexes, comparison of a key part with a constant value is a range condition when using the =, <=>, IN(), IS NULL, or IS NOT NULL operators.
- Additionally, for BTREE indexes, comparison of a key part with a constant value is a range condition when using the >, <, >=, <=, BETWEEN, !=, or <> operators, or LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character.
- For all types of indexes, multiple range conditions combined with OR or AND form a range condition.
**Constant value in the preceding descriptions means one of the following:
- A constant from the query string
- A column of a const or system table from the same join
- The result of an uncorrelated subquery
- Any expression composed entirely from subexpressions of the preceding types
Example:
SELECT * FROM t1 WHERE key_col > 1 AND key_col < 10;
SELECT * FROM t1 WHERE key_col = 1 OR key_col IN (15,18,20);
SELECT * FROM t1 WHERE key_col LIKE ‘ab%’ OR key_col BETWEEN ‘bar’ AND ‘foo’;
- MySQL tries to extract range conditions from the WHERE clause for each of the possible indexes.
- During the extraction process, conditions that cannot be used for constructing the range condition are dropped, conditions that produce overlapping ranges are combined, and conditions that produce empty ranges are removed.
- In general, the condition used for a range scan is less restrictive than the WHERE clause. MySQL performs an additional check to filter out rows that satisfy the range condition but not the full WHERE clause.
- The range condition extraction algorithm can handle nested AND/OR constructs of arbitrary depth, and its output does not depend on the order in which conditions appear in WHERE clause.
- MySQL does not support merging multiple ranges for the range access method for spatial indexes. To work around this limitation, you can use a UNION with identical SELECT statements, except that you put each spatial predicate in a different SELECT.
The Range Access Method for Multiple-Part Indexes
- Range conditions on a multiple-part index are an extension of range conditions for a single-part index.
- A range condition on a multiple-part index restricts index rows to lie within one or several key tuple intervals.
- Key tuple intervals are defined over a set of key tuples, using ordering from the index.
- For HASH indexes, each interval containing identical values can be used.
- For a BTREE index, an interval might be usable for conditions combined with AND, where each condition compares a key part with a constant value using =, <=>, IS NULL, >, <, >=, <=, !=, <>, BETWEEN, or LIKE ‘pattern’ (where ‘pattern’ does not start with a wildcard). An interval can be used as long as it is possible to determine a single key tuple containing all rows that match the condition (or two intervals if <> or != is used).
- The optimizer attempts to use additional key parts to determine the interval as long as the comparison operator is =, <=>, or IS NULL. If the operator is >, <, >=, <=, !=, <>, BETWEEN, or LIKE, the optimizer uses it but considers no more key parts. For the following expression, the optimizer uses = from the first comparison. It also uses >= from the second comparison but considers no further key parts and does not use the third comparison for interval construction.
- If conditions that cover sets of rows contained within intervals are combined with OR, they form a condition that covers a set of rows contained within the union of their intervals. If the conditions are combined with AND, they form a condition that covers a set of rows contained within the intersection of their intervals.