MySQL Index: Understanding MySQL Indexing
What is Indexing?
A database index is a data structure that improves the speed of operations in a table. Indexes can be created using one or more columns, providing the basis for both rapid random lookups and efficient ordering of access to records.
- When creating an index, consider which columns your SQL queries will use, and create one or more indexes on those columns.
- In practice, an index is a separate structure, much like a small table, that stores the primary key or indexed field values together with a pointer to each record in the actual table.
- Users do not see indexes directly; the database engine uses them behind the scenes to locate records quickly.
- INSERT and UPDATE statements take longer on tables that have indexes, whereas SELECT statements become faster. The reason is that on every insert or update, the database must also insert or update the index entries.
Basically an index on a table works like an index in a book (that’s where the name came from):
Let’s say you have a book about databases and you want to find some information about, say, storage. Without an index (assuming no other aid, such as a table of contents) you’d have to go through the pages one by one until you found the topic (that’s a full table scan). On the other hand, an index has a list of keywords, so you’d consult the index and see that storage is mentioned on pages 113-120, 231, and 354. Then you could flip to those pages directly, without searching (that’s a search with an index, usually much faster).
Of course, how useful the index is depends on many things. A few examples, using the analogy above:
- If you had a book on databases and indexed the word “database”, you’d see that it’s mentioned on pages 1-59, 61-290, and 292-400. In such a case, the index is not much help and it might be faster to go through the pages one by one (in a database, this is “poor selectivity”).
- For a 10-page book, it makes no sense to build an index, as you may end up with a 10-page book prefixed by a 5-page index, which is just silly. Just scan the 10 pages and be done with it.
- The index also needs to be useful: there’s generally no point in indexing, for example, the frequency of the letter “L” per page.
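The “poor selectivity” point can be estimated before an index is created. Here is a rough sketch, assuming a hypothetical orders table with status and customer_id columns (neither appears elsewhere in this article). The ratio of distinct values to total rows hints at how much an index on that column could narrow a search:

```sql
-- Hypothetical table and columns, used only for illustration.
-- A ratio near 0 means few distinct values (like the word "database"
-- in a book about databases); a ratio near 1 means nearly unique values.
SELECT
    COUNT(DISTINCT status)      / COUNT(*) AS status_selectivity,
    COUNT(DISTINCT customer_id) / COUNT(*) AS customer_selectivity
FROM orders;
```

A column with a ratio close to 1 is usually a better index candidate than one with a ratio close to 0, although the optimizer also weighs other factors.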
How MySQL Uses Indexes
- To find the rows matching a WHERE clause quickly.
- To eliminate rows from consideration. If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows (the most selective index).
- If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).
- To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared as the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.
- For comparisons between nonbinary string columns, both columns should use the same character set. For example, comparing a utf8 column with a latin1 column precludes use of an index.
- Comparison of dissimilar columns (comparing a string column to a temporal or numeric column, for example) may prevent use of indexes if values cannot be compared directly without conversion. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as ‘1’, ‘ 1’, ‘00001’, or ‘01.e1’. This rules out use of any indexes for the string column.
- To find the MIN() or MAX() value for a specific indexed column key_col. This is optimized by a preprocessor that checks whether you are using WHERE key_part_N = constant on all key parts that occur before key_col in the index. In this case, MySQL does a single key lookup for each MIN() or MAX() expression and replaces it with a constant. If all expressions are replaced with constants, the query returns at once. For example:
SELECT MIN(key_part2),MAX(key_part2) FROM tbl_name WHERE key_part1=10;
- To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2). If all key parts are followed by DESC, the key is read in reverse order.
- In some cases, a query can be optimized to retrieve values without consulting the data rows. (An index that provides all the necessary results for a query is called a covering index.) If a query uses only columns of a table that are included in some index, the selected values can be retrieved from the index tree for greater speed.
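To make the leftmost-prefix and covering-index points concrete, here is a small sketch. It reuses the placeholder names tbl_name and key_part1..key_part3 from the text; the column types, the extra payload column, and the index name idx_parts are assumptions made for illustration.

```sql
-- Illustrative schema; column types and the index name are assumptions.
CREATE TABLE tbl_name (
    id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    key_part1 INT NOT NULL,
    key_part2 INT NOT NULL,
    key_part3 INT NOT NULL,
    payload   VARCHAR(100),
    INDEX idx_parts (key_part1, key_part2, key_part3)
);

-- Can use idx_parts: the conditions form a leftmost prefix of the index.
SELECT * FROM tbl_name WHERE key_part1 = 10;
SELECT * FROM tbl_name WHERE key_part1 = 10 AND key_part2 = 20;

-- Generally cannot use idx_parts for a lookup: key_part1, the leftmost
-- column, is missing from the WHERE clause.
SELECT * FROM tbl_name WHERE key_part2 = 20;

-- Covering index: every column the query touches is part of idx_parts,
-- so the rows themselves do not need to be read.
SELECT key_part3 FROM tbl_name WHERE key_part1 = 10;
```

Running EXPLAIN on the last query typically shows "Using index" in the Extra column, which is how MySQL reports that a covering index was used.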
Facts about Indexes
- Indexes are less important for queries on small tables, or big tables where report queries process most or all of the rows. When a query needs to access most of the rows, reading sequentially is faster than working through an index. Sequential reads minimize disk seeks, even if not all the rows are needed for the query.
- The best way to improve the performance of SELECT operations is to create indexes on one or more of the columns that are tested in the query.
- The index entries act like pointers to the table rows, allowing the query to quickly determine which rows match a condition in the WHERE clause, and retrieve the other column values for those rows.
- All MySQL data types can be indexed.
- Although it can be tempting to create an index for every possible column used in a query, unnecessary indexes waste space and cost MySQL time in deciding which indexes to use.
- Indexes also add to the cost of inserts, updates, and deletes because each index must be updated. You must find the right balance to achieve fast queries using the optimal set of indexes.
Verifying Index Usage
Always check whether all your queries really use the indexes that you have created in the tables. Use the EXPLAIN statement to verify index usage.
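A minimal sketch of such a check, assuming the Persons table and the PIndex index on LastName from the CREATE INDEX example later in this article. The search value 'Smith' is made up.

```sql
-- Assumes Persons has a LastName column with an index named PIndex on it.
EXPLAIN SELECT * FROM Persons WHERE LastName = 'Smith';

-- In the output, check these columns:
--   possible_keys : indexes MySQL considered (PIndex should be listed)
--   key           : the index actually chosen (NULL means none was used)
--   type          : access method; ALL means a full table scan
--   rows          : estimated number of rows MySQL expects to examine
```

If key is NULL for a query you expected to be indexed, review the WHERE clause against the rules in "How MySQL Uses Indexes".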
Finding Duplicate Indexes
Duplicate indexes will not necessarily slow down your select queries. However, they can slow down your insert and update queries and can cost you more disk space. In general, it’s better to avoid having duplicate keys.
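One possible way to spot exact duplicates is to query information_schema.STATISTICS, which lists one row per indexed column. The sketch below is only a starting point: it reports indexes in the current database that cover the same columns in the same order. It does not detect redundant prefixes, such as an index on (a) next to one on (a, b). The aliases tbl, idx, and cols are arbitrary names chosen for this example.

```sql
-- Report indexes on the same table that have an identical column list.
SELECT tbl,
       cols,
       GROUP_CONCAT(idx ORDER BY idx) AS duplicate_indexes
FROM (
    -- Build one row per index with its columns in index order.
    SELECT TABLE_NAME AS tbl,
           INDEX_NAME AS idx,
           GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS cols
    FROM information_schema.STATISTICS
    WHERE TABLE_SCHEMA = DATABASE()
    GROUP BY TABLE_NAME, INDEX_NAME
) AS per_index
GROUP BY tbl, cols
HAVING COUNT(*) > 1;
```

Review each reported pair before dropping anything, since a unique index and a non-unique index on the same columns serve different purposes.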
Conclusion
Indexes can be very important for optimal MySQL performance. Making sure that your indexes are in good shape is as important as adding them.
How to Create an Index
The CREATE INDEX statement is used to create indexes in tables.
SQL CREATE INDEX Syntax
Creates an index on a table. Duplicate values are allowed:
CREATE INDEX index_name
ON table_name (column_name);
SQL CREATE UNIQUE INDEX Syntax
Creates a unique index on a table. Duplicate values are not allowed:
CREATE UNIQUE INDEX index_name
ON table_name (column_name);
Note: The syntax for creating indexes varies among database systems, so check the syntax for your database.
CREATE INDEX Example
The SQL statement below creates an index named “PIndex” on the “LastName” column in the “Persons” table:
CREATE INDEX PIndex
ON Persons (LastName);
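If you want to create an index on a combination of columns, list the column names within the parentheses, separated by commas. Index names must be unique within a table, so this assumes the single-column PIndex above has not already been created:
CREATE INDEX PIndex
ON Persons (LastName, FirstName);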
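After creating an index, it can help to confirm what MySQL actually stored. A short sketch, assuming the Persons table and the PIndex index from the examples above:

```sql
-- Lists every index on Persons, one row per indexed column.
-- Useful output columns:
--   Key_name     : the index name (PIndex here)
--   Seq_in_index : position of the column within the index, starting at 1
--   Column_name  : the indexed column
--   Non_unique   : 0 for a unique index, 1 if duplicates are allowed
SHOW INDEX FROM Persons;
```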