MySQL User Guide Part III Advanced MySQL Usage

Back to MySQL User Guide

Contents

[edit] Part III. Advanced MySQL Usage

[edit] Creating Schemas and Tables

[edit] Designing a Schema

[edit] Indexing Data

[edit] Introduction to Indexing *Complete*

Note: Experienced users may wish to skip this section.

An index in a relational database serves much the same purpose as an index in a book: books are not (typically) organized alphabetically by subject, but are instead organized into logical chapters and parts. This works well until you need to find all references to a specific subject.

When you find yourself in such a situation you look at the index. In the index, the contents of the book are listed in alphabetical order, with a page number displayed for each subject.

Rather than go through the book page by page, you can look a subject up in the index, then move to the page number in question and scan the page for the subject you are interested in.

In a relational database, rows are not stored in a table in any particular order. An index creates a sorted version of a column (or set of columns) from a table that can be searched more efficiently than the table itself.

Each entry in the index stores either the row itself (in the case of InnoDB), or a pointer to the row's location in the data file (similar to a page number in a book index).

Effective indexing can greatly improve the performance of SELECT queries. When a specific column value is searched for, and an appropriate index is not available, the MySQL server must read every row of the table looking for matching rows. When an appropriate index is available, the MySQL server can look in the index for the appropriate values. If you are querying only columns that appear in the index, the MySQL server skips reading the table itself and return data directly from the index.

This chapter covers the following topics:

[edit] Types of Indexes *Complete*

There are five different index types available for use in MySQL:

Your choice of index usually depends on whether a given column should contain unique values: if a column contains unique values and those values are considered to uniquely identify a row (such as ISBN, UPC, or SSN values), choose a PRIMARY index. If your values are unique, but are not considered to uniquely identify the row, choose a UNIQUE index. If you wish to improve the performance of queries that match values in a column, use a regular INDEX. Only one PRIMARY index is allowed per table, all other index types can occur multiple times in a single table.

In special cases you may need to use FULLTEXT and SPATIAL indexes. If you are searching columns that contain natural language, you should use a FULLTEXT index. An example of natural language would be the titles and bodies of a collection of articles, as opposed to simple textual data such as person or place names, which should be indexed using a regular INDEX. The SPATIAL index is for use with GIS data only.

[edit] Choosing Columns to Index *Complete*

To use indexes effectively, it is necessary to identify those queries that are executing slowly and are not using indexes, or columns which contain unique values. Slow queries can be identified by directly reviewing the queries you use or by using the MySQL Slow Query Log.

[edit] Reviewing Queries for Index Usage *Complete*

To identify queries that make good candidates for indexing, look at the WHERE, GROUP BY, and ORDER BY clauses of your queries.

For example, look at the following query performed against the sakila sample database:

SELECT last_name, first_name FROM actor WHERE last_name = 'Walken'

In this case the WHERE clause contains a reference to the last_name column of the actor table.

Here is another example:

 SELECT film.title, COUNT(inventory.inventory_id) AS Stock FROM film, inventory WHERE film.film_id = inventory.film_id   AND inventory.store_id = GROUP BY film.film_id 

In this case the film_id and store_id columns are candidates for indexing.

Once you have identified candidate columns, you need to evaluate how often the column is involved in a query: the most often a column is referenced in your various queries, the stronger a candidate it becomes for indexing.

Each index you add to a table will have a negative effect on the performance of INSERT, UPDATE and DELETE queries because not only does the row data need to be changed, the index information must also be updated for each affected index. Over-indexing can lead to performance loss.

Once you have identified your candidate columns you can check whether they are already indexed by displaying the existing indexes on a table.

[edit] Identifying Slow Queries with the Slow Query Log *Complete*

If you need to identify slow queries on a production MySQL server you may benefit from using the MySQL Slow Query Log. When the MySQL server is started with the Slow Query Log enabled, it writes all queries that take longer than a configurable number of seconds to a log file. The queries in the Slow Query Log can be further examined and optimized. For more information see [userguide-logfiles.html#logfiles-slow-query-log Section 25.3, "The Slow Query Log"]

[edit] Displaying Table Indexes *Complete*

Before creating a new index, it is important to know what indexes currently exist for a given table. Index information can be retrieved using either the SHOW syntax or through use of the INFORMATION_SCHEMA. INFORMATION_SCHEMA is a standard method for accessing information, while SHOW is an extension of the SQL standard. Index information can also be browsed using the MySQL GUI tools.

[edit] Displaying Table Indexes Using the SHOW Command *Complete*

The SHOW command can be used within the active schema to display the index information for a table:

mysql> USE sakila;
Database changed

mysql> SHOW INDEX FROM film\G
 *************************** 1. row ***************************
        Table: film
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: film_id
    Collation: A
  Cardinality: 2
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
 *************************** 2. row ***************************
        Table: film
   Non_unique: 1
     Key_name: Title_Description_Fulltext
 Seq_in_index: 1
  Column_name: title
    Collation: NULL
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: FULLTEXT
      Comment:
 *************************** 3. row ***************************
        Table: film
   Non_unique: 1
     Key_name: Title_Description_Fulltext
 Seq_in_index: 2
  Column_name: description
    Collation: NULL
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null: YES
   Index_type: FULLTEXT
      Comment:
 3 rows in set (0.00 sec)
 

The output of the SHOW command shows that there are two indexes on the film table: PRIMARY and Title_Description_Fulltext (as seen from the Key_name value).

The columns being indexed are listed in the Column_name field. In this case there are indexes on the film_id, title, and description columns.

The title and description columns form two parts of the Title_Description_Fulltext index, with the title column appearing before the description column in the index, according to the Seq_in_index field.

You can determine whether or not an index enforces uniqueness by the Non_unique field: 0 indicates that the index enforces uniqueness, 1 indicates that the index does not enforce uniqueness.

[edit] Displaying Table Indexes Using the INFORMATION_SCHEMA *Complete*

As an alternative to the SHOW command, users can query the STATISTICS table of the INFORMATION_SCHEMA.

mysql> SELECT * FROM INFORMATION_SCHEMA.STATISTICS
    -> WHERE TABLE_SCHEMA = 'sakila'
    ->   AND table_name = 'film'\G
 *************************** 1. row ***************************
 TABLE_CATALOG: NULL
  TABLE_SCHEMA: sakila
    TABLE_NAME: film
    NON_UNIQUE: 0
  INDEX_SCHEMA: sakila
    INDEX_NAME: PRIMARY
  SEQ_IN_INDEX: 1
   COLUMN_NAME: film_id
     COLLATION: A
   CARDINALITY: 2
      SUB_PART: NULL
        PACKED: NULL
      NULLABLE:
    INDEX_TYPE: BTREE
       COMMENT:
 *************************** 2. row ***************************
 TABLE_CATALOG: NULL
  TABLE_SCHEMA: sakila
    TABLE_NAME: film
    NON_UNIQUE: 1
  INDEX_SCHEMA: sakila
    INDEX_NAME: Title_Description_Fulltext
  SEQ_IN_INDEX: 1
   COLUMN_NAME: title
     COLLATION: NULL
   CARDINALITY: NULL
      SUB_PART: NULL
        PACKED: NULL
      NULLABLE:
    INDEX_TYPE: FULLTEXT
       COMMENT:
 *************************** 3. row ***************************
 TABLE_CATALOG: NULL
  TABLE_SCHEMA: sakila
    TABLE_NAME: film
    NON_UNIQUE: 1
  INDEX_SCHEMA: sakila
    INDEX_NAME: Title_Description_Fulltext
  SEQ_IN_INDEX: 2
   COLUMN_NAME: description
     COLLATION: NULL
   CARDINALITY: NULL
      SUB_PART: NULL
        PACKED: NULL
      NULLABLE: YES
    INDEX_TYPE: FULLTEXT
       COMMENT:
 3 rows in set (0.00 sec)
   

The output of a query to the INFORMATION_SCHEMA closely matches that of the SHOW statement. For information on interpreting the output of a query on the INFORMATION_SCHEMA, please see [indexing.html#indexing-displaying-show Section 15.4.1, "Displaying Table Indexes Using the SHOW Command"].

One advantage of the INFORMATION_SCHEMA is that you can view the index information of more than one table at a time by modifying the TABLE_NAME portion of the WHERE clause of your query.

[edit] Displaying Index Information Using MySQL Administrator *Complete*

Users can also view index information using MySQL Administrator. To view index information, select the desired schema in the Catalogs screen, then select the Schema Indices tab:

Image:Ug-Indexing-displaying-administrator.png

The Catalog screen displays the index information for all tables in the selected schema. To view the columns that make up a given index, click the arrow icon to the left of the index name.

[edit] Creating Indexes *Complete*

Indexes can be created using either a CREATE TABLE, CREATE INDEX or ALTER TABLE syntax. Before creating an index, you must know which table the index will be added to, which column(s) the index will apply to, the type of index you will create, and whether the index will enforce uniqueness.

[edit] Creating Indexes with the CREATE TABLE Statement *Complete*

Indexes can be created during table creation by specifying the index information as part of the CREATE TABLE statement, specifying the index information either as part of a column information or at the end of the column definitions.

At minimum, the PRIMARY KEY and UNIQUE indexes should be specified at table creation to prevent duplicate key issues from occurring when the indexes added later (if you try to add such an index after your table is populated, you may have to remove duplicate rows manually before the indexes can be created).

For example, this is a simplified version of the CREATE TABLE statement for the inventory table:

 CREATE TABLE inventory (   inventory_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,   film_id INT UNSIGNED NOT NULL,   store_id INT UNSIGNED NOT NULL,   INDEX store_id_index (store_id) ) 

The primary key was created as part of the column creation line, while the index on the store_id column was created at the end of the CREATE TABLE statement.

It is necessary to specify index information at the end of the CREATE TABLE statement if the index contains more than one column. For example, in the film_actor table a single actor cannot appear more than once in the same film, so the primary key is a combination of the film_id and actor_id columns:

 CREATE TABLE film_actor (   actor_id INT UNSIGNED NOT NULL,   film_id INT UNSIGNED NOT NULL,   PRIMARY KEY  (film_id, actor_id) ) 

[edit] Using the CREATE INDEX Syntax *Complete*

To create a table with the CREATE INDEX syntax, you specify the type of index, a name for the index, the table to create the index on, and a list of the columns that form the index. CREATE INDEX is a non-standard alternative syntax to ALTER TABLE, described in the section that follows.

For example, to create an index on the last_name and first_name columns of the actor table, you would execute the following statement:

 CREATE INDEX last_first_name ON actor (last_name, first_name)

This next example creates an index that enforces uniqueness on the manager column of the store, ensuring no employee is so overworked that they have to manage two locations:

CREATE UNIQUE INDEX unique_manager ON store (manager)

The index types can be INDEX, UNIQUE INDEX, SPATIAL INDEX, and FULLTEXT INDEX.

[edit] Using ALTER TABLE to Create Indexes *Complete*

You can use the ALTER TABLE statement to create indexes on existing tables. The benefit of ALTER TABLE is that it can be used to create multiple indexes in a single statement, which can speed index creation when multiple indexes are needed.

For example, to create an index on last_name and first_name columns of the actor table, you would execute the following statement:

ALTER TABLE actor ADD INDEX last_first_name (last_name, first_name)

This example creates an index that enforces uniqueness on the manager column of the store:

ALTER TABLE actor ADD UNIQUE unique_manager (manager)

Here is an example of creating a primary key on the staff table:

ALTER TABLE staff ADD PRIMARY KEY (staff_id)

Primary keys can only be created with CREATE TABLE and ALTER TABLE statements, there is no CREATE INDEX equivalent for primary keys.

Multiple indexes can be created with a single ALTER TABLE statement by separating the ADD statements with commas:

  ALTER TABLE actor ADD UNIQUE unique_manager (manager), ADD PRIMARY KEY (staff_id)  

Other index types you can add include ADD SPATIAL and ADD FULLTEXT.

[edit] Indexing the Prefix of a Column *Complete*

It is possible to index only a prefix of a VARCHAR, CHAR or TEXT column by placing the size of the prefix (in characters) within brackets after the column name:

 CREATE INDEX lname ON actor (last_name(5))

Indexing the prefix of a column decreases the size of the index on disk compared to indexing the entire column, which in turn increases the performance of the index. An index can prefix up to 1000 bytes of a column in MyISAM, 767 bytes in InnoDB, and 255 bytes for all other storage engines.

One way to find the proper prefix size for a column is to perform the following SELECT query:

  SELECT COUNT(DISTINCT <code>column_name) AS distinct_rows,  COUNT(DISTINCT(LEFT(column_name, N))) AS prefix_distinct   FROM table_name </code>

Start with a N value of 3 and increase the size of N until the value of prefix_distinct nears that of distinct_rows.

[edit] Limitations of Index Prefixing *Complete*

When indexing a TEXT or a BLOB column you must specify a prefix size, so the use of prefixes is optional for CHAR, VARCHAR, BINARY, and VARBINARY columns only.

A prefixed index lname as described in [indexing.html#indexing-creating-prefix Section 15.5.4, "Indexing the Prefix of a Column"] decreases the size of the index file and performs reasonably well when doing lookups. For example:

 SELECT * FROM actor WHERE last_name = 'Depp';

However, prefixed indexes aren't very useful even for looking up rows if values have low cardinality in the prefixes. For example, if you prefix the first three characters of abcd, abce, abcf, and abcg, the prefix values are identical and do not distiguish rows.

Further, prefixed indexes are sometimes ignored when used for the following operations:

Additionally, if two tables are joined on columns that use prefix indexing, the index may be ignored and result in a full table scan. The optimizer is more often able to make use of full keys than prefix key values, so, in general it is safer to user a full key, especially if you are not sure exactly how your tables will be queried. If optimization is a major concern use an EXPLAIN statement to determine whether prefixed indexes are being used.

[edit] Creating and Using Composite Indexes *Complete*

When executing a SELECT query, the MySQL server typically uses only one index per table involved in the query. If the WHERE clause of the query references more than one column, a single-column index may be less than optimal. For example, say you were executing the following query:

 SELECT actor_id  FROM actor  WHERE last_name = 'Johnson'   AND first_name = 'Robert' 

If the table had an index on the last_name column, the index could be used to narrow the table down to all actors with the last name Johnson, but MySQL would still have to scan all the matched rows to find actors with the first name Robert.

By using a composite index, or an index on multiple columns, the preceding query could be fully optimized. Here is an example of a composite index on the actor table:

CREATE INDEX last_first_name ON actor (last_name, first_name)

With such a composite index, MySQL can first find the last name Johnson in the table, then search for the first name Robert in the matching index entries.

Composite indexes can also be partially used when the columns in the WHERE clause of a query appear in the left-most part of the composite index. For instance, the following query would make use of the composite index we have created:

 SELECT first_name  FROM actor  WHERE last_name = 'Johnson' 

However, the following query would not make use of our composite index:

 SELECT last_name  FROM actor  WHERE first_name = 'Robert' 

The second example does not make use of the composite index because the first_name column is not the left-most part of the index columns.

This rule applies no matter how many parts a composite index has; if you have an index on (columnA, columnB, columnC, columnD), the index will be used on queries that contain the following columns in the WHERE clause: (columnA), (columnA, columnB),(columnA, columnB, columnC), and (columnA, columnB, columnC, columnD). You would not be able to create any queries without columnA and expect the composite index to be used.

For a more detailed description on when an index will be used, see the section titled How MySQL Uses Indexes in the MySQL Reference Manual.

[edit] Dropping Indexes *Complete*

Existing indexes can be dropped using either a DROP INDEX or ALTER TABLE syntax:

DROP INDEX <code>index_name ON table_name</code>
ALTER TABLE <code>table_name DROP PRIMARY KEY</code>
ALTER TABLE <code>table_name DROP INDEX index_name</code>

You can drop multiple indexes in a single ALTER TABLE statement by separating them with commas:

 ALTER TABLE actor DROP PRIMARY KEY, DROP INDEX last_first_name

[edit] Using FULLTEXT Indexes *Complete*

While regular indexes are effective for many purposes, they are not effective for columns that contain natural language. An index can be used for single word CHAR and VARCHAR columns, but FULLTEXT indexes are designed for finding strings within larger natural language fields.

A FULLTEXT search takes a string and column list and searches the specified columns for the string, returning results ranked by relevancy.

The FULLTEXT index is available for the MyISAM storage engine only.

The syntax for creating a FULLTEXT index is listed in [indexing.html#indexing-creating Section 15.5, "Creating Indexes"]. Once the index is created, the MATCH ... AGAINST syntax can be used to perform FULLTEXT queries.

The MATCH clause indicates which columns are to be searched. The list of columns in the MATCH clause must be identical to the list of columns in the FULLTEXT index.

The AGAINST clause contains the string being searched for. The string in the AGAINST clause must be a constant string: you cannot use a user variable or search result in the AGAINST clause.

Here is an example of a basic FULLTEXT query that searches for movies in the film table that contain the word army in the title or description columns:

 SELECT title, description FROM film WHERE MATCH (title, description) AGAINST ('army')
 
 *************************** 1. row ***************************
            title: ARMY FLINTSTONES
      description: A Boring Saga of a Database Administrator And a Womanizer who
 must Battle a Waitress in Nigeria
 1 row in set (0.00 sec)
 

Results from a query with MATCH ... AGAINST in the WHERE clause will always return in descending order based on relevancy.

Here is the same query performed with a LIKE clause instead:

 SELECT title, description FROM film WHERE title LIKE '%army%' OR description LIKE '%army%'
 
 *************************** 1. row ***************************
            title: ARMY FLINTSTONES
      description: A Boring Saga of a Database Administrator And a Womanizer who
 must Battle a Waitress in Nigeria
 1 row in set (0.20 sec)
 

Note the performance improvement provided by the FULLTEXT index.

The MATCH ... AGAINST syntax can also provide relevancy ranking information:

   SELECT title, description, MATCH (title, description) AGAINST ('army') AS rank  FROM film  WHERE MATCH (title, description) AGAINST ('army')
 
 *************************** 1. row ***************************
       title: ARMY FLINTSTONES
 description: A Boring Saga of a Database Administrator And a Womanizer who
 must Battle a Waitress in Nigeria
        rank: 6.1943987015493
 1 row in set (0.00 sec)
 

The relevancy scores are based on the weighting of words within the individual rows. Words that occur rarely in the table are ranked higher than words that appear in a large percentage of the rows.

For more information on the FULLTEXT search engine, see the Fulltext Search section of the MySQL Reference Manual.

[edit] Using EXPLAIN to Optimize Indexing *Complete*

Sometimes it is not easy to identify which columns of a table to index, even when you have identified the slow queries in your application. The EXPLAIN statement is designed to assist in the query optimization process by providing insight into how the MySQL optimizer handles a specific query.

To analyze a query, precede the query with the EXPLAIN keyword:

   EXPLAIN SELECT film.title FROM actor, film, film_actor  WHERE actor.actor_id = film_actor.actor_id    AND film.film_id = film_actor.film_id    AND actor.last_name = 'Walken'\G  
 *************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: actor
          type: ALL
 possible_keys: PRIMARY
           key: NULL
       key_len: NULL
           ref: NULL
          rows: 200
         Extra: Using where
 *************************** 2. row ***************************
            id: 1
   select_type: SIMPLE
         table: film_actor
          type: ref
 possible_keys: PRIMARY
           key: PRIMARY
       key_len: 4
           ref: sakila.actor.actor_id
          rows: 26
         Extra: Using index
 *************************** 3. row ***************************
            id: 1
   select_type: SIMPLE
         table: film
          type: eq_ref
 possible_keys: PRIMARY
           key: PRIMARY
       key_len: 4
           ref: sakila.film_actor.film_id
          rows: 1
         Extra:
 3 rows in set (0.00 sec)
 

EXPLAIN returns a row of information for each table used in the SELECT statement. The tables are listed in the output in the order that MySQL would read them while processing the query.

The main things to look out for are rows where the key column is NULL, where the type column is range, index, or ALL, or where the Extra column contains Using filesort or Using temporary. Such queries should be closely examined for proper index usage as they generally indicate that no index is being used.

For additional information on using the EXPLAIN statement, see the EXPLAIN section of the MySQL Reference Manual.

[edit] MySQL Views

[edit] What is a Database View?

[edit] Triggers

[edit] What Are Triggers?

[edit] MySQL Stored Procedures

[edit] What is a Stored Procedure?

[edit] MySQL Storage Engines

[edit] Engine

[edit] Optimization

[edit] Using EXPLAIN

[edit] Optimizing SELECT Statements

[edit] Optimizing Indexes

Retrieved from "http://forge.mysql.com/wiki/MySQL_User_Guide_Part_III_Advanced_MySQL_Usage"

This page has been accessed 14,735 times. This page was last modified 21:51, 24 January 2011.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...