Buffering and Caching - MySQL Manual

MySQL Manual

Daftar Isi ☛

(Sebelumnya) 8.5. Optimizing for InnoDB Tables

8.12. Measuring Performance (B ... (Berikutnya)

8.9. Buffering and Caching

8.9.1. The InnoDB Buffer Pool
8.9.2. The MyISAM Key Cache
8.9.3. The MySQL Query Cache

MySQL uses several strategies that cache information in memory buffers to increase performance. Depending on your database architecture, you balance the size and layout of these areas, to provide the most performance benefit without wasting memory or exceeding available memory. When you set up or resize these memory areas, test the resulting performance using the techniques from Section 8.12, "Measuring Performance (Benchmarking)".

8.9.1. The InnoDB Buffer Pool

InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is an important aspect of MySQL tuning.

Guidelines

Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads. The buffer pool even caches data changed by insert and update operations, so that disk writes can be grouped together for better performance.

Depending on the typical workload on your system, you might adjust the proportions of the parts within the buffer pool. You can tune the way the buffer pool chooses which blocks to cache once it fills up, to keep frequently accessed data in memory despite sudden spikes of activity for operations such as backups or reporting.

With 64-bit systems with large memory sizes, you can split the buffer pool into multiple parts, to minimize contention for the memory structures among concurrent operations. For details, see Section 14.4.7.18, "Improvements to Performance from Multiple Buffer Pools".

Internal Details

InnoDB manages the pool as a list, using a variation of the least recently used (LRU) algorithm. When room is needed to add a new block to the pool, InnoDB evicts the least recently used block and adds the new block to the middle of the list. This "midpoint insertion strategy" treats the list as two sublists:

At the head, a sublist of "new" (or "young") blocks that were accessed recently.
At the tail, a sublist of "old" blocks that were accessed less recently.

This algorithm keeps blocks that are heavily used by queries in the new sublist. The old sublist contains less-used blocks; these blocks are candidates for eviction.

The LRU algorithm operates as follows by default:

3/8 of the buffer pool is devoted to the old sublist.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
When InnoDB reads a block into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A block can be read in because it is required for a user-specified operation such as an SQL query, or as part of a read-ahead operation performed automatically by InnoDB.
Accessing to a block in the old sublist makes it "young", moving it to the head of the buffer pool (the head of the new sublist). If the block was read in because it was required, the first access occurs immediately and the block is made young. If the block was read in due to read-ahead, the first access does not occur immediately (and might not occur at all before the block is evicted).
As the database operates, blocks in the buffer pool that are not accessed "age" by moving toward the tail of the list. Blocks in both the new and old sublists age as other blocks are made new. Blocks in the old sublist also age as blocks are inserted at the midpoint. Eventually, a block that remains unused for long enough reaches the tail of the old sublist and is evicted.

By default, blocks read by queries immediately move into the new sublist, meaning they will stay in the buffer pool for a long time. A table scan (such as performed for a mysqldump operation, or a SELECT statement with no WHERE clause) can bring a large amount of data into the buffer pool and evict an equivalent amount of older data, even if the new data is never used again. Similarly, blocks that are loaded by the read-ahead background thread and then accessed only once move to the head of the new list. These situations can push frequently used blocks to the old sublist, where they become subject to eviction.

Configuration Options

Several InnoDB system variables control the size of the buffer pool and let you tune the LRU algorithm:

innodb_buffer_pool_size
Specifies the size of the buffer pool. If your buffer pool is small and you have sufficient memory, making the pool larger can improve performance by reducing the amount of disk I/O needed as queries access InnoDB tables.
innodb_buffer_pool_instances
Divides the buffer pool into a user-specified number of separate regions, each with its own LRU list and related data structures, to reduce contention during concurrent memory read and write operations. This option takes effect only when you set the innodb_buffer_pool_size to a size of 1 gigabyte or more. The total size you specify is divided among all the buffer pools. For best efficiency, specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 gigabyte.
innodb_old_blocks_pct
Specifies the approximate percentage of the buffer pool that InnoDB uses for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool).
innodb_old_blocks_time
Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. The default value is 0: A block inserted into the old sublist moves immediately to the new sublist the first time it is accessed, no matter how soon after insertion the access occurs. If the value is greater than 0, blocks remain in the old sublist until an access occurs at least that many ms after the first access. For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.

Setting innodb_old_blocks_time greater than 0 prevents one-time table scans from flooding the new sublist with blocks used only for the scan. Rows in a block read in for a scan are accessed many times in rapid succession, but the block is unused after that. If innodb_old_blocks_time is set to a value greater than time to process the block, the block remains in the "old" sublist and ages to the tail of the list to be evicted quickly. This way, blocks used only for a one-time scan do not act to the detriment of heavily used blocks in the new sublist.

innodb_old_blocks_time can be set at runtime, so you can change it temporarily while performing operations such as table scans and dumps:

SET GLOBAL innodb_old_blocks_time = 1000;... perform queries that scan tables ...SET GLOBAL innodb_old_blocks_time = 0;

This strategy does not apply if your intent is to "warm up" the buffer pool by filling it with a table's content. For example, benchmark tests often perform a table or index scan at server startup, because that data would normally be in the buffer pool after a period of normal use. In this case, leave innodb_old_blocks_time set to 0, at least until the warmup phase is complete.

Monitoring the Buffer Pool

The output from the InnoDB Standard Monitor contains several fields in the BUFFER POOL AND MEMORY section that pertain to operation of the buffer pool LRU algorithm:

Old database pages: The number of pages in the old sublist of the buffer pool.
Pages made young, not young: The number of old pages that were moved to the head of the buffer pool (the new sublist), and the number of pages that have remained in the old sublist without being made new.
youngs/s non-youngs/s: The number of accesses to old pages that have resulted in making them young or not. This metric differs from that of the previous item in two ways. First, it relates only to old pages. Second, it is based on number of accesses to pages and not the number of pages. (There can be multiple accesses to a given page, all of which are counted.)
young-making rate: Hits that cause blocks to move to the head of the buffer pool.
not: Hits that do not cause blocks to move to the head of the buffer pool (due to the delay not being met).

The young-making rate and not rate will not normally add up to the overall buffer pool hit rate. Hits for blocks in the old sublist cause them to move to the new sublist, but hits to blocks in the new sublist cause them to move to the head of the list only if they are a certain distance from the head.

The preceding information from the Monitor can help you make LRU tuning decisions:

If you see very low youngs/s values when you do not have large scans going on, that indicates that you might need to either reduce the delay time, or increase the percentage of the buffer pool used for the old sublist. Increasing the percentage makes the old sublist larger, so blocks in that sublist take longer to move to the tail and be evicted. This increases the likelihood that they will be accessed again and be made young.
If you do not see a lot of non-youngs/s when you are doing large table scans (and lots of youngs/s), to tune your delay value to be larger.

For more information about InnoDB Monitors, see Section 14.3.14.2, "SHOW ENGINE INNODB STATUS and the InnoDB Monitors".

8.9.2. The MyISAM Key Cache

8.9.2.1. Shared Key Cache Access
8.9.2.2. Multiple Key Caches
8.9.2.3. Midpoint Insertion Strategy
8.9.2.4. Index Preloading
8.9.2.5. Key Cache Block Size
8.9.2.6. Restructuring a Key Cache

To minimize disk I/O, the MyISAM storage engine exploits a strategy that is used by many database management systems. It employs a cache mechanism to keep the most frequently accessed table blocks in memory:

For index blocks, a special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed.
For data blocks, MySQL uses no special cache. Instead it relies on the native operating system file system cache.

This section first describes the basic operation of the MyISAM key cache. Then it discusses features that improve key cache performance and that enable you to better control cache operation:

Multiple sessions can access the cache concurrently.
You can set up multiple key caches and assign table indexes to specific caches.

To control the size of the key cache, use the key_buffer_size system variable. If this variable is set equal to zero, no key cache is used. The key cache also is not used if the key_buffer_size value is too small to allocate the minimal number of block buffers (8).

When the key cache is not operational, index files are accessed using only the native file system buffering provided by the operating system. (In other words, table index blocks are accessed using the same strategy as that employed for table data blocks.)

An index block is a contiguous unit of access to the MyISAM index files. Usually the size of an index block is equal to the size of nodes of the index B-tree. (Indexes are represented on disk using a B-tree data structure. Nodes at the bottom of the tree are leaf nodes. Nodes above the leaf nodes are nonleaf nodes.)

All block buffers in a key cache structure are the same size. This size can be equal to, greater than, or less than the size of a table index block. Usually one these two values is a multiple of the other.

When data from any table index block must be accessed, the server first checks whether it is available in some block buffer of the key cache. If it is, the server accesses data in the key cache rather than on disk. That is, it reads from the cache or writes into it rather than reading from or writing to disk. Otherwise, the server chooses a cache block buffer containing a different table index block (or blocks) and replaces the data there by a copy of required table index block. As soon as the new index block is in the cache, the index data can be accessed.

If it happens that a block selected for replacement has been modified, the block is considered "dirty." In this case, prior to being replaced, its contents are flushed to the table index from which it came.

Usually the server follows an LRU (Least Recently Used) strategy: When choosing a block for replacement, it selects the least recently used index block. To make this choice easier, the key cache module maintains all used blocks in a special list (LRU chain) ordered by time of use. When a block is accessed, it is the most recently used and is placed at the end of the list. When blocks need to be replaced, blocks at the beginning of the list are the least recently used and become the first candidates for eviction.

The InnoDB storage engine also uses an LRU algorithm, to manage its buffer pool. See Section 8.9.1, "The InnoDB Buffer Pool".

8.9.2.1. Shared Key Cache Access

Threads can access key cache buffers simultaneously, subject to the following conditions:

A buffer that is not being updated can be accessed by multiple sessions.
A buffer that is being updated causes sessions that need to use it to wait until the update is complete.
Multiple sessions can initiate requests that result in cache block replacements, as long as they do not interfere with each other (that is, as long as they need different index blocks, and thus cause different cache blocks to be replaced).

Shared access to the key cache enables the server to improve throughput significantly.

8.9.2.2. Multiple Key Caches

Shared access to the key cache improves performance but does not eliminate contention among sessions entirely. They still compete for control structures that manage access to the key cache buffers. To reduce key cache access contention further, MySQL also provides multiple key caches. This feature enables you to assign different table indexes to different key caches.

Where there are multiple key caches, the server must know which cache to use when processing queries for a given MyISAM table. By default, all MyISAM table indexes are cached in the default key cache. To assign table indexes to a specific key cache, use the CACHE INDEX statement (see Section 13.7.6.2, "CACHE INDEX Syntax"). For example, the following statement assigns indexes from the tables t1, t2, and t3 to the key cache named hot_cache:

mysql> CACHE INDEX t1, t2, t3 IN hot_cache;+---------+--------------------+----------+----------+| Table   | Op | Msg_type | Msg_text |+---------+--------------------+----------+----------+| test.t1 | assign_to_keycache | status   | OK   || test.t2 | assign_to_keycache | status   | OK   || test.t3 | assign_to_keycache | status   | OK   |+---------+--------------------+----------+----------+

The key cache referred to in a CACHE INDEX statement can be created by setting its size with a SET GLOBAL parameter setting statement or by using server startup options. For example:

mysql> SET GLOBAL keycache1.key_buffer_size=128*1024;

To destroy a key cache, set its size to zero:

mysql> SET GLOBAL keycache1.key_buffer_size=0;

Note that you cannot destroy the default key cache. Any attempt to do this will be ignored:

mysql> SET GLOBAL key_buffer_size = 0;mysql> SHOW VARIABLES LIKE 'key_buffer_size';+-----------------+---------+| Variable_name   | Value   |+-----------------+---------+| key_buffer_size | 8384512 |+-----------------+---------+

Key cache variables are structured system variables that have a name and components. For keycache1.key_buffer_size, keycache1 is the cache variable name and key_buffer_size is the cache component. See Section 5.1.5.1, "Structured System Variables", for a description of the syntax used for referring to structured key cache system variables.

By default, table indexes are assigned to the main (default) key cache created at the server startup. When a key cache is destroyed, all indexes assigned to it are reassigned to the default key cache.

For a busy server, you can use a strategy that involves three key caches:

A "hot" key cache that takes up 20% of the space allocated for all key caches. Use this for tables that are heavily used for searches but that are not updated.
A "cold" key cache that takes up 20% of the space allocated for all key caches. Use this cache for medium-sized, intensively modified tables, such as temporary tables.
A "warm" key cache that takes up 60% of the key cache space. Employ this as the default key cache, to be used by default for all other tables.

One reason the use of three key caches is beneficial is that access to one key cache structure does not block access to the others. Statements that access tables assigned to one cache do not compete with statements that access tables assigned to another cache. Performance gains occur for other reasons as well:

The hot cache is used only for retrieval queries, so its contents are never modified. Consequently, whenever an index block needs to be pulled in from disk, the contents of the cache block chosen for replacement need not be flushed first.
For an index assigned to the hot cache, if there are no queries requiring an index scan, there is a high probability that the index blocks corresponding to nonleaf nodes of the index B-tree remain in the cache.
An update operation most frequently executed for temporary tables is performed much faster when the updated node is in the cache and need not be read in from disk first. If the size of the indexes of the temporary tables are comparable with the size of cold key cache, the probability is very high that the updated node is in the cache.

The CACHE INDEX statement sets up an association between a table and a key cache, but the association is lost each time the server restarts. If you want the association to take effect each time the server starts, one way to accomplish this is to use an option file: Include variable settings that configure your key caches, and an init-file option that names a file containing CACHE INDEX statements to be executed. For example:

key_buffer_size = 4Ghot_cache.key_buffer_size = 2Gcold_cache.key_buffer_size = 2Ginit_file=/path/to/data-directory/mysqld_init.sql

The statements in mysqld_init.sql are executed each time the server starts. The file should contain one SQL statement per line. The following example assigns several tables each to hot_cache and cold_cache:

CACHE INDEX db1.t1, db1.t2, db2.t3 IN hot_cacheCACHE INDEX db1.t4, db2.t5, db2.t6 IN cold_cache

8.9.2.3. Midpoint Insertion Strategy

By default, the key cache management system uses a simple LRU strategy for choosing key cache blocks to be evicted, but it also supports a more sophisticated method called the midpoint insertion strategy.

When using the midpoint insertion strategy, the LRU chain is divided into two parts: a hot sublist and a warm sublist. The division point between two parts is not fixed, but the key cache management system takes care that the warm part is not "too short," always containing at least key_cache_division_limit percent of the key cache blocks. key_cache_division_limit is a component of structured key cache variables, so its value is a parameter that can be set per cache.

When an index block is read from a table into the key cache, it is placed at the end of the warm sublist. After a certain number of hits (accesses of the block), it is promoted to the hot sublist. At present, the number of hits required to promote a block (3) is the same for all index blocks.

A block promoted into the hot sublist is placed at the end of the list. The block then circulates within this sublist. If the block stays at the beginning of the sublist for a long enough time, it is demoted to the warm sublist. This time is determined by the value of the key_cache_age_threshold component of the key cache.

The threshold value prescribes that, for a key cache containing N blocks, the block at the beginning of the hot sublist not accessed within the last N * key_cache_age_threshold / 100 hits is to be moved to the beginning of the warm sublist. It then becomes the first candidate for eviction, because blocks for replacement always are taken from the beginning of the warm sublist.

The midpoint insertion strategy enables you to keep more-valued blocks always in the cache. If you prefer to use the plain LRU strategy, leave the key_cache_division_limit value set to its default of 100.

The midpoint insertion strategy helps to improve performance when execution of a query that requires an index scan effectively pushes out of the cache all the index blocks corresponding to valuable high-level B-tree nodes. To avoid this, you must use a midpoint insertion strategy with the key_cache_division_limit set to much less than 100. Then valuable frequently hit nodes are preserved in the hot sublist during an index scan operation as well.

8.9.2.4. Index Preloading

If there are enough blocks in a key cache to hold blocks of an entire index, or at least the blocks corresponding to its nonleaf nodes, it makes sense to preload the key cache with index blocks before starting to use it. Preloading enables you to put the table index blocks into a key cache buffer in the most efficient way: by reading the index blocks from disk sequentially.

Without preloading, the blocks are still placed into the key cache as needed by queries. Although the blocks will stay in the cache, because there are enough buffers for all of them, they are fetched from disk in random order, and not sequentially.

To preload an index into a cache, use the LOAD INDEX INTO CACHE statement. For example, the following statement preloads nodes (index blocks) of indexes of the tables t1 and t2:

mysql> LOAD INDEX INTO CACHE t1, t2 IGNORE LEAVES;+---------+--------------+----------+----------+| Table   | Op   | Msg_type | Msg_text |+---------+--------------+----------+----------+| test.t1 | preload_keys | status   | OK   || test.t2 | preload_keys | status   | OK   |+---------+--------------+----------+----------+

The IGNORE LEAVES modifier causes only blocks for the nonleaf nodes of the index to be preloaded. Thus, the statement shown preloads all index blocks from t1, but only blocks for the nonleaf nodes from t2.

If an index has been assigned to a key cache using a CACHE INDEX statement, preloading places index blocks into that cache. Otherwise, the index is loaded into the default key cache.

8.9.2.5. Key Cache Block Size

It is possible to specify the size of the block buffers for an individual key cache using the key_cache_block_size variable. This permits tuning of the performance of I/O operations for index files.

The best performance for I/O operations is achieved when the size of read buffers is equal to the size of the native operating system I/O buffers. But setting the size of key nodes equal to the size of the I/O buffer does not always ensure the best overall performance. When reading the big leaf nodes, the server pulls in a lot of unnecessary data, effectively preventing reading other leaf nodes.

To control the size of blocks in the .MYI index file of MyISAM tables, use the --myisam-block-size option at server startup.

8.9.2.6. Restructuring a Key Cache

A key cache can be restructured at any time by updating its parameter values. For example:

mysql> SET GLOBAL cold_cache.key_buffer_size=4*1024*1024;

If you assign to either the key_buffer_size or key_cache_block_size key cache component a value that differs from the component's current value, the server destroys the cache's old structure and creates a new one based on the new values. If the cache contains any dirty blocks, the server saves them to disk before destroying and re-creating the cache. Restructuring does not occur if you change other key cache parameters.

When restructuring a key cache, the server first flushes the contents of any dirty buffers to disk. After that, the cache contents become unavailable. However, restructuring does not block queries that need to use indexes assigned to the cache. Instead, the server directly accesses the table indexes using native file system caching. File system caching is not as efficient as using a key cache, so although queries execute, a slowdown can be anticipated. After the cache has been restructured, it becomes available again for caching indexes assigned to it, and the use of file system caching for the indexes ceases.

8.9.3. The MySQL Query Cache

8.9.3.1. How the Query Cache Operates
8.9.3.2. Query Cache SELECT Options
8.9.3.3. Query Cache Configuration
8.9.3.4. Query Cache Status and Maintenance

The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.

The query cache can be useful in an environment where you have tables that do not change very often and for which the server receives many identical queries. This is a typical situation for many Web servers that generate many dynamic pages based on database content. For example, when an order form queries a table to display the lists of all US states or all countries in the world, those values can be retrieved from the query cache. Although the values would probably be retrieved from memory in any case (from the InnoDB buffer pool or MyISAM key cache), using the query cache avoids the overhead of processing the query, deciding whether to use a table scan, and locating the data block for each row.

The query cache always contains current and reliable data. Any insert, update, delete, or other modification to a table causes any relevant entries in the query cache to be flushed.

Note

The query cache does not work in an environment where you have multiple mysqld servers updating the same MyISAM tables.

The query cache is used for prepared statements under the conditions described in Section 8.9.3.1, "How the Query Cache Operates".

Note

As of MySQL 5.5.23, the query cache is not supported for partitioned tables, and is automatically disabled for queries involving partitioned tables. The query cache cannot be enabled for such queries. (Bug #53775)

Some performance data for the query cache follows. These results were generated by running the MySQL benchmark suite on a Linux Alpha 2�500MHz system with 2GB RAM and a 64MB query cache.

If all the queries you are performing are simple (such as selecting a row from a table with one row), but still differ so that the queries cannot be cached, the overhead for having the query cache active is 13%. This could be regarded as the worst case scenario. In real life, queries tend to be much more complicated, so the overhead normally is significantly lower.
Searches for a single row in a single-row table are 238% faster with the query cache than without it. This can be regarded as close to the minimum speedup to be expected for a query that is cached.

To disable the query cache at server startup, set the query_cache_size system variable to 0. By disabling the query cache code, there is no noticeable overhead.

The query cache offers the potential for substantial performance improvement, but do not assume that it will do so under all circumstances. With some query cache configurations or server workloads, you might actually see a performance decrease:

Be cautious about sizing the query cache excessively large, which increases the overhead required to maintain the cache, possibly beyond the benefit of enabling it. Sizes in tens of megabytes are usually beneficial. Sizes in the hundreds of megabytes might not be.
Server workload has a significant effect on query cache efficiency. A query mix consisting almost entirely of a fixed set of SELECT statements is much more likely to benefit from enabling the cache than a mix in which frequent INSERT statements cause continual invalidation of results in the cache. In some cases, a workaround is to use the SQL_NO_CACHE option to prevent results from even entering the cache for SELECT statements that use frequently modified tables. (See Section 8.9.3.2, "Query Cache SELECT Options".)

To verify that enabling the query cache is beneficial, test the operation of your MySQL server with the cache enabled and disabled. Then retest periodically because query cache efficiency may change as server workload changes.

8.9.3.1. How the Query Cache Operates

This section describes how the query cache works when it is operational. Section 8.9.3.3, "Query Cache Configuration", describes how to control whether it is operational.

Incoming queries are compared to those in the query cache before parsing, so the following two queries are regarded as different by the query cache:

SELECT * FROM tbl_nameSelect * from tbl_name

Queries must be exactly the same (byte for byte) to be seen as identical. In addition, query strings that are identical may be treated as different for other reasons. Queries that use different databases, different protocol versions, or different default character sets are considered different queries and are cached separately.

The cache is not used for queries of the following types:

Queries that are a subquery of an outer query
Queries executed within the body of a stored function, trigger, or event

Before a query result is fetched from the query cache, MySQL checks whether the user has SELECT privilege for all databases and tables involved. If this is not the case, the cached result is not used.

If a query result is returned from query cache, the server increments the Qcache_hits status variable, not Com_select. See Section 8.9.3.4, "Query Cache Status and Maintenance".

If a table changes, all cached queries that use the table become invalid and are removed from the cache. This includes queries that use MERGE tables that map to the changed table. A table can be changed by many types of statements, such as INSERT, UPDATE, DELETE, TRUNCATE TABLE, ALTER TABLE, DROP TABLE, or DROP DATABASE.

The query cache also works within transactions when using InnoDB tables.

In MySQL 5.5, the result from a SELECT query on a view is cached.

The query cache works for SELECT SQL_CALC_FOUND_ROWS ... queries and stores a value that is returned by a following SELECT FOUND_ROWS() query. FOUND_ROWS() returns the correct value even if the preceding query was fetched from the cache because the number of found rows is also stored in the cache. The SELECT FOUND_ROWS() query itself cannot be cached.

Prepared statements that are issued using the binary protocol using mysql_stmt_prepare() and mysql_stmt_execute() (see Section 22.8.4, "C API Prepared Statements"), are subject to limitations on caching. Comparison with statements in the query cache is based on the text of the statement after expansion of ? parameter markers. The statement is compared only with other cached statements that were executed using the binary protocol. That is, for query cache purposes, prepared statements issued using the binary protocol are distinct from prepared statements issued using the text protocol (see Section 13.5, "SQL Syntax for Prepared Statements").

A query cannot be cached if it contains any of the functions shown in the following table.

A query also is not cached under these conditions:

It refers to user-defined functions (UDFs) or stored functions.
It refers to user variables or local stored program variables.
It refers to tables in the mysql, INFORMATION_SCHEMA, or performance_schema database.
(MySQL 5.5.23 and later:) It refers to any partitioned tables.
It is of any of the following forms:
```
SELECT ... LOCK IN SHARE MODESELECT ... FOR UPDATESELECT ... INTO OUTFILE ...SELECT ... INTO DUMPFILE ...SELECT * FROM ... WHERE autoincrement_col IS NULL
```
The last form is not cached because it is used as the ODBC workaround for obtaining the last insert ID value. See the Connector/ODBC section of Chapter 22, Connectors and APIs.
Statements within transactions that use SERIALIZABLE isolation level also cannot be cached because they use LOCK IN SHARE MODE locking.
It uses TEMPORARY tables.
It does not use any tables.
It generates warnings.
The user has a column-level privilege for any of the involved tables.

8.9.3.2. Query Cache SELECT Options

Two query cache-related options may be specified in SELECT statements:

SQL_CACHE
The query result is cached if it is cacheable and the value of the query_cache_type system variable is ON or DEMAND.
SQL_NO_CACHE
The query result is not cached.

Examples:

SELECT SQL_CACHE id, name FROM customer;SELECT SQL_NO_CACHE id, name FROM customer;

8.9.3.3. Query Cache Configuration

The have_query_cache server system variable indicates whether the query cache is available:

mysql> SHOW VARIABLES LIKE 'have_query_cache';+------------------+-------+| Variable_name | Value |+------------------+-------+| have_query_cache | YES   |+------------------+-------+

When using a standard MySQL binary, this value is always YES, even if query caching is disabled.

Several other system variables control query cache operation. These can be set in an option file or on the command line when starting mysqld. The query cache system variables all have names that begin with query_cache_. They are described briefly in Section 5.1.4, "Server System Variables", with additional configuration information given here.

To set the size of the query cache, set the query_cache_size system variable. Setting it to 0 disables the query cache. The default size is 0, so the query cache is disabled by default. To reduce overhead significantly, also start the server with query_cache_type=0 if you will not be using the query cache.

Note

When using the Windows Configuration Wizard to install or configure MySQL, the default value for query_cache_size will be configured automatically for you based on the different configuration types available. When using the Windows Configuration Wizard, the query cache may be enabled (that is, set to a nonzero value) due to the selected configuration. The query cache is also controlled by the setting of the query_cache_type variable. Check the values of these variables as set in your my.ini file after configuration has taken place.

When you set query_cache_size to a nonzero value, keep in mind that the query cache needs a minimum size of about 40KB to allocate its structures. (The exact size depends on system architecture.) If you set the value too small, you'll get a warning, as in this example:

mysql> SET GLOBAL query_cache_size = 40000;Query OK, 0 rows affected, 1 warning (0.00 sec)mysql> SHOW WARNINGS\G*************************** 1. row ***************************  Level: Warning   Code: 1282Message: Query cache failed to set size 39936; new query cache size is 0mysql> SET GLOBAL query_cache_size = 41984;Query OK, 0 rows affected (0.00 sec)mysql> SHOW VARIABLES LIKE 'query_cache_size';+------------------+-------+| Variable_name | Value |+------------------+-------+| query_cache_size | 41984 |+------------------+-------+

For the query cache to actually be able to hold any query results, its size must be set larger:

mysql> SET GLOBAL query_cache_size = 1000000;Query OK, 0 rows affected (0.04 sec)mysql> SHOW VARIABLES LIKE 'query_cache_size';+------------------+--------+| Variable_name | Value  |+------------------+--------+| query_cache_size | 999424 |+------------------+--------+1 row in set (0.00 sec)

The query_cache_size value is aligned to the nearest 1024 byte block. The value reported may therefore be different from the value that you assign.

If the query cache size is greater than 0, the query_cache_type variable influences how it works. This variable can be set to the following values:

A value of 0 or OFF prevents caching or retrieval of cached results.
A value of 1 or ON enables caching except of those statements that begin with SELECT SQL_NO_CACHE.
A value of 2 or DEMAND causes caching of only those statements that begin with SELECT SQL_CACHE.

If query_cache_size is 0, you should also set query_cache_type variable to 0. In this case, the server does not acquire the query cache mutex at all, which means that the query cache cannot be enabled at runtime and there is reduced overhead in query execution.

Setting the GLOBAL query_cache_type value determines query cache behavior for all clients that connect after the change is made. Individual clients can control cache behavior for their own connection by setting the SESSION query_cache_type value. For example, a client can disable use of the query cache for its own queries like this:

mysql> SET SESSION query_cache_type = OFF;

If you set query_cache_type at server startup (rather than at runtime with a SET statement), only the numeric values are permitted.

To control the maximum size of individual query results that can be cached, set the query_cache_limit system variable. The default value is 1MB.

Be careful not to set the size of the cache too large. Due to the need for threads to lock the cache during updates, you may see lock contention issues with a very large cache.

Note

You can set the maximum size that can be specified for the query cache at run time with the SET statement by using the --maximum-query_cache_size=32M option on the command line or in the configuration file.

When a query is to be cached, its result (the data sent to the client) is stored in the query cache during result retrieval. Therefore the data usually is not handled in one big chunk. The query cache allocates blocks for storing this data on demand, so when one block is filled, a new block is allocated. Because memory allocation operation is costly (timewise), the query cache allocates blocks with a minimum size given by the query_cache_min_res_unit system variable. When a query is executed, the last result block is trimmed to the actual data size so that unused memory is freed. Depending on the types of queries your server executes, you might find it helpful to tune the value of query_cache_min_res_unit:

The default value of query_cache_min_res_unit is 4KB. This should be adequate for most cases.
If you have a lot of queries with small results, the default block size may lead to memory fragmentation, as indicated by a large number of free blocks. Fragmentation can force the query cache to prune (delete) queries from the cache due to lack of memory. In this case, decrease the value of query_cache_min_res_unit. The number of free blocks and queries removed due to pruning are given by the values of the Qcache_free_blocks and Qcache_lowmem_prunes status variables.
If most of your queries have large results (check the Qcache_total_blocks and Qcache_queries_in_cache status variables), you can increase performance by increasing query_cache_min_res_unit. However, be careful to not make it too large (see the previous item).

8.9.3.4. Query Cache Status and Maintenance

To check whether the query cache is present in your MySQL server, use the following statement:

mysql> SHOW VARIABLES LIKE 'have_query_cache';+------------------+-------+| Variable_name | Value |+------------------+-------+| have_query_cache | YES   |+------------------+-------+

You can defragment the query cache to better utilize its memory with the FLUSH QUERY CACHE statement. The statement does not remove any queries from the cache.

The RESET QUERY CACHE statement removes all query results from the query cache. The FLUSH TABLES statement also does this.

To monitor query cache performance, use SHOW STATUS to view the cache status variables:

mysql> SHOW STATUS LIKE 'Qcache%';+-------------------------+--------+| Variable_name   | Value  |+-------------------------+--------+| Qcache_free_blocks  | 36 || Qcache_free_memory  | 138488 || Qcache_hits | 79570  || Qcache_inserts  | 27087  || Qcache_lowmem_prunes | 3114   || Qcache_not_cached   | 22989  || Qcache_queries_in_cache | 415 || Qcache_total_blocks | 912 |+-------------------------+--------+

Descriptions of each of these variables are given in Section 5.1.6, "Server Status Variables". Some uses for them are described here.

The total number of SELECT queries is given by this formula:

  Com_select+ Qcache_hits+ queries with errors found by parser

The Com_select value is given by this formula:

  Qcache_inserts+ Qcache_not_cached+ queries with errors found during the column-privileges check

The query cache uses variable-length blocks, so Qcache_total_blocks and Qcache_free_blocks may indicate query cache memory fragmentation. After FLUSH QUERY CACHE, only a single free block remains.

Every cached query requires a minimum of two blocks (one for the query text and one or more for the query results). Also, every table that is used by a query requires one block. However, if two or more queries use the same table, only one table block needs to be allocated.

The information provided by the Qcache_lowmem_prunes status variable can help you tune the query cache size. It counts the number of queries that have been removed from the cache to free up memory for caching new queries. The query cache uses a least recently used (LRU) strategy to decide which queries to remove from the cache. Tuning information is given in Section 8.9.3.3, "Query Cache Configuration".

8.10. Optimizing Locking Operations

8.10.1. Internal Locking Methods
8.10.2. Table Locking Issues
8.10.3. Concurrent Inserts
8.10.4. Metadata Locking Within Transactions
8.10.5. External Locking

When your database is busy with multiple sessions reading and writing data, the mechanism that controls access to data files and memory areas can become a consideration for performance tuning. Otherwise, sessions can spend time waiting for access to resources when they could be running concurrently.

MySQL manages contention for table contents using locking:

Internal locking is performed within the MySQL server itself to manage contention for table contents by multiple threads. This type of locking is internal because it is performed entirely by the server and involves no other programs. See Section 8.10.1, "Internal Locking Methods".
External locking occurs when the server and other programs lock MyISAM table files to coordinate among themselves which program can access the tables at which time. See Section 8.10.5, "External Locking".

8.10.1. Internal Locking Methods

This section discusses internal locking; that is, locking performed within the MySQL server to manage contention for table contents by multiple sessions.

MySQL uses row-level locking for InnoDB tables to support simultaneous write access by multiple sessions, making them suitable for multi-user, highly concurrent, and OLTP applications. MySQL uses table-level locking for MyISAM, MEMORY, and MERGE tables, allowing only one session to update those tables at a time, making them more suitable for read-only, read-mostly, or single-user applications.

Considerations for Row Locking

Advantages of row-level locking:

Fewer lock conflicts when different sessions access different rows.
Fewer changes for rollbacks.
Possible to lock a single row for a long time.

Disadvantages of row-level locking:

Requires more memory than table-level locks.
Slower than table-level locks when used on a large part of the table because you must acquire many more locks.
Slower than other locks if you often do GROUP BY operations on a large part of the data or if you must scan the entire table frequently.

Considerations for Table Locking

Table locking in MySQL is deadlock-free for storage engines that use table-level locking. Deadlock avoidance is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.

MySQL grants table write locks as follows:

If there are no locks on the table, put a write lock on it.
Otherwise, put the lock request in the write lock queue.

MySQL grants table read locks as follows:

If there are no write locks on the table, put a read lock on it.
Otherwise, put the lock request in the read lock queue.

Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not "starved" even if there is heavy SELECT activity for the table. However, if you have many updates for a table, SELECT statements wait until there are no more updates.

For information on altering the priority of reads and writes, see Section 8.10.2, "Table Locking Issues".

You can analyze the table lock contention on your system by checking the Table_locks_immediate and Table_locks_waited status variables, which indicate the number of times that requests for table locks could be granted immediately and the number that had to wait, respectively:

mysql> SHOW STATUS LIKE 'Table%';+-----------------------+---------+| Variable_name | Value   |+-----------------------+---------+| Table_locks_immediate | 1151552 || Table_locks_waited | 15324   |+-----------------------+---------+

The MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no free blocks in the middle of the data file, rows are always inserted at the end of the data file. In this case, you can freely mix concurrent INSERT and SELECT statements for a MyISAM table without locks. That is, you can insert rows into a MyISAM table at the same time other clients are reading from it. Holes can result from rows having been deleted from or updated in the middle of the table. If there are holes, concurrent inserts are disabled but are enabled again automatically when all holes have been filled with new data.. This behavior is altered by the concurrent_insert system variable. See Section 8.10.3, "Concurrent Inserts".

If you acquire a table lock explicitly with LOCK TABLES, you can request a READ LOCAL lock rather than a READ lock to enable other sessions to perform concurrent inserts while you have the table locked.

To perform many INSERT and SELECT operations on a table real_table when concurrent inserts are not possible, you can insert rows into a temporary table temp_table and update the real table with the rows from the temporary table periodically. This can be done with the following code:

mysql> LOCK TABLES real_table WRITE, temp_table WRITE;mysql> INSERT INTO real_table SELECT * FROM temp_table;mysql> DELETE FROM temp_table;mysql> UNLOCK TABLES;

InnoDB uses row locks. Deadlocks are possible for InnoDB because it automatically acquires locks during the processing of SQL statements, not at the start of the transaction.

Choosing the Type of Locking

Generally, table locks are superior to row-level locks in the following cases:

Most statements for the table are reads.
Statements for the table are a mix of reads and writes, where writes are updates or deletes for a single row that can be fetched with one key read:
```
UPDATE tbl_name SET column=value WHERE unique_key_col=key_value;DELETE FROM tbl_name WHERE unique_key_col=key_value;
```
SELECT combined with concurrent INSERT statements, and very few UPDATE or DELETE statements.
Many scans or GROUP BY operations on the entire table without any writers.

With higher-level locks, you can more easily tune applications by supporting locks of different types, because the lock overhead is less than for row-level locks.

Options other than row-level locking:

Versioning (such as that used in MySQL for concurrent inserts) where it is possible to have one writer at the same time as many readers. This means that the database or table supports different views for the data depending on when access begins. Other common terms for this are "time travel," "copy on write," or "copy on demand."
Copy on demand is in many cases superior to row-level locking. However, in the worst case, it can use much more memory than using normal locks.
Instead of using row-level locks, you can employ application-level locks, such as those provided by GET_LOCK() and RELEASE_LOCK() in MySQL. These are advisory locks, so they work only with applications that cooperate with each other. See Section 12.15, "Miscellaneous Functions".

8.10.2. Table Locking Issues

InnoDB tables use row-level locking so that multiple sessions and applications can read from and write to the same table simultaneously, without making each other wait or producing inconsistent results. For this storage engine, avoid using the LOCK TABLES statement, because it does not offer any extra protection, but instead reduces concurrency. The automatic row-level locking makes these tables suitable for your busiest databases with your most important data, while also simplifying application logic since you do not need to lock and unlock tables. Consequently, the InnoDB storage engine is the default in MySQL 5.5 and higher.

MySQL uses table locking (instead of page, row, or column locking) for all storage engines except InnoDB and NDBCLUSTER. The locking operations themselves do not have much overhead. But because only one session can write to a table at any one time, for best performance with these other storage engines, use them primarily for tables that are queried often and rarely inserted into or updated.

Performance Considerations Favoring InnoDB

When choosing whether to create a table using InnoDB or a different storage engine, keep in mind the following disadvantages of table locking:

Table locking enables many sessions to read from a table at the same time, but if a session wants to write to a table, it must first get exclusive access, meaning it might have to wait for other sessions to finish with the table first. During the update, all other sessions that want to access this particular table must wait until the update is done.
Table locking causes problems when a session is waiting because the disk is full and free space needs to become available before the session can proceed. In this case, all sessions that want to access the problem table are also put in a waiting state until more disk space is made available.
A SELECT statement that takes a long time to run prevents other sessions from updating the table in the meantime, making the other sessions appear slow or unresponsive. While a session is waiting to get exclusive access to the table for updates, other sessions that issue SELECT statements will queue up behind it, reducing concurrency even for read-only sessions.

Workarounds for Locking Performance Issues

The following items describe some ways to avoid or reduce contention caused by table locking:

Consider switching the table to the InnoDB storage engine, either using CREATE TABLE ... ENGINE=INNODB during setup, or using ALTER TABLE ... ENGINE=INNODB for an existing table. See Section 14.3, "The InnoDB Storage Engine" for more details about this storage engine.
Optimize SELECT statements to run faster so that they lock tables for a shorter time. You might have to create some summary tables to do this.
Start mysqld with --low-priority-updates. For storage engines that use only table-level locking (such as MyISAM, MEMORY, and MERGE), this gives all statements that update (modify) a table lower priority than SELECT statements. In this case, the second SELECT statement in the preceding scenario would execute before the UPDATE statement, and would not wait for the first SELECT to finish.
To specify that all updates issued in a specific connection should be done with low priority, set the low_priority_updates server system variable equal to 1.
To give a specific INSERT, UPDATE, or DELETE statement lower priority, use the LOW_PRIORITY attribute.
To give a specific SELECT statement higher priority, use the HIGH_PRIORITY attribute. See Section 13.2.9, "SELECT Syntax".
Start mysqld with a low value for the max_write_lock_count system variable to force MySQL to temporarily elevate the priority of all SELECT statements that are waiting for a table after a specific number of inserts to the table occur. This permits READ locks after a certain number of WRITE locks.
If you have problems with INSERT combined with SELECT, consider switching to MyISAM tables, which support concurrent SELECT and INSERT statements. (See Section 8.10.3, "Concurrent Inserts".)
If you mix inserts and deletes on the same table, INSERT DELAYED may be of great help. See Section 13.2.5.2, "INSERT DELAYED Syntax".
If you have problems with mixed SELECT and DELETE statements, the LIMIT option to DELETE may help. See Section 13.2.2, "DELETE Syntax".
Using SQL_BUFFER_RESULT with SELECT statements can help to make the duration of table locks shorter. See Section 13.2.9, "SELECT Syntax".
Splitting table contents into separate tables may help, by allowing queries to run against columns in one table, while updates are confined to columns in a different table.
You could change the locking code in mysys/thr_lock.c to use a single queue. In this case, write locks and read locks would have the same priority, which might help some applications.

8.10.3. Concurrent Inserts

The MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no holes in the data file (deleted rows in the middle), an INSERT statement can be executed to add rows to the end of the table at the same time that SELECT statements are reading rows from the table. If there are multiple INSERT statements, they are queued and performed in sequence, concurrently with the SELECT statements. The results of a concurrent INSERT may not be visible immediately.

The concurrent_insert system variable can be set to modify the concurrent-insert processing. By default, the variable is set to AUTO (or 1) and concurrent inserts are handled as just described. If concurrent_insert is set to NEVER (or 0), concurrent inserts are disabled. If the variable is set to ALWAYS (or 2), concurrent inserts at the end of the table are permitted even for tables that have deleted rows. See also the description of the concurrent_insert system variable.

Under circumstances where concurrent inserts can be used, there is seldom any need to use the DELAYED modifier for INSERT statements. See Section 13.2.5.2, "INSERT DELAYED Syntax".

If you are using the binary log, concurrent inserts are converted to normal inserts for CREATE ... SELECT or INSERT ... SELECT statements. This is done to ensure that you can re-create an exact copy of your tables by applying the log during a backup operation. See Section 5.2.4, "The Binary Log". In addition, for those statements a read lock is placed on the selected-from table such that inserts into that table are blocked. The effect is that concurrent inserts for that table must wait as well.

With LOAD DATA INFILE, if you specify CONCURRENT with a MyISAM table that satisfies the condition for concurrent inserts (that is, it contains no free blocks in the middle), other sessions can retrieve data from the table while LOAD DATA is executing. Use of the CONCURRENT option affects the performance of LOAD DATA a bit, even if no other session is using the table at the same time.

If you specify HIGH_PRIORITY, it overrides the effect of the --low-priority-updates option if the server was started with that option. It also causes concurrent inserts not to be used.

For LOCK TABLE, the difference between READ LOCAL and READ is that READ LOCAL permits nonconflicting INSERT statements (concurrent inserts) to execute while the lock is held. However, this cannot be used if you are going to manipulate the database using processes external to the server while you hold the lock.

8.10.4. Metadata Locking Within Transactions

To ensure transaction serializability, the server must not permit one session to perform a data definition language (DDL) statement on a table that is used in an uncompleted transaction in another session.

As of MySQL 5.5.3, the server achieves this by acquiring metadata locks on tables used within a transaction and deferring release of those locks until the transaction ends. A metadata lock on a table prevents changes to the table's structure. This locking approach has the implication that a table that is being used by a transaction within one session cannot be used in DDL statements by other sessions until the transaction ends. For example, if a table t1 is in use by a transaction, another session that attempts to execute DROP TABLE t1 blocks until the transaction ends.

If the server acquires metadata locks for a statement that is syntactically valid but fails during execution, it does not release the locks early. Lock release is still deferred to the end of the transaction because the failed statement is written to the binary log and the locks protect log consistency.

Metadata locks acquired during a PREPARE statement are released once the statement has been prepared, even if preparation occurs within a multiple-statement transaction.

Before MySQL 5.5.3, when a transaction acquired a metadata lock for a table used within a statement, it released the lock at the end of the statement. This approach had the disadvantage that if a DDL statement occurred for a table that was being used by another session in an active transaction, statements could be written to the binary log in the wrong order.

8.10.5. External Locking

External locking is the use of file system locking to manage contention for MyISAM database tables by multiple processes. External locking is used in situations where a single process such as the MySQL server cannot be assumed to be the only process that requires access to tables. Here are some examples:

If you run multiple servers that use the same database directory (not recommended), each server must have external locking enabled.
If you use myisamchk to perform table maintenance operations on MyISAM tables, you must either ensure that the server is not running, or that the server has external locking enabled so that it locks table files as necessary to coordinate with myisamchk for access to the tables. The same is true for use of myisampack to pack MyISAM tables.
If the server is run with external locking enabled, you can use myisamchk at any time for read operations such a checking tables. In this case, if the server tries to update a table that myisamchk is using, the server will wait for myisamchk to finish before it continues.
If you use myisamchk for write operations such as repairing or optimizing tables, or if you use myisampack to pack tables, you must always ensure that the mysqld server is not using the table. If you don't stop mysqld, at least do a mysqladmin flush-tables before you run myisamchk. Your tables may become corrupted if the server and myisamchk access the tables simultaneously.

With external locking in effect, each process that requires access to a table acquires a file system lock for the table files before proceeding to access the table. If all necessary locks cannot be acquired, the process is blocked from accessing the table until the locks can be obtained (after the process that currently holds the locks releases them).

External locking affects server performance because the server must sometimes wait for other processes before it can access tables.

External locking is unnecessary if you run a single server to access a given data directory (which is the usual case) and if no other programs such as myisamchk need to modify tables while the server is running. If you only read tables with other programs, external locking is not required, although myisamchk might report warnings if the server changes tables while myisamchk is reading them.

With external locking disabled, to use myisamchk, you must either stop the server while myisamchk executes or else lock and flush the tables before running myisamchk. (See Section 8.11.1, "System Factors and Startup Parameter Tuning".) To avoid this requirement, use the CHECK TABLE and REPAIR TABLE statements to check and repair MyISAM tables.

For mysqld, external locking is controlled by the value of the skip_external_locking system variable. When this variable is enabled, external locking is disabled, and vice versa. From MySQL 4.0 on, external locking is disabled by default.

Use of external locking can be controlled at server startup by using the --external-locking or --skip-external-locking option.

If you do use external locking option to enable updates to MyISAM tables from many MySQL processes, you must ensure that the following conditions are satisfied:

Do not use the query cache for queries that use tables that are updated by another process.
Do not start the server with the --delay-key-write=ALL option or use the DELAY_KEY_WRITE=1 table option for any shared tables. Otherwise, index corruption can occur.

The easiest way to satisfy these conditions is to always use --external-locking together with --delay-key-write=OFF and --query-cache-size=0. (This is not done by default because in many setups it is useful to have a mixture of the preceding options.)

8.11. Optimizing the MySQL Server

8.11.1. System Factors and Startup Parameter Tuning
8.11.2. Tuning Server Parameters
8.11.3. Optimizing Disk I/O
8.11.4. Optimizing Memory Use
8.11.5. Optimizing Network Use
8.11.6. The Thread Pool Plugin

This section discusses optimization techniques for the database server, primarily dealing with system configuration rather than tuning SQL statements. The information in this section is appropriate for DBAs who want to ensure performance and scalability across the servers they manage; for developers constructing installation scripts that include setting up the database; and people running MySQL themselves for development, testing, and so on who want to maximize their own productivity.

8.11.1. System Factors and Startup Parameter Tuning

We start with system-level factors, because some of these decisions must be made very early to achieve large performance gains. In other cases, a quick look at this section may suffice. However, it is always nice to have a sense of how much can be gained by changing factors that apply at this level.

The operating system to use is very important. To get the best use of multiple-CPU machines, you should use Solaris (because its threads implementation works well) or Linux (because the 2.4 and later kernels have good SMP support). Note that older Linux kernels have a 2GB filesize limit by default. If you have such a kernel and a need for files larger than 2GB, get the Large File Support (LFS) patch for the ext2 file system. Other file systems such as ReiserFS and XFS do not have this 2GB limitation.

Before using MySQL in production, test it on your intended platform.

Other tips:

If you have enough RAM, you could remove all swap devices. Some operating systems use a swap device in some contexts even if you have free memory.
Avoid external locking for MyISAM tables. Since MySQL 4.0, the default has been for external locking to be disabled on all systems. The --external-locking and --skip-external-locking options explicitly enable and disable external locking.
Note that disabling external locking does not affect MySQL's functionality as long as you run only one server. Just remember to take down the server (or lock and flush the relevant tables) before you run myisamchk. On some systems it is mandatory to disable external locking because it does not work, anyway.
The only case in which you cannot disable external locking is when you run multiple MySQL servers (not clients) on the same data, or if you run myisamchk to check (not repair) a table without telling the server to flush and lock the tables first. Note that using multiple MySQL servers to access the same data concurrently is generally not recommended, except when using MySQL Cluster.
The LOCK TABLES and UNLOCK TABLES statements use internal locking, so you can use them even if external locking is disabled.

8.11.2. Tuning Server Parameters

You can determine the default buffer sizes used by the mysqld server using this command:

shell> mysqld --verbose --help

This command produces a list of all mysqld options and configurable system variables. The output includes the default variable values and looks something like this:

abort-slave-event-count   0allow-suspicious-udfs FALSEarchive   ONauto-increment-increment  1auto-increment-offset 1autocommit TRUEautomatic-sp-privileges   TRUEback-log  50basedir   /home/jon/bin/mysql-5.5-cluster/big-tables FALSEbind-address  (No default value)binlog-cache-size 32768binlog-direct-non-transactional-updates   FALSEbinlog-format STATEMENTbinlog-row-event-max-size 1024binlog-stmt-cache-size 32768blackhole ONbulk-insert-buffer-size   8388608character-set-client-handshake TRUEcharacter-set-filesystem  binarycharacter-set-server  latin1character-sets-dir   /home/jon/bin/mysql-5.5-cluster/share/charsets/chroot (No default value)collation-server  latin1_swedish_cicompletion-type   NO_CHAINconcurrent-insert AUTOconnect-timeout   10console   FALSEdatadir   (No default value)date-format   %Y-%m-%ddatetime-format   %Y-%m-%d %H:%i:%sdefault-storage-engine InnoDBdefault-time-zone (No default value)default-week-format   0delay-key-write   ONdelayed-insert-limit  100delayed-insert-timeout 300delayed-queue-size 1000des-key-file  (No default value)disconnect-slave-event-count  0div-precision-increment   4engine-condition-pushdown TRUEevent-scheduler   OFFexpire-logs-days  0external-locking  FALSEflush FALSEflush-time 0ft-boolean-syntax + -><()~*:""&|ft-max-word-len   84ft-min-word-len   4ft-query-expansion-limit  20ft-stopword-file  (No default value)gdb   FALSEgeneral-log   FALSEgeneral-log-file /home/jon/bin/mysql-5.5-cluster/data/torsk.loggroup-concat-max-len  1024help  TRUEignore-builtin-innodb FALSEinit-connectinit-file (No default value)init-rpl-role MASTERinit-slaveinnodb ONinnodb-adaptive-flushing  TRUEinnodb-adaptive-hash-index TRUEinnodb-additional-mem-pool-size   8388608innodb-autoextend-increment   8innodb-autoinc-lock-mode  1innodb-buffer-pool-instances  1innodb-buffer-pool-size   134217728innodb-change-buffering   allinnodb-checksums  TRUEinnodb-cmp ONinnodb-cmp-reset  ONinnodb-cmpmem ONinnodb-cmpmem-reset   ONinnodb-commit-concurrency 0innodb-concurrency-tickets 500innodb-data-file-path (No default value)innodb-data-home-dir  (No default value)innodb-doublewrite TRUEinnodb-fast-shutdown  1innodb-file-format Antelopeinnodb-file-format-check  TRUEinnodb-file-format-max Antelopeinnodb-file-io-threads 4innodb-file-per-table FALSEinnodb-flush-log-at-trx-commit 1innodb-flush-method   (No default value)innodb-force-recovery 0innodb-io-capacity 200innodb-large-prefix   FALSEinnodb-lock-wait-timeout  50innodb-lock-waits ONinnodb-locks  ONinnodb-locks-unsafe-for-binlog FALSEinnodb-log-buffer-size 8388608innodb-log-file-size  5242880innodb-log-files-in-group 2innodb-log-group-home-dir (No default value)innodb-max-dirty-pages-pct 75innodb-max-purge-lag  0innodb-mirrored-log-groups 1innodb-old-blocks-pct 37innodb-old-blocks-time 0innodb-open-files 300innodb-purge-batch-size   20innodb-purge-threads  0innodb-read-ahead-threshold   56innodb-read-io-threads 4innodb-replication-delay  0innodb-rollback-on-timeout FALSEinnodb-rollback-segments  128innodb-spin-wait-delay 6innodb-stats-method   nulls_equalinnodb-stats-on-metadata  TRUEinnodb-stats-sample-pages 8innodb-status-file FALSEinnodb-strict-mode FALSEinnodb-support-xa TRUEinnodb-sync-spin-loops 30innodb-table-locks TRUEinnodb-thread-concurrency 0innodb-thread-sleep-delay 10000innodb-trx ONinnodb-use-native-aio TRUEinnodb-use-sys-malloc TRUEinnodb-write-io-threads   4interactive-timeout   28800join-buffer-size  131072keep-files-on-create  FALSEkey-buffer-size   8388608key-cache-age-threshold   300key-cache-block-size  1024key-cache-division-limit  100language   /home/jon/bin/mysql-5.5-cluster/share/large-pages   FALSElc-messages   en_USlc-messages-dir /home/jon/bin/mysql-5.5-cluster/share/lc-time-names en_USlocal-infile  TRUElock-wait-timeout 31536000log /home/jon/bin/mysql-5.5-cluster/data/torsk.loglog-bin   mister-pibblog-bin-index (No default value)log-bin-trust-function-creators   FALSElog-errorlog-isam  myisam.loglog-output FILElog-queries-not-using-indexes FALSElog-short-format  FALSElog-slave-updates FALSElog-slow-admin-statements FALSElog-slow-queries  /home/jon/bin/mysql-5.5-cluster/data/torsk-slow.loglog-slow-slave-statements FALSElog-tc tc.loglog-tc-size   24576log-warnings  1long-query-time   10low-priority-updates  FALSElower-case-table-names 0master-info-file  master.infomaster-retry-count 86400max-allowed-packet 1048576max-binlog-cache-size 18446744073709547520max-binlog-dump-events 0max-binlog-size   1073741824max-binlog-stmt-cache-size 18446744073709547520max-connect-errors 10max-connections   151max-delayed-threads   20max-error-count   64max-heap-table-size   16777216max-join-size 18446744073709551615max-length-for-sort-data  1024max-long-data-size 1048576max-prepared-stmt-count   16382max-relay-log-size 0max-seeks-for-key 18446744073709551615max-sort-length   1024max-sp-recursion-depth 0max-tmp-tables 32max-user-connections  0max-write-lock-count  18446744073709551615memlock   FALSEmin-examined-row-limit 0multi-range-count 256myisam-block-size 1024myisam-data-pointer-size  6myisam-max-sort-file-size 9223372036853727232myisam-mmap-size  18446744073709551615myisam-recover-options OFFmyisam-repair-threads 1myisam-sort-buffer-size   8388608myisam-stats-method   nulls_unequalmyisam-use-mmap   FALSEndb-autoincrement-prefetch-sz 1ndb-batch-size 32768ndb-blob-read-batch-bytes 65536ndb-blob-write-batch-bytes 65536ndb-cache-check-time  0ndb-cluster-connection-pool   1ndb-connectstring (No default value)ndb-deferred-constraints  0ndb-distribution  KEYHASHndb-extra-logging 1ndb-force-send TRUEndb-index-stat-cache-entries  32ndb-index-stat-enable TRUEndb-index-stat-option loop_checkon=1000ms,loop_idle=1000ms, loop_busy=100ms, update_batch=1, read_batch=4, idle_batch=32,check_batch=32, check_delay=1m, delete_batch=8, clean_delay=0, error_batch=4,error_delay=1m, evict_batch=8, evict_delay=1m, cache_limit=32M, cache_lowpct=90ndb-index-stat-update-freq 20ndb-join-pushdown TRUEndb-log-apply-status  FALSEndb-log-bin   TRUEndb-log-binlog-index  TRUEndb-log-empty-epochs  FALSEndb-log-orig  FALSEndb-log-transaction-id FALSEndb-log-update-as-write   TRUEndb-log-updated-only  TRUEndb-mgmd-host (No default value)ndb-nodeid 0ndb-optimization-delay 10ndb-optimized-node-selection  3ndb-report-thresh-binlog-epoch-slip   3ndb-report-thresh-binlog-mem-usage 10ndb-table-no-logging  FALSEndb-table-temporary   FALSEndb-use-copying-alter-table   FALSEndb-use-exact-count   FALSEndb-use-transactions  TRUEndb-wait-connected 0ndb-wait-setup 15ndbcluster ONndbinfo   ONndbinfo-database  ndbinfondbinfo-max-bytes 0ndbinfo-max-rows  10ndbinfo-show-hidden   FALSEndbinfo-table-prefix  ndb$net-buffer-length 16384net-read-timeout  30net-retry-count   10net-write-timeout 60new   FALSEold   FALSEold-alter-table   FALSEold-passwords FALSEold-style-user-limits FALSEopen-files-limit  1024optimizer-prune-level 1optimizer-search-depth 62optimizer-switch  index_merge=on,index_merge_union=on, index_merge_sort_union=on, index_merge_intersection=on,engine_condition_pushdown=onpartition ONperformance-schema FALSEperformance-schema-events-waits-history-long-size 10000performance-schema-events-waits-history-size  10performance-schema-max-cond-classes   80performance-schema-max-cond-instances 1000performance-schema-max-file-classes   50performance-schema-max-file-handles   32768performance-schema-max-file-instances 10000performance-schema-max-mutex-classes  200performance-schema-max-mutex-instances 1000000performance-schema-max-rwlock-classes 30performance-schema-max-rwlock-instances   1000000performance-schema-max-table-handles  100000performance-schema-max-table-instances 50000performance-schema-max-thread-classes 50performance-schema-max-thread-instances   1000pid-file /home/jon/bin/mysql-5.5-cluster/data/torsk.pidplugin-dir  /home/jon/bin/mysql-5.5-cluster/lib/pluginplugin-load   (No default value)port  3306port-open-timeout 0preload-buffer-size   32768profiling-history-size 15query-alloc-block-size 8192query-cache-limit 1048576query-cache-min-res-unit  4096query-cache-size  0query-cache-type  ONquery-cache-wlock-invalidate  FALSEquery-prealloc-size   8192range-alloc-block-size 4096read-buffer-size  131072read-only FALSEread-rnd-buffer-size  262144relay-log (No default value)relay-log-index   (No default value)relay-log-info-file   relay-log.inforelay-log-purge   TRUErelay-log-recovery FALSErelay-log-space-limit 0replicate-same-server-id  FALSEreport-host   (No default value)report-password   (No default value)report-port   3306report-user   (No default value)rpl-recovery-rank 0safe-user-create  FALSEsecure-auth   FALSEsecure-file-priv  (No default value)server-id 1server-id-bits 32show-slave-auth-info  FALSEskip-grant-tables FALSEskip-name-resolve FALSEskip-networking   FALSEskip-show-database FALSEskip-slave-start  FALSEslave-allow-batching  FALSEslave-compressed-protocol FALSEslave-exec-mode   STRICTslave-load-tmpdir /tmpslave-net-timeout 3600slave-skip-errors (No default value)slave-transaction-retries 10slave-type-conversionsslow-launch-time  2slow-query-log FALSEslow-query-log-file /home/jon/bin/mysql-5.5-cluster/data/torsk-slow.logsocket /tmp/mysql.socksort-buffer-size  2097152sporadic-binlog-dump-fail FALSEsql-modessl   FALSEssl-ca (No default value)ssl-capath (No default value)ssl-cert  (No default value)ssl-cipher (No default value)ssl-key   (No default value)super-large-pages FALSEsymbolic-links TRUEsync-binlog   0sync-frm  TRUEsync-master-info  0sync-relay-log 0sync-relay-log-info   0sysdate-is-now FALSEtable-cache   400table-definition-cache 400table-open-cache  400tc-heuristic-recover  COMMITtemp-pool TRUEthread-cache-size 0thread-concurrency 10thread-handling   one-thread-per-connectionthread-stack  262144time-format   %H:%i:%stimed-mutexes FALSEtmp-table-size 16777216tmpdir /tmptransaction-alloc-block-size  8192transaction-isolation REPEATABLE-READtransaction-prealloc-size 4096updatable-views-with-limit YESverbose   TRUEwait-timeout  28800

For a mysqld server that is currently running, you can see the current values of its system variables by connecting to it and issuing this statement:

mysql> SHOW VARIABLES;

You can also see some statistical and status indicators for a running server by issuing this statement:

mysql> SHOW STATUS;

System variable and status information also can be obtained using mysqladmin:

shell> mysqladmin variablesshell> mysqladmin extended-status

For a full description of all system and status variables, see Section 5.1.4, "Server System Variables", and Section 5.1.6, "Server Status Variables".

MySQL uses algorithms that are very scalable, so you can usually run with very little memory. However, normally you get better performance by giving MySQL more memory.

When tuning a MySQL server, the two most important variables to configure are key_buffer_size and table_open_cache. You should first feel confident that you have these set appropriately before trying to change any other variables.

The following examples indicate some typical variable values for different runtime configurations.

If you have at least 256MB of memory and many tables and want maximum performance with a moderate number of clients, use something like this:
```
shell> mysqld_safe --key_buffer_size=64M --table_open_cache=256 \   --sort_buffer_size=4M --read_buffer_size=1M &
```
If you have only 128MB of memory and only a few tables, but you still do a lot of sorting, you can use something like this:
```
shell> mysqld_safe --key_buffer_size=16M --sort_buffer_size=1M
```
If there are very many simultaneous connections, swapping problems may occur unless mysqld has been configured to use very little memory for each connection. mysqld performs better if you have enough memory for all connections.

With little memory and lots of connections, use something like this:

shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=100K \   --read_buffer_size=100K &

Or even this:

shell> mysqld_safe --key_buffer_size=512K --sort_buffer_size=16K \   --table_open_cache=32 --read_buffer_size=8K \   --net_buffer_length=1K &

If you are performing GROUP BY or ORDER BY operations on tables that are much larger than your available memory, increase the value of read_rnd_buffer_size to speed up the reading of rows following sorting operations.

You can make use of the example option files included with your MySQL distribution; see Section 5.1.2, "Server Configuration Defaults".

If you specify an option on the command line for mysqld or mysqld_safe, it remains in effect only for that invocation of the server. To use the option every time the server runs, put it in an option file.

To see the effects of a parameter change, do something like this:

shell> mysqld --key_buffer_size=32M --verbose --help

The variable values are listed near the end of the output. Make sure that the --verbose and --help options are last. Otherwise, the effect of any options listed after them on the command line are not reflected in the output.

For information on tuning the InnoDB storage engine, see Section 14.3.14.1, "InnoDB Performance Tuning Tips".

8.11.3. Optimizing Disk I/O

8.11.3.1. Using Symbolic Links

The SHOW COLUMNS and The DESCRIBE statements use BLOB as the type for some columns, thus the temporary table used for the results is an on-disk table.

The tips in Section 8.5.7, "Optimizing InnoDB Disk I/O" help you get more I/O performance out of your existing storage configuration. This section describes ways to configure your storage devices when you can devote more and faster storage hardware to the database server.

Disk seeks are a huge performance bottleneck. This problem becomes more apparent when the amount of data starts to grow so large that effective caching becomes impossible. For large databases where you access data more or less randomly, you can be sure that you need at least one disk seek to read and a couple of disk seeks to write things. To minimize this problem, use disks with low seek times.
Increase the number of available disk spindles (and thereby reduce the seek overhead) by either symlinking files to different disks or striping the disks:
- Using symbolic links
  This means that, for MyISAM tables, you symlink the index file and data files from their usual location in the data directory to another disk (that may also be striped). This makes both the seek and read times better, assuming that the disk is not used for other purposes as well. See Section 8.11.3.1, "Using Symbolic Links".
- Striping
  Striping means that you have many disks and put the first block on the first disk, the second block on the second disk, and the N-th block on the (N MOD number_of_disks) disk, and so on. This means if your normal data size is less than the stripe size (or perfectly aligned), you get much better performance. Striping is very dependent on the operating system and the stripe size, so benchmark your application with different stripe sizes. See Section 8.12.3, "Using Your Own Benchmarks".
  The speed difference for striping is very dependent on the parameters. Depending on how you set the striping parameters and number of disks, you may get differences measured in orders of magnitude. You have to choose to optimize for random or sequential access.
For reliability, you may want to use RAID 0+1 (striping plus mirroring), but in this case, you need 2 � N drives to hold N drives of data. This is probably the best option if you have the money for it. However, you may also have to invest in some volume-management software to handle it efficiently.
A good option is to vary the RAID level according to how critical a type of data is. For example, store semi-important data that can be regenerated on a RAID 0 disk, but store really important data such as host information and logs on a RAID 0+1 or RAID N disk. RAID N can be a problem if you have many writes, due to the time required to update the parity bits.
On Linux, you can get much better performance by using hdparm to configure your disk's interface. (Up to 100% under load is not uncommon.) The following hdparm options should be quite good for MySQL, and probably for many other applications:
```
hdparm -m 16 -d 1
```
Note that performance and reliability when using this command depend on your hardware, so we strongly suggest that you test your system thoroughly after using hdparm. Please consult the hdparm manual page for more information. If hdparm is not used wisely, file system corruption may result, so back up everything before experimenting!
You can also set the parameters for the file system that the database uses:
If you do not need to know when files were last accessed (which is not really useful on a database server), you can mount your file systems with the -o noatime option. That skips updates to the last access time in inodes on the file system, which avoids some disk seeks.
On many operating systems, you can set a file system to be updated asynchronously by mounting it with the -o async option. If your computer is reasonably stable, this should give you better performance without sacrificing too much reliability. (This flag is on by default on Linux.)

8.11.3.1. Using Symbolic Links

You can move databases or tables from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space or increase the speed of your system by spreading your tables to different disks.

The recommended way to do this is to symlink entire database directories to a different disk. Symlink MyISAM tables only as a last resort.

To determine the location of your data directory, use this statement:

SHOW VARIABLES LIKE 'datadir';

8.11.3.1.1. Using Symbolic Links for Databases on Unix

On Unix, the way to symlink a database is first to create a directory on some disk where you have free space and then to create a soft link to it from the MySQL data directory.

shell> mkdir /dr1/databases/testshell> ln -s /dr1/databases/test /path/to/datadir

MySQL does not support linking one directory to multiple databases. Replacing a database directory with a symbolic link works as long as you do not make a symbolic link between databases. Suppose that you have a database db1 under the MySQL data directory, and then make a symlink db2 that points to db1:

shell> cd /path/to/datadirshell> ln -s db1 db2

The result is that, or any table tbl_a in db1, there also appears to be a table tbl_a in db2. If one client updates db1.tbl_a and another client updates db2.tbl_a, problems are likely to occur.

8.11.3.1.2. Using Symbolic Links for MyISAM Tables on Unix

Symlinks are fully supported only for MyISAM tables. For files used by tables for other storage engines, you may get strange problems if you try to use symbolic links.

Do not symlink tables on systems that do not have a fully operational realpath() call. (Linux and Solaris support realpath()). To determine whether your system supports symbolic links, check the value of the have_symlink system variable using this statement:

SHOW VARIABLES LIKE 'have_symlink';

The handling of symbolic links for MyISAM tables works as follows:

In the data directory, you always have the table format (.frm) file, the data (.MYD) file, and the index (.MYI) file. The data file and index file can be moved elsewhere and replaced in the data directory by symlinks. The format file cannot.
You can symlink the data file and the index file independently to different directories.
To instruct a running MySQL server to perform the symlinking, use the DATA DIRECTORY and INDEX DIRECTORY options to CREATE TABLE. See Section 13.1.17, "CREATE TABLE Syntax". Alternatively, if mysqld is not running, symlinking can be accomplished manually using ln -s from the command line.
Note
The path used with either or both of the DATA DIRECTORY and INDEX DIRECTORY options may not include the MySQL data directory. (Bug #32167)
myisamchk does not replace a symlink with the data file or index file. It works directly on the file to which the symlink points. Any temporary files are created in the directory where the data file or index file is located. The same is true for the ALTER TABLE, OPTIMIZE TABLE, and REPAIR TABLE statements.
Note
When you drop a table that is using symlinks, both the symlink and the file to which the symlink points are dropped. This is an extremely good reason not to run mysqld as the system root or permit system users to have write access to MySQL database directories.
If you rename a table with ALTER TABLE ... RENAME or RENAME TABLE and you do not move the table to another database, the symlinks in the database directory are renamed to the new names and the data file and index file are renamed accordingly.
If you use ALTER TABLE ... RENAME or RENAME TABLE to move a table to another database, the table is moved to the other database directory. If the table name changed, the symlinks in the new database directory are renamed to the new names and the data file and index file are renamed accordingly.
If you are not using symlinks, start mysqld with the --skip-symbolic-links option to ensure that no one can use mysqld to drop or rename a file outside of the data directory.

These table symlink operations are not supported:

ALTER TABLE ignores the DATA DIRECTORY and INDEX DIRECTORY table options.
As indicated previously, only the data and index files can be symbolic links. The .frm file must never be a symbolic link. Attempting to do this (for example, to make one table name a synonym for another) produces incorrect results. Suppose that you have a database db1 under the MySQL data directory, a table tbl1 in this database, and in the db1 directory you make a symlink tbl2 that points to tbl1:
```
shell> cd /path/to/datadir/db1shell> ln -s tbl1.frm tbl2.frmshell> ln -s tbl1.MYD tbl2.MYDshell> ln -s tbl1.MYI tbl2.MYI
```
Problems result if one thread reads db1.tbl1 and another thread updates db1.tbl2:
- The query cache is "fooled" (it has no way of knowing that tbl1 has not been updated, so it returns outdated results).
- ALTER statements on tbl2 fail.

8.11.3.1.3. Using Symbolic Links for Databases on Windows

On Windows, symbolic links can be used for database directories. This enables you to put a database directory at a different location (for example, on a different disk) by setting up a symbolic link to it. Use of database symlinks on Windows is similar to their use on Unix, although the procedure for setting up the link differs.

Suppose that you want to place the database directory for a database named mydb at D:\data\mydb. To do this, create a symbolic link in the MySQL data directory that points to D:\data\mydb. However, before creating the symbolic link, make sure that the D:\data\mydb directory exists by creating it if necessary. If you already have a database directory named mydb in the data directory, move it to D:\data. Otherwise, the symbolic link will be ineffective. To avoid problems, make sure that the server is not running when you move the database directory.

The procedure for creating the database symbolic link depends on your version of Windows.

Windows Vista, Windows Server 2008, or newer have native symbolic link support, so you can create a symlink using the mklink command. This command requires administrative privileges.

Change location into the data directory:
```
C:\> cd \path\to\datadir
```
In the data directory, create a symlink named mydb that points to the location of the database directory:
```
C:\> mklink /d mydb D:\data\mydb
```

After this, all tables created in the database mydb are created in D:\data\mydb.

Alternatively, on any version of Windows supported by MySQL, you can create a symbolic link to a MySQL database by creating a .sym file in the data directory that contains the path to the destination directory. The file should be named db_name.sym, where db_name is the database name.

Support for database symbolic links on Windows using .sym files is enabled by default. If you do not need .sym file symbolic links, you can disable support for them by starting mysqld with the --skip-symbolic-links option. To determine whether your system supports .sym file symbolic links, check the value of the have_symlink system variable using this statement:

SHOW VARIABLES LIKE 'have_symlink';

To create a .sym file symlink, use this procedure:

Change location into the data directory:
```
C:\> cd \path\to\datadir
```
In the data directory, create a text file named mydb.sym that contains this path name: D:\data\mydb\
Note
The path name to the new database and tables should be absolute. If you specify a relative path, the location will be relative to the mydb.sym file.

After this, all tables created in the database mydb are created in D:\data\mydb.

The following limitations apply to the use of .sym files for database symbolic linking on Windows. These limitations do not apply for symlinks created using mklink.

The symbolic link is not used if a directory with the same name as the database exists in the MySQL data directory.
The --innodb_file_per_table option cannot be used.
If you run mysqld as a service, you cannot use a mapped drive to a remote server as the destination of the symbolic link. As a workaround, you can use the full path (\\servername\path\).

8.11.4. Optimizing Memory Use

8.11.4.1. How MySQL Uses Memory
8.11.4.2. Enabling Large Page Support

8.11.4.1. How MySQL Uses Memory

The following list indicates some of the ways that the mysqld server uses memory. Where applicable, the name of the system variable relevant to the memory use is given:

All threads share the MyISAM key buffer; its size is determined by the key_buffer_size variable. Other buffers used by the server are allocated as needed. See Section 8.11.2, "Tuning Server Parameters".
Each thread that is used to manage client connections uses some thread-specific space. The following list indicates these and which variables control their size:
- A stack (variable thread_stack)
- A connection buffer (variable net_buffer_length)
- A result buffer (variable net_buffer_length)
The connection buffer and result buffer each begin with a size equal to net_buffer_length bytes, but are dynamically enlarged up to max_allowed_packet bytes as needed. The result buffer shrinks to net_buffer_length bytes after each SQL statement. While a statement is running, a copy of the current statement string is also allocated.
All threads share the same base memory.
When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache. In that case, the memory remains allocated.
The myisam_use_mmap system variable can be set to 1 to enable memory-mapping for all MyISAM tables.
Each request that performs a sequential scan of a table allocates a read buffer (variable read_buffer_size).
When reading rows in an arbitrary sequence (for example, following a sort), a random-read buffer (variable read_rnd_buffer_size) may be allocated to avoid disk seeks.
All joins are executed in a single pass, and most joins can be done without even using a temporary table. Most temporary tables are memory-based hash tables. Temporary tables with a large row length (calculated as the sum of all column lengths) or that contain BLOB columns are stored on disk.
If an internal in-memory temporary table becomes too large, MySQL handles this automatically by changing the table from in-memory to on-disk format, to be handled by the MyISAM storage engine. You can increase the permissible temporary table size as described in Section 8.4.3.3, "How MySQL Uses Internal Temporary Tables".
Most requests that perform a sort allocate a sort buffer and zero to two temporary files depending on the result set size. See Section C.5.4.4, "Where MySQL Stores Temporary Files".
Almost all parsing and calculating is done in thread-local and reusable memory pools. No memory overhead is needed for small items, so the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings.
For each MyISAM table that is opened, the index file is opened once; the data file is opened once for each concurrently running thread. For each concurrent thread, a table structure, column structures for each column, and a buffer of size 3 * N are allocated (where N is the maximum row length, not counting BLOB columns). A BLOB column requires five to eight bytes plus the length of the BLOB data. The MyISAM storage engine maintains one extra row buffer for internal use.
For each table having BLOB columns, a buffer is enlarged dynamically to read in larger BLOB values. If you scan a table, a buffer as large as the largest BLOB value is allocated.
Handler structures for all in-use tables are saved in a cache and managed as a FIFO. The initial cache size is taken from the value of the table_open_cache system variable. If a table has been used by two running threads at the same time, the cache contains two entries for the table. See Section 8.4.3.1, "How MySQL Opens and Closes Tables".
A FLUSH TABLES statement or mysqladmin flush-tables command closes all tables that are not in use at once and marks all in-use tables to be closed when the currently executing thread finishes. This effectively frees most in-use memory. FLUSH TABLES does not return until all tables have been closed.
The server caches information in memory as a result of GRANT, CREATE USER, CREATE SERVER, and INSTALL PLUGIN statements. This memory is not released by the corresponding REVOKE, DROP USER, DROP SERVER, and UNINSTALL PLUGIN statements, so for a server that executes many instances of the statements that cause caching, there will be an increase in memory use. This cached memory can be freed with FLUSH PRIVILEGES.

ps and other system status programs may report that mysqld uses a lot of memory. This may be caused by thread stacks on different memory addresses. For example, the Solaris version of ps counts the unused memory between stacks as used memory. To verify this, check available swap with swap -s. We test mysqld with several memory-leakage detectors (both commercial and Open Source), so there should be no memory leaks.

8.11.4.2. Enabling Large Page Support

Some hardware/operating system architectures support memory pages greater than the default (usually 4KB). The actual implementation of this support depends on the underlying hardware and operating system. Applications that perform a lot of memory accesses may obtain performance improvements by using large pages due to reduced Translation Lookaside Buffer (TLB) misses.

In MySQL, large pages can be used by InnoDB, to allocate memory for its buffer pool and additional memory pool.

Standard use of large pages in MySQL attempts to use the largest size supported, up to 4MB. Under Solaris, a "super large pages" feature enables uses of pages up to 256MB. This feature is available for recent SPARC platforms. It can be enabled or disabled by using the --super-large-pages or --skip-super-large-pages option.

MySQL also supports the Linux implementation of large page support (which is called HugeTLB in Linux).

Before large pages can be used on Linux, the kernel must be enabled to support them and it is necessary to configure the HugeTLB memory pool. For reference, the HugeTBL API is documented in the Documentation/vm/hugetlbpage.txt file of your Linux sources.

The kernel for some recent systems such as Red Hat Enterprise Linux appear to have the large pages feature enabled by default. To check whether this is true for your kernel, use the following command and look for output lines containing "huge":

shell> cat /proc/meminfo | grep -i hugeHugePages_Total:   0HugePages_Free: 0HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize:   4096 kB

The nonempty command output indicates that large page support is present, but the zero values indicate that no pages are configured for use.

If your kernel needs to be reconfigured to support large pages, consult the hugetlbpage.txt file for instructions.

Assuming that your Linux kernel has large page support enabled, configure it for use by MySQL using the following commands. Normally, you put these in an rc file or equivalent startup file that is executed during the system boot sequence, so that the commands execute each time the system starts. The commands should execute early in the boot sequence, before the MySQL server starts. Be sure to change the allocation numbers and the group number as appropriate for your system.

# Set the number of pages to be used.# Each page is normally 2MB, so a value of 20 = 40MB.# This command actually allocates memory, so this much# memory must be available.echo 20 > /proc/sys/vm/nr_hugepages# Set the group number that is permitted to access this# memory (102 in this case). The mysql user must be a# member of this group.echo 102 > /proc/sys/vm/hugetlb_shm_group# Increase the amount of shmem permitted per segment# (12G in this case).echo 1560281088 > /proc/sys/kernel/shmmax# Increase total amount of shared memory.  The value# is the number of pages. At 4KB/page, 4194304 = 16GB.echo 4194304 > /proc/sys/kernel/shmall

For MySQL usage, you normally want the value of shmmax to be close to the value of shmall.

To verify the large page configuration, check /proc/meminfo again as described previously. Now you should see some nonzero values:

shell> cat /proc/meminfo | grep -i hugeHugePages_Total:  20HugePages_Free:   20HugePages_Rsvd: 0HugePages_Surp: 0Hugepagesize:   4096 kB

The final step to make use of the hugetlb_shm_group is to give the mysql user an "unlimited" value for the memlock limit. This can by done either by editing /etc/security/limits.conf or by adding the following command to your mysqld_safe script:

ulimit -l unlimited

Adding the ulimit command to mysqld_safe causes the root user to set the memlock limit to unlimited before switching to the mysql user. (This assumes that mysqld_safe is started by root.)

Large page support in MySQL is disabled by default. To enable it, start the server with the --large-pages option. For example, you can use the following lines in your server's my.cnf file:

[mysqld]large-pages

With this option, InnoDB uses large pages automatically for its buffer pool and additional memory pool. If InnoDB cannot do this, it falls back to use of traditional memory and writes a warning to the error log: Warning: Using conventional memory pool

To verify that large pages are being used, check /proc/meminfo again:

shell> cat /proc/meminfo | grep -i hugeHugePages_Total:  20HugePages_Free:   20HugePages_Rsvd: 2HugePages_Surp: 0Hugepagesize:   4096 kB

8.11.5. Optimizing Network Use

8.11.5.1. How MySQL Uses Threads for Client Connections
8.11.5.2. DNS Lookup Optimization and the Host Cache

8.11.5.1. How MySQL Uses Threads for Client Connections

Connection manager threads handle client connection requests on the network interfaces that the server listens to. On all platforms, one manager thread handles TCP/IP connection requests. On Unix, this manager thread also handles Unix socket file connection requests. On Windows, a manager thread handles shared-memory connection requests, and another handles named-pipe connection requests. The server does not create threads to handle interfaces that it does not listen to. For example, a Windows server that does not have support for named-pipe connections enabled does not create a thread to handle them.

By default, connection manager threads associate each client connection with a thread dedicated to it that handles authentication and request processing for that connection. Manager threads create a new thread when necessary but try to avoid doing so by consulting the thread cache first to see whether it contains a thread that can be used for the connection. When a connection ends, its thread is returned to the thread cache if the cache is not full.

In the default connection thread model, there are as many threads as there are clients currently connected, which has some disadvantages when server workload must scale to handle large numbers of connections. For example, thread creation and disposal becomes expensive. Also, each thread requires server and kernel resources, such as stack space. To accommodate a large number of simultaneous connections, the stack size per thread must be kept small, leading to a situation where it is either too small or the server consumes large amounts of memory. Exhaustion of other resources can occur as well, and scheduling overhead can become significant.

As of MySQL 5.5.16, commercial distributions of MySQL include a thread pool plugin provides an alternative thread-handling model designed to reduce overhead and improve performance. It implements a thread pool that increases server performance by efficiently managing statement execution threads for large numbers of client connections. See Section 8.11.6, "The Thread Pool Plugin".

To control and monitor how the server manages threads that handle client connections, several system and status variables are relevant. (See Section 5.1.4, "Server System Variables", and Section 5.1.6, "Server Status Variables".)

The thread cache has a size determined by the thread_cache_size system variable. The default value is 0 (no caching), which causes a thread to be set up for each new connection and disposed of when the connection terminates. Set thread_cache_size to N to enable N inactive connection threads to be cached. thread_cache_size can be set at server startup or changed while the server runs. A connection thread becomes inactive when the client connection with which it was associated terminates.

To monitor the number of threads in the cache and how many threads have been created because a thread could not be taken from the cache, monitor the Threads_cached and Threads_created status variables.

You can set max_connections at server startup or at runtime to control the maximum number of clients that can connect simultaneously.

When the thread stack is too small, this limits the complexity of the SQL statements which the server can handle, the recursion depth of stored procedures, and other memory-consuming actions. To set a stack size of N bytes for each thread, start the server with --thread_stack=N.

8.11.5.2. DNS Lookup Optimization and the Host Cache

The MySQL server maintains a host cache in memory that contains information about clients: IP address, host name, and error information. The server uses this cache for nonlocal TCP connections. It does not use the cache for TCP connections established using a loopback interface address (127.0.0.1 or ::1), or for connections established using a Unix socket file, named pipe, or shared memory.

For each new client connection, the server uses the client IP address to check whether the client host name is in the host cache. If not, the server attempts to resolve the host name. First, it resolves the IP address to a host name and resolves that host name back to an IP address. Then it compares the result to the original IP address to ensure that they are the same. The server stores information about the result of this operation in the host cache. If the cache is full, the least recently used entry is discarded.

The server performs host name resolution using the thread-safe gethostbyaddr_r() and gethostbyname_r() calls if the operating system supports them. Otherwise, the thread performing the lookup locks a mutex and calls gethostbyaddr() and gethostbyname() instead. In this case, no other thread can resolve host names that are not in the host cache until the thread holding the mutex lock releases it.

The server uses the host cache for several purposes:

By caching the results of IP-to-host name lookups, the server avoids doing a DNS lookup for each client connection. Instead, for a given host, it needs to perform a lookup only for the first connection from that host.
The cache contains information about errors that occur during the connection process. Some errors are considered "blocking." If too many of these occur successively from a given host without a successful connection, the server blocks further connections from that host. The max_connect_errors system variable determines the number of permitted errors before blocking occurs. See Section C.5.2.6, "Host 'host_name' is blocked".

To unblock blocked hosts, flush the host cache by issuing a FLUSH HOSTS statement or executing a mysqladmin flush-hosts command.

It is possible for a blocked host to become unblocked even without FLUSH HOSTS if activity from other hosts has occurred since the last connection attempt from the blocked host. This can occur because the server discards the least recently used cache entry to make room for a new entry if the cache is full when a connection arrives from a client IP not in the cache. If the discarded entry is for a blocked host, that host becomes unblocked.

The host cache is enabled by default. To disable it, start the server with the --skip-host-cache option.

To disable DNS host name lookups, start the server with the --skip-name-resolve option. In this case, the server uses only IP addresses and not host names to match connecting hosts to rows in the MySQL grant tables. Only accounts specified in those tables using IP addresses can be used.

If you have a very slow DNS and many hosts, you might be able to improve performance either by disabling DNS lookups with --skip-name-resolve or by increasing the HOST_CACHE_SIZE define (default value: 128) and recompiling the server

To disallow TCP/IP connections entirely, start the server with the --skip-networking option.

Some connection errors are not associated with TCP connections, occur very early in the connection process (even before an IP address is known), or are not specific to any particular IP address (such as out-of-memory conditions).

8.11.6. The Thread Pool Plugin

8.11.6.1. Thread Pool Components and Installation
8.11.6.2. Thread Pool Operation
8.11.6.3. Thread Pool Tuning

Note

MySQL Thread Pool is a commercial extension. To learn more about commercial products (MySQL Enterprise Edition), see http://www.mysql.com/products/.

As of MySQL 5.5.16, commercial distributions of MySQL include MySQL Thread Pool, implemented using a server plugin. The default thread-handling model in MySQL Server executes statements using one thread per client connection. As more clients connect to the server and execute statements, overall performance degrades. The thread pool plugin provides an alternative thread-handling model designed to reduce overhead and improve performance. The plugin implements a thread pool that increases server performance by efficiently managing statement execution threads for large numbers of client connections.

The thread pool addresses several problems of the one thread per connection model:

Too many thread stacks make CPU caches almost useless in highly parallel execution workloads. The thread pool promotes thread stack reuse to minimize the CPU cache footprint.
With too many threads executing in parallel, context switching overhead is high. This also presents a challenging task to the operating system scheduler. The thread pool controls the number of active threads to keep the parallelism within the MySQL server at a level that it can handle and that is appropriate for the server host on which MySQL is executing.
Too many transactions executing in parallel increases resource contention. In InnoDB, this increases the time spent holding central mutexes. The thread pool controls when transactions start to ensure that not too many execute in parallel.

The thread pool plugin is a commercial feature. It is not included in MySQL community distributions.

On Windows, the thread pool plugin requires Windows Vista or newer. On Linux, the plugin requires kernel 2.6.9 or newer.

Additional Resources

Section B.14, "MySQL 5.5 FAQ: MySQL Enterprise Scalability Thread Pool"

8.11.6.1. Thread Pool Components and Installation

The thread pool feature comprises these components:

A plugin library object file contains a plugin for the thread pool code and plugins for several INFORMATION_SCHEMA tables.
For a detailed description of how the thread pool works, see Section 8.11.6.2, "Thread Pool Operation".
The INFORMATION_SCHEMA tables are named TP_THREAD_STATE, TP_THREAD_GROUP_STATE, and TP_THREAD_GROUP_STATS. These tables provide information about thread pool operation. For more information, see Section 20.30, "Thread Pool INFORMATION_SCHEMA Tables".
Several system variables are related to the thread pool. The thread_handling system variable has a value of loaded-dynamically when the server successfully loads the thread pool plugin.
The other related variables are implemented by the thread pool plugin; they are not available unless it is enabled:
- thread_pool_algorithm: The concurrency algorithm to use for scheduling.
- thread_pool_high_priority_connection: How to schedule statement execution for a session.
- thread_pool_prio_kickup_timer: How long before the thread pool moves a statement awaiting execution from the low-priority queue to the high-priority queue.
- thread_pool_max_unused_threads: How many sleeping threads to permit.
- thread_pool_size: The number of thread groups in the thread pool. This is the most important parameter controlling thread pool performance.
- thread_pool_stall_limit: The time before an executing statement is considered to be stalled.
If any variable implemented by the plugin is set to an illegal value at startup, plugin initialization fails and the plugin does not load.
For information about setting thread pool parameters, see Section 8.11.6.3, "Thread Pool Tuning".
The Performance Schema exposes information about the thread pool and may be used to investigate operational performance. For more information, see Chapter 21, MySQL Performance Schema.

To be usable by the server, the thread pool library object file must be located in the MySQL plugin directory (the directory named by the plugin_dir system variable). To enable thread pool capability, load the plugins to be used by starting the server with the --plugin-load option. For example, if you name just the plugin object file, the server loads all plugins that it contains (that is, the thread pool plugin and all the INFORMATION_SCHEMA tables). To do this, put these lines in your my.cnf file:

[mysqld]plugin-load=thread_pool.so

If object files have a suffix different from .so on your system, substitute the correct suffix (for example, .dll on Windows).

If necessary, set the value of the plugin_dir system variable to tell the server the location of the plugin directory.

If desired, you can load individual plugins from the library file. To load the thread pool plugin but not the INFORMATION_SCHEMA tables, use an option like this:

[mysqld]plugin-load=thread_pool=thread_pool.so

To load the thread pool plugin and only the TP_THREAD_STATE INFORMATION_SCHEMA table, use an option like this:

[mysqld]plugin-load=thread_pool=thread_pool.so;TP_THREAD_STATE=thread_pool.so

To verify plugin installation, examine the INFORMATION_SCHEMA.PLUGINS table or use the SHOW PLUGINS statement. See Section 5.1.8.2, "Obtaining Server Plugin Information".

If the server loads the thread plugin successfully, it sets the thread_handling system variable to dynamically-loaded. If the plugin fails to load, the server writes a message to the error log.

8.11.6.2. Thread Pool Operation

The thread pool consists of a number of thread groups, each of which manages a set of client connections. As connections are established, the thread pool assigns them to thread groups in round-robin fashion.

The number of thread groups is configurable using the thread_pool_size system variable. The default number of groups is 16. For guidelines on setting this variable, see Section 8.11.6.3, "Thread Pool Tuning".

The maximum number of threads per group is 4096 (or 4095 on some systems where one thread is used internally).

The thread pool separates connections and threads, so there is no fixed relationship between connections and the threads that execute statements received from those connections. This differs from the default thread-handling model that associates one thread with one connection such that the thread executes all statements from the connection.

The thread pool tries to ensure a maximum of one thread executing in each group at any time, but sometimes permits more threads to execute temporarily for best performance. The algorithm works in the following manner:

Each thread group has a listener thread that listens for incoming statements from the connections assigned to the group. When a statement arrives, the thread group either begins executing it immediately or queues it for later execution:
- Immediate execution occurs if the statement is the only one received and no statements are queued or currently executing.
- Queuing occurs if the statement cannot begin executing immediately.
If immediate execution occurs, execution is performed by the listener thread. (This means that temporarily no thread in the group is listening.) If the statement finishes quickly, the executing thread returns to listening for statements. Otherwise, the thread pool considers the statement stalled and starts another thread as a listener thread (creating it if necessary). To ensure that no thread group becomes blocked by stalled statements, the thread pool has a background thread that regularly monitors thread group states.
By using the listening thread to execute a statement that can begin immediately, there is no need to create an additional thread if the statement finishes quickly. This ensures the most efficient execution possible in the case of a low number of concurrent threads.
When the thread pool plugin starts, it creates one thread per group (the listener thread), plus the background thread. Additional threads are created as necessary to execute statements.
The value of the thread_pool_stall_limit system variable determines the meaning of "finishes quickly" in the previous item. The default time before threads are considered stalled is 60ms but can be set to a maximum of 6s. This parameter is configurable to enable you to strike a balance appropriate for the server work load. Short wait values permit threads to start more quickly. Short values are also better for avoiding deadlock situations. Long wait values are useful for workloads that include long-running statements, to avoid starting too many new statements while the current ones execute.
The thread pool focuses on limiting the number of concurrent short-running statements. Before an executing statement reaches the stall time, it prevents other statements from beginning to execute. If the statement executes past the stall time, it is permitted to continue but no longer prevents other statements from starting. In this way, the thread pool tries to ensure that in each thread group there is never more than one short-running statement, although there might be multiple long-running statements. It is undesirable to let long-running statements prevent other statements from executing because there is no limit on the amount of waiting that might be necessary. For example, on a replication master, a thread that is sending binary log events to a slave effectively runs forever.
A statement becomes blocked if it encounters a disk I/O operation or a user level lock (row lock or table lock). The block would cause the thread group to become unused, so there are callbacks to the thread pool to ensure that the thread pool can immediately start a new thread in this group to execute another statement. When a blocked thread returns, the thread pool permits it to restart immediately.
There are two queues, a high-priority queue and a low-priority queue. The first statement in a transaction goes to the low-priority queue. Any following statements for the transaction go to the high-priority queue if the transaction is ongoing (statements for it have begun executing), or to the low-priority queue otherwise. Queue assignment can be affected by enabling the thread_pool_high_priority_connection system variable, which causes all queued statements for a session to go into the high-priority queue.
Statements for a nontransactional storage engine, or a transactional engine if autocommit is enabled, are treated as low-priority statements because in this case each statement is a transaction. Thus, given a mix of statements for InnoDB and MyISAM tables, the thread pool prioritizes those for InnoDB over those for MyISAM unless autocommit is enabled. With autocommit enabled, all statements will be low priority.
When the thread group selects a queued statement for execution, it first looks in the high-priority queue, then in the low-priority queue. If a statement is found, it is removed from its queue and begins to execute.
If a statement stays in the low-priority queue too long, the thread pool moves to the high-priority queue. The value of the thread_pool_prio_kickup_timer system variable controls the time before movement. For each thread group, a maximum of one statement per 10ms or 100 per second will be moved from the low-priority queue to the high-priority queue.
The thread pool reuses the most active threads to obtain a much better use of CPU caches. This is a small adjustment that has a great impact on performance.
While a thread executes a statement from a user connection, Performance Schema instrumentation accounts thread activity to the user connection. Otherwise, Performance Schema accounts activity to the thread pool.

Here are examples of conditions under which a thread group might have multiple threads started to execute statements:

One thread begins executing a statement, but runs long enough to be considered stalled. The thread group permits another thread to begin executing another statement even through the first thread is still executing.
One thread begins executing a statement, then becomes blocked and reports this back to the thread pool. The thread group permits another thread to begin executing another statement.
One thread begins executing a statement, becomes blocked, but does not report back that it is blocked because the block does not occur in code that has been instrumented with thread pool callbacks. In this case, the thread appears to the thread group to be still running. If the block lasts long enough for the statement to be considered stalled, the group permits another thread to begin executing another statement.

The thread pool is designed to be scalable across an increasing number of connections. It is also designed to avoid deadlocks that can arise from limiting the number of actively executing statements. It is important that threads that do not report back to the thread pool do not prevent other statements from executing and thus cause the thread pool to become deadlocked. Examples of such statements follow:

Long-running statements. These would lead to all resources used by only a few statements and they could prevent all others from accessing the server.
Binary log dump threads that read the binary log and send it to slaves. This is a kind of long-running "statement" that runs for a very long time, and that should not prevent other statements from executing.
Statements blocked on a row lock, table lock, sleep, or any other blocking activity that has not been reported back to the thread pool by MySQL Server or a storage engine.

In each case, to prevent deadlock, the statement is moved to the stalled category when it does not complete quickly, so that the thread group can permit another statement to begin executing. With this design, when a thread executes or becomes blocked for an extended time, the thread pool moves the thread to the stalled category and for the rest of the statement's execution, it does not prevent other statements from executing.

The maximum number of threads that can occur is the sum of max_connections and thread_pool_size. This can happen in a situation where all connections are in execution mode and an extra thread is created per group to listen for more statements. This is not necessarily a state that happens often, but it is theoretically possible.

8.11.6.3. Thread Pool Tuning

This section provides guidelines on setting thread pool system variables for best performance, measured using a metric such as transactions per second.

thread_pool_size is the most important parameter controlling thread pool performance. It can be set only at server startup. Our experience in testing the thread pool indicates the following:

If the primary storage engine is InnoDB, the optimal thread_pool_size setting is likely to be between 16 and 36, with the most common optimal values tending to be from 24 to 36. We have not seen any situation where the setting has been optimal beyond 36. There may be special cases where a value smaller than 16 is optimal.
For workloads such as DBT2 and Sysbench, the optimum for InnoDB seems to be usually around 36. For very write-intensive workloads, the optimal setting can sometimes be lower.
If the primary storage engine is MyISAM, the thread_pool_size setting should be fairly low. We tend to get optimal performance for values from 4 to 8. Higher values tend to have a slightly negative but not dramatic impact on performance.

Another system variable, thread_pool_stall_limit, is important for handling of blocked and long-running statements. If all calls that block the MySQL Server are reported to the thread pool, it would always know when execution threads are blocked. However, this may not always be true. For example, blocks could occur in code that has not been instrumented with thread pool callbacks. For such cases, the thread pool must be able to identify threads that appear to be blocked. This is done by means of a timeout, the length of which can be tuned using the thread_pool_stall_limit system variable. This parameter ensures that the server does not become completely blocked. The value of thread_pool_stall_limit has an upper limit of 6 seconds to prevent the risk of a deadlocked server.

thread_pool_stall_limit also enables the thread pool to handle long-running statements. If a long-running statement was permitted to block a thread group, all other connections assigned to the group would be blocked and unable to start execution until the long-running statement completed. In the worst case, this could take hours or even days.

The value of thread_pool_stall_limit should be chosen such that statements that execute longer than its value are considered stalled. Stalled statements generate a lot of extra overhead since they involve extra context switches and in some cases even extra thread creations. On the other hand, setting the thread_pool_stall_limit parameter too high means that long-running statements will block a number of short-running statements for longer than necessary. Short wait values permit threads to start more quickly. Short values are also better for avoiding deadlock situations. Long wait values are useful for workloads that include long-running statements, to avoid starting too many new statements while the current ones execute.

Suppose a server executes a workload where 99.9% of the statements complete within 100ms even when the server is loaded, and the remaining statements take between 100ms and 2 hours fairly evenly spread. In this case, it would make sense to set thread_pool_stall_limit to 10 (meaning 100ms). The default value of 60ms is okay for servers that primarily execute very simple statements.

The thread_pool_stall_limit parameter can be changed at runtime to enable you to strike a balance appropriate for the server work load. Assuming that the TP_THREAD_GROUP_STATS table is enabled, you can use the following query to determine the fraction of executed statements that stalled:

SELECT SUM(STALLED_QUERIES_EXECUTED) / SUM(QUERIES_EXECUTED)FROM information_schema.TP_THREAD_GROUP_STATS;

This number should be as low as possible. To decrease the likelihood of statements stalling, increase the value of thread_pool_stall_limit.

When a statement arrives, what is the maximum time it can be delayed before it actually starts executing? Suppose that the following conditions apply:

There are 200 statements queued in the low-priority queue.
There are 10 statements queued in the high-priority queue.
thread_pool_prio_kickup_timer is set to 10000 (10 seconds).
thread_pool_stall_limit is set to 100 (1 second).

In the worst case, the 10 high-priority statements represent 10 transactions that continue executing for a long time. Thus, in the worst case, no statements will be moved to the high-priority queue because it will always already contain statements awaiting execution. After 10 seconds, the new statement is eligible to be moved to the high-priority queue. However, before it can be moved, all the statements before it must be moved as well. This could take another 2 seconds because a maximum of 100 statements per second are moved to the high-priority queue. Now when the statement reaches the high-priority queue, there could potentially be many long-running statements ahead of it. In the worst case, every one of those will become stalled and it will take 1 second for each statement before the next statement is retrieved from the high-priority queue. Thus, in this scenario, it will take 222 seconds before the new statement starts executing.

This example shows a worst case for an application. How to handle it depends on the application. If the application has high requirements for the response time, it should most likely throttle users at a higher level itself. Otherwise, it can use the thread pool configuration parameters to set some kind of a maximum waiting time.

(Sebelumnya) 8.5. Optimizing for InnoDB Tables

8.12. Measuring Performance (B ... (Berikutnya)