Friday 22 May 2015

MySQL cannot use indexes - common mistake

I recently helped a customer resolve a performance issue. The problem was caused by a very simple UPDATE query that modifies only one record:

EXPLAIN UPDATE PROD_CHECK_MODULES SET DATE_CHECK = NOW() WHERE COUGAR_NAME='cougar-p01' AND ID_MODULE=4 AND NUM_MODULE=03;
+------+-------------+--------------------+-------+---------------+-----------+---------+------+------+-------------+
| id   | select_type | table              | type  | possible_keys | key       | key_len | ref  | rows | Extra       |
+------+-------------+--------------------+-------+---------------+-----------+---------+------+------+-------------+
|    1 | SIMPLE      | PROD_CHECK_MODULES | range | ID_MODULE     | ID_MODULE | 4       | NULL |    6 | Using where |
+------+-------------+--------------------+-------+---------------+-----------+---------+------+------+-------------+
1 row in set (0.00 sec)


 SHOW CREATE TABLE PROD_CHECK_MODULES\G
*************************** 1. row ***************************
....
`COUGAR_NAME` varchar(64) NOT NULL,
  `ID_MODULE` int(11) NOT NULL,
  `NUM_MODULE` char(3) NOT NULL
...
UNIQUE KEY `COUGAR_NAME` (`COUGAR_NAME`,`ID_MODULE`,`NUM_MODULE`),
KEY `ID_MODULE` (`ID_MODULE`)
.....
Ideally, the query should use the unique key to examine and update a single row. The table in question has fewer than 200 rows, so "6 rows vs 1 row" does not look too bad, but because of a long-running transaction a large number of similar update queries ended up blocked. The first step was to identify and kill the long-running transaction that was blocking the other update queries, and then to tune the update query itself.
The issue here is that the value 03 cannot be compared with the NUM_MODULE column without a data type conversion: we are comparing a numeric value with a string column, which is why MySQL could not use the most suitable index available. There were two ways to fix this:

a) Modify the query to quote the value, e.g.
NUM_MODULE="03"
b) Modify the table structure to use an integer type, e.g.
NUM_MODULE SMALLINT NOT NULL
The easy fix was to modify the table structure, because all NUM_MODULE values are numeric and the table is quite small.
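
To see why the unquoted 03 defeats the unique key, it helps to picture how such a comparison has to be evaluated. The Python sketch below is only an illustration, not MySQL internals: the sample values are made up, the sorted list stands in for an index on a char(3) column, and to_number is a rough approximation of MySQL's implicit string-to-number conversion.

```python
import bisect
import re

# Made-up NUM_MODULE values for illustration. An index on a char(3) column
# keeps its entries in string order, mimicked here by a sorted list.
index = sorted(["003", "03", "03a", "1", "10", "3", "3.0"])

def to_number(text):
    """Rough stand-in for MySQL's implicit cast: use the leading numeric
    prefix of the string, or 0 when there is none."""
    match = re.match(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)", text)
    return float(match.group(0)) if match else 0.0

def string_lookup(value):
    """NUM_MODULE = '03': one binary search lands on the only candidate."""
    pos = bisect.bisect_left(index, value)
    return pos < len(index) and index[pos] == value

def numeric_lookup(number):
    """NUM_MODULE = 03: every entry has to be converted and compared."""
    return [entry for entry in index if to_number(entry) == number]

print(string_lookup("03"))
print(numeric_lookup(3))
```

In this sketch, five different strings ('003', '03', '03a', '3' and '3.0') all equal the number 3, and they are not adjacent in string order because '1' and '10' sit between them. No single seek on the string index can find them all, whereas the quoted value needs just one lookup.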

Monday 11 May 2015

Tracking long running processes in MySQL

Tracking long running processes in MySQL is not difficult. A process can be:
  • SQL query
  • Transaction 
There are different ways of tracking both types of processes. First, we will look at some of the available methods to identify long-running queries:
  • slow query log
  • pt-kill
  • performance schema
  • packet inspection

Slow query log - This is the most common method of identifying slow queries. First, check whether it is already enabled:

mysql> SHOW GLOBAL VARIABLES WHERE VARIABLE_NAME LIKE 'slow_query%' OR VARIABLE_NAME LIKE 'long_query_time';
+---------------------+----------------------------------+
| Variable_name       | Value                            |
+---------------------+----------------------------------+
| long_query_time     | 10.000000                        |
| slow_query_log      | OFF                              |
| slow_query_log_file | /data/mysql/log/mariadb-slow.log |
+---------------------+----------------------------------+
3 rows in set (0.01 sec)

In this example, it is not. To enable the slow query log, issue the following commands:
mysql> SET GLOBAL slow_query_log=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> set global long_query_time=1.0;
Query OK, 0 rows affected (0.00 sec)
I took the opportunity to reduce long_query_time from 10 seconds to just 1 second as well, so that every query that runs longer than 1 second is captured. Note that a new global long_query_time only applies to connections opened after the change. To make these settings persistent, you have to modify the MySQL configuration file (e.g. /etc/my.cnf or my.ini):
[mysqld]
..
slow_query_log  = ON
long_query_time = 1
Next, you will want to summarize the slow query log. There are two well-known tools to choose from:
mysqldumpslow - the classic tool that has been around since the early days of MySQL; the online documentation explains it very well.
pt-query-digest - this tool has been widely adopted by the community; see its documentation to learn more about it.
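
To give a feel for what these tools do, here is a minimal Python sketch that totals the time spent per statement. It is not how mysqldumpslow or pt-query-digest are implemented: the log excerpt and table names are made up, statements are assumed to fit on one line, and literals are not normalized the way the real tools do.

```python
import re
from collections import defaultdict

# Made-up excerpt in the usual slow query log layout: comment lines starting
# with "#", a SET timestamp line, then the statement itself.
SAMPLE_LOG = """\
# Time: 150511 10:00:01
# User@Host: app[app] @ localhost []
# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 0  Rows_examined: 6
SET timestamp=1431338401;
UPDATE t1 SET c1 = NOW() WHERE id = 4;
# Time: 150511 10:00:09
# User@Host: app[app] @ localhost []
# Query_time: 1.500000  Lock_time: 0.000100 Rows_sent: 0  Rows_examined: 6
SET timestamp=1431338409;
UPDATE t1 SET c1 = NOW() WHERE id = 4;
# User@Host: app[app] @ localhost []
# Query_time: 3.000000  Lock_time: 0.000000 Rows_sent: 10  Rows_examined: 10
SET timestamp=1431338410;
SELECT * FROM t2;
"""

QUERY_TIME = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")

def summarize(log_text):
    """Return (statement, count, total_seconds) sorted by total time, descending."""
    totals = defaultdict(lambda: [0, 0.0])
    current_time = None
    for line in log_text.splitlines():
        found = QUERY_TIME.match(line)
        if found:
            current_time = float(found.group(1))
        elif line.startswith("#") or line.startswith("SET timestamp="):
            continue  # other header lines carry no statement text
        elif line.strip() and current_time is not None:
            entry = totals[line.strip()]
            entry[0] += 1
            entry[1] += current_time
            current_time = None  # only the first statement line is counted
    rows = [(stmt, count, total) for stmt, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

for stmt, count, total in summarize(SAMPLE_LOG):
    print(f"{total:8.3f}s  {count:3d}x  {stmt}")
```

Sorting by total time rather than by the slowest single execution surfaces statements that are individually modest but run often, which is usually where tuning pays off first.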

If you are still using an older release of MySQL (e.g. 5.0 or earlier), it is not possible to enable slow query logging without restarting the mysqld process. If you cannot restart MySQL, you can still track slow queries using the pt-kill utility:
$ pt-kill --host=localhost -udba --ask-pass --busy-time 1 --print --daemonize --log /var/log/slow-query.log # --busy-time matches queries that have been running for longer than this many seconds
Remember NOT to specify the --kill option, as it would start killing your queries.
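
The --busy-time check can be pictured as a filter over the rows of SHOW PROCESSLIST. The Python sketch below is not pt-kill's code. It only assumes that --busy-time matches connections whose Command is Query and whose Time exceeds the threshold, and the sample rows are invented.

```python
# Invented rows shaped like SHOW PROCESSLIST output (Id, Command, Time, Info).
PROCESSLIST = [
    {"Id": 11, "Command": "Sleep", "Time": 120, "Info": None},
    {"Id": 12, "Command": "Query", "Time": 0, "Info": "SHOW PROCESSLIST"},
    {"Id": 13, "Command": "Query", "Time": 7, "Info": "UPDATE t1 SET c1 = NOW()"},
]

def busy_queries(rows, busy_time):
    """Return rows that have been executing a statement for longer than busy_time seconds."""
    return [
        row for row in rows
        if row["Command"] == "Query" and row["Time"] > busy_time
    ]

for row in busy_queries(PROCESSLIST, busy_time=1):
    print(row["Id"], row["Time"], row["Info"])
```

The idle connection with Id 11 is skipped even though its Time is large, because a sleeping connection is not running a query. Only connection 13 is reported.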

Performance schema - It was introduced in version 5.5 and is enabled by default in 5.6 and later releases. It provides a way to inspect internal server events (such as IO activity, locks, query profiling etc.). It can also be used to track timing on long-running processes. Below are a few examples of queries that can be used to track long-running queries, queries that use temporary tables, and queries that perform full table scans:
###
# A high-level overview of normalized statements, sorted by those with the highest total latency
# (timer columns are in picoseconds, hence the division to get seconds)
###
 
SELECT DIGEST, DIGEST_TEXT AS query,
       FIRST_SEEN AS first_seen,
       LAST_SEEN AS last_seen,
       COUNT_STAR AS exec_count,
       ROUND(SUM_TIMER_WAIT / 1000000000000, 3) AS total_latency_sec,
       ROUND(AVG_TIMER_WAIT / 1000000000000, 3) AS avg_latency_sec,
       ROUND(MAX_TIMER_WAIT / 1000000000000, 3) AS max_latency_sec,
       SUM_ROWS_EXAMINED AS rows_examined
  FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC LIMIT 5\G
 
###
# List all normalized statements that use temporary tables ordered by number of on disk temporary tables descending first, then by the number of memory tables.
###
 
SELECT IF(LENGTH(DIGEST_TEXT) > 64, CONCAT(LEFT(DIGEST_TEXT, 30), ' ... ', RIGHT(DIGEST_TEXT, 30)), DIGEST_TEXT) AS query,
       COUNT_STAR AS exec_count,
       SUM_CREATED_TMP_TABLES AS memory_tmp_tables,
       SUM_CREATED_TMP_DISK_TABLES AS disk_tmp_tables,
       ROUND(SUM_CREATED_TMP_TABLES / COUNT_STAR) AS avg_tmp_tables_per_query,
       ROUND((SUM_CREATED_TMP_DISK_TABLES / SUM_CREATED_TMP_TABLES) * 100) AS tmp_tables_to_disk_pct,
       DIGEST AS digest
  FROM performance_schema.events_statements_summary_by_digest
 WHERE SUM_CREATED_TMP_TABLES > 0
ORDER BY SUM_CREATED_TMP_DISK_TABLES DESC, SUM_CREATED_TMP_TABLES DESC LIMIT 5;
 
###
# List all normalized statements that have done a full table scan, ordered by the percentage of times a full scan was done, then by the number of times the statement executed
###
 
SELECT IF(LENGTH(DIGEST_TEXT) > 64, CONCAT(LEFT(DIGEST_TEXT, 30), ' ... ', RIGHT(DIGEST_TEXT, 30)), DIGEST_TEXT) AS query,
       COUNT_STAR AS exec_count,
       SUM_NO_INDEX_USED AS no_index_used_count,
       SUM_NO_GOOD_INDEX_USED AS no_good_index_used_count,
       ROUND((SUM_NO_INDEX_USED / COUNT_STAR) * 100) no_index_used_pct,
       DIGEST AS digest
  FROM performance_schema.events_statements_summary_by_digest
 WHERE SUM_NO_INDEX_USED > 0
    OR SUM_NO_GOOD_INDEX_USED > 0
ORDER BY no_index_used_pct DESC, exec_count DESC LIMIT 5;
Packet inspection - This can be achieved using the tcpdump utility, but it is not covered in this article because of its known limitations:
  • you can only observe queries sent over the network.
  • you don’t see queries sent through Unix sockets or via replication. 
  • you also can’t see queries run from stored procedures.
  • it is a security concern, because this method requires root access.
If you are keen to explore this method of tracking slow queries, click here and here.

In the next article, I will write about how to identify long-running transactions.