Saturday, 23 June 2012

MySQL - capturing diagnostic data

Sporadic problems such as occasional server stalls can be difficult to diagnose. It is important to capture good diagnostic data while the problem is actually happening, and it gets frustrating when you don't know when it will next hit your database. So we need a way to know when the problem happens; let's call it a 'diagnostic trigger'. We need a good indicator, e.g. a sudden spike in the server's load and/or in status variables such as Threads_connected or Threads_running.
Thanks to Percona Toolkit, there is pt-stalk, which watches for a trigger condition to become true and then collects data to help in diagnosing the problem. I performed a few quick tests using pt-stalk.
At the time of writing this article, percona-toolkit 2.1.2 is the latest release.

You might have to install the following Perl modules required by percona-toolkit:
yum install perl-Time-HiRes.x86_64
yum install perl-TermReadKey.x86_64

Quick Examples:

A) Collect diagnostic data when at least 40 threads are running:

# pt-stalk --function status --variable Threads_running \
--threshold 40 --prefix=mysql_trouble --cycles=2 -- \
-uroot -pxxxx
2012_06_23_10_13_11 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_12 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_14 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_15 Check results: Threads_running=40, matched=yes, cycles_true=1
2012_06_23_10_13_16 Check results: Threads_running=41, matched=yes, cycles_true=2
2012_06_23_10_13_16 Collect triggered
2012_06_23_10_13_16 Collector PID 16113
2012_06_23_10_13_16 Sleeping 300 seconds after collect 
And the stored diagnostic data is available in its default location:
# ls -lh /var/lib/pt-stalk/
total 572K
-rw-r--r-- 1 root root 4.5K Jun 23 11:08 mysql_trouble-df
-rw-r--r-- 1 root root  153 Jun 23 11:08 mysql_trouble-disk-space
-rw-r--r-- 1 root root  16K Jun 23 11:08 mysql_trouble-diskstats
-rw-r--r-- 1 root root  15K Jun 23 11:08 mysql_trouble-top
-rw-r--r-- 1 root root  386 Jun 23 11:08 mysql_trouble-trigger
-rw-r--r-- 1 root root 8.1K Jun 23 11:08 mysql_trouble-variables
-rw-r--r-- 1 root root  992 Jun 23 11:08 mysql_trouble-vmstat
-rw-r--r-- 1 root root  245 Jun 23 11:08 mysql_trouble-vmstat-overall

B) Collect data when at least 20 queries are running in the 'Copying to tmp table' state:

pt-stalk  --function processlist --variable State \
--match Copying --threshold 20 --prefix=mysql_trouble --cycles=2 \
-- -uroot -pxxxxx
2012_06_23_11_54_19 Check results: State=21, matched=yes, cycles_true=1
2012_06_23_11_54_20 Check results: State=21, matched=yes, cycles_true=2
2012_06_23_11_54_20 Collect triggered
2012_06_23_11_54_21 Collector PID 9154
2012_06_23_11_54_21 Sleeping 300 seconds after collect 

You will probably want to run this tool as a daemon. Here's a sample configuration file (e.g. /etc/pt-stalk.cnf) for triggering when there are at least 40 threads running at once:

# Config for pt-stalk
function=status
variable=Threads_running
threshold=40
cycles=2  # trigger only if the problem is seen twice in a row

Syntax to run pt-stalk as a daemon:
# pt-stalk --config /etc/pt-stalk.cnf --daemonize
For more information about configuration files click here

Saturday, 16 June 2012

MySQL - How to scale inserts

Usually we expect inserts with a large number of threads to be faster, but this is not always true. The bottleneck might be the CPU, the I/O controller, etc., and MySQL cannot always fully use the available cores/CPUs; e.g. MySQL 5.0 (InnoDB) is effectively limited to 4 cores. One of my clients had a problem scaling inserts: they have two data-processing clusters, each of which uses 40 threads, so in total 80 threads insert data into the MySQL database (version 5.0.51). The issue: inserts were delayed by minutes and the backlog continued to grow. After examining the InnoDB monitor status, we found a long list of transactions waiting for the AUTO-INC lock. For example:

------- TRX HAS BEEN WAITING 9 SEC FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `alertxxxx/alertdata` trx id 0 685590474 lock mode AUTO-INC waiting
------------------
---TRANSACTION 0 685590465, ACTIVE 10 sec, process no 8457, OS thread id 1169045824 setting auto-inc lock

Why the AUTO-INC lock? When accessing the auto-increment counter, InnoDB takes a special table-level AUTO-INC lock that it keeps until the end of the current SQL statement, not until the end of the transaction. This effectively causes all inserts into the same table to serialize. With single-row inserts it is normally not too bad, but it can prevent scalability with multiple threads inserting (Bug #16979). However, you can reach better throughput (inserts per second) with fewer threads. So we initially dropped the number of threads on both clusters by 50%, taking it to 20-20 sessions, and the problem almost disappeared; when we further reduced the number of threads to 10-10 sessions, the problem disappeared completely!
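To see why a single statement-scoped lock caps throughput no matter how many threads you add, here is a small Python sketch (an illustration only, not MySQL code): a plain threading.Lock stands in for InnoDB's AUTO-INC lock, and a gauge confirms that 80 concurrent "inserts" never overlap inside it.

```python
import threading

# Illustration: a Python lock standing in for InnoDB's table-level
# AUTO-INC lock; each "insert" holds it for the whole statement.
auto_inc_lock = threading.Lock()
next_id = 0              # the auto-increment counter
inside = 0               # inserts currently in the locked section
max_inside = 0           # peak concurrency observed inside the lock
gauge = threading.Lock() # protects the two gauges above

def insert_row():
    global next_id, inside, max_inside
    with auto_inc_lock:          # every insert into the table queues here
        with gauge:
            inside += 1
            max_inside = max(max_inside, inside)
        next_id += 1             # assign the next auto-increment value
        with gauge:
            inside -= 1

threads = [threading.Thread(target=insert_row) for _ in range(80)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(next_id, max_inside)  # 80 rows inserted, but never more than 1 at a time
```

All 80 threads finish, but only one is ever inside the locked section: adding threads only lengthens the queue, which is exactly what the waiting transactions in the monitor output show.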

Beginning with MySQL 5.1.22, a new locking model (innodb_autoinc_lock_mode) was introduced for handling auto-increment in InnoDB. There are good articles about it here and here

Similarly, if you want to achieve fast insert performance, it can be worthwhile to load data from files instead of running the inserts one by one: it is 3 to 4 times faster. If the goal is to load a huge amount of data that is already known at that time, it is probably the best option.
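As a rough illustration of why batched loading wins, here is a sketch using Python's stdlib sqlite3 module as a stand-in for MySQL (the table and column names are made up); the point is handing the driver one batch inside a single transaction instead of thousands of tiny statements, analogous in spirit to loading from a file:

```python
import sqlite3

# Stand-in database: sqlite3 instead of MySQL, for a self-contained demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alertdata (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [("event-%d" % i,) for i in range(10000)]

# Slow pattern: one statement (and one key assignment) per row:
# for r in rows:
#     conn.execute("INSERT INTO alertdata (payload) VALUES (?)", r)

# Fast pattern: the whole batch in a single call and a single transaction.
with conn:
    conn.executemany("INSERT INTO alertdata (payload) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM alertdata").fetchone()[0]
print(count)  # 10000
```

The same idea applies in MySQL via multi-row INSERT statements or LOAD DATA INFILE: fewer statements means fewer lock acquisitions and round trips per row.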


1. Optimize database structure - Design your tables to minimize their space on disk. This can yield huge improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution, and any space reduction for table data also results in smaller indexes that can be processed faster. Ensure columns have the right data types and require the least amount of storage; you should use the most precise type in all cases. For example, if an integer column is used for values in the range from 1 to 99999, MEDIUMINT UNSIGNED is the best type. For more information about optimizing database structure click here. If you store large strings (TEXT) or BLOBs, compression may help; there is an article about it here
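The "most precise type" advice above can be mechanized. This small Python sketch (storage sizes taken from the MySQL reference manual) picks the narrowest unsigned integer type for a known value range:

```python
# Storage size (bytes) and upper bound of MySQL's unsigned integer types,
# per the MySQL reference manual.
UNSIGNED_INT_TYPES = [
    ("TINYINT UNSIGNED", 1, 2**8 - 1),
    ("SMALLINT UNSIGNED", 2, 2**16 - 1),
    ("MEDIUMINT UNSIGNED", 3, 2**24 - 1),
    ("INT UNSIGNED", 4, 2**32 - 1),
    ("BIGINT UNSIGNED", 8, 2**64 - 1),
]

def smallest_type(max_value):
    """Return (type name, bytes) of the narrowest type that can hold max_value."""
    for name, nbytes, upper in UNSIGNED_INT_TYPES:
        if 0 <= max_value <= upper:
            return name, nbytes
    raise ValueError("out of range for MySQL unsigned integer types")

print(smallest_type(99999))  # ('MEDIUMINT UNSIGNED', 3), the article's example
```

For the 1-to-99999 range above, SMALLINT UNSIGNED tops out at 65535, so MEDIUMINT UNSIGNED (3 bytes instead of INT's 4) is the smallest type that fits.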

2. InnoDB flush method - O_DIRECT, if used, can help avoid double buffering between the InnoDB buffer pool and the operating system's filesystem cache. The MySQL reference manual explains this here.
[Warning] O_DIRECT serializes writes on ext3; however, the impact can be lowered by using innodb_file_per_table.

3. InnoDB thread concurrency - Keep the value of this variable low (the default of 8 is usually OK); however, the correct value depends on your environment and workload. This variable is explained here

4. InnoDB buffer pool - InnoDB maintains a buffer pool for caching data and indexes in memory. Making the pool larger can improve performance by reducing the amount of disk I/O needed. Good articles explaining this are here and here

5. InnoDB log file size - The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O; but larger log files also mean that recovery is slower in case of a crash. As of MySQL 5.5, recovery times have been greatly improved and the log-flushing algorithm reworked, so in 5.5 you generally want larger log files. An article about selecting the InnoDB log file size is here

6. innodb_flush_log_at_trx_commit - The default value of 1 is required for ACID compliance. You can achieve better performance by setting the value to something other than 1, but you can then lose up to one second's worth of transactions in a crash. With a value of 0, any mysqld process crash can erase the last second of transactions; with a value of 2, only an operating system crash or a power outage can. InnoDB's crash recovery itself is not affected and works regardless of the value.

7. Filesystem selection and disk issues - xfs is known to perform very well with MySQL. Placing the redo logs, binary logs, and data files on separate physical disks is also good practice, often with a bigger gain than server configuration changes. RAID 10 is recommended for best performance. More detail about disk issues can be found here and here
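Pulling items 2 through 6 together, a my.cnf fragment might look like the following; the sizes are purely illustrative assumptions and must be tuned for your own hardware and workload:

```ini
# /etc/my.cnf -- illustrative values only, not a recommendation
[mysqld]
innodb_file_per_table          = 1
innodb_flush_method            = O_DIRECT   # avoid double buffering (item 2)
innodb_thread_concurrency      = 8          # keep low; tune per workload (item 3)
innodb_buffer_pool_size        = 4G         # size to your RAM on a dedicated box (item 4)
innodb_log_file_size           = 512M       # larger files = fewer checkpoint flushes (item 5)
innodb_flush_log_at_trx_commit = 2          # trades <=1s of transactions for speed (item 6)
# innodb_autoinc_lock_mode     = 2          # 5.1.22+ only; see the AUTO-INC discussion above
```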