Database Administration: MySQL - capturing diagnostic data

Sporadic problems such as occasional server stalls can be difficult to diagnose. It is important to capture good diagnostic data when the problem is clearly happening. And it gets frustrating when you don't know when next it will hit your database. So, we need to find a way to know when problem happens,Let's call it 'diagnostic trigger'. We need to find good indicator e.g. sudden sever's load spike and/or spike in status varaibles such as Threads_connected, Threads_running.
Thanks to perconna Toolkit, pt-stalk - Watches for a trigger condition to become true, and then collects data to help in diagnosing problems. I performed few quick test using pt-stalk:
Installation:
At the time of writing this article percona-toolki2.1.2 is the latest release:
wget http://www.percona.com/downloads/percona-toolkit/2.1.2/percona-toolkit-2.1.2-1.noarch.rpm

You might have to install following required by percona-toolkit:
yum install perl-Time-HiRes.x86_64
yum install perl-TermReadKey.x86_64

Quick Examples:

A) Collect diagnostic data when at least 40 threads are running,


# pt-stalk  --function status --variable Threads_running \
--threshold 40 --prefix=mysql_trouble --prefix=mysql_trouble --cycles=2-- \
-uroot -pxxxx
....
2012_06_23_10_13_11 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_12 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_14 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_15 Check results: Threads_running=40, matched=yes, cycles_true=1
2012_06_23_10_13_16 Check results: Threads_running=41, matched=yes, cycles_true=2
2012_06_23_10_13_16 Collect triggered
2012_06_23_10_13_16 Collector PID 16113
2012_06_23_10_13_16 Sleeping 300 seconds after collect

And the stored diagnostic data is available in its default location:

# ls -lh /var/lib/pt-stalk/
total 572K
-rw-r--r-- 1 root root 4.5K Jun 23 11:08 mysql_trouble-df
-rw-r--r-- 1 root root  153 Jun 23 11:08 mysql_trouble-disk-space
-rw-r--r-- 1 root root  16K Jun 23 11:08 mysql_trouble-diskstats
...
-rw-r--r-- 1 root root  15K Jun 23 11:08 mysql_trouble-top
-rw-r--r-- 1 root root  386 Jun 23 11:08 mysql_trouble-trigger
-rw-r--r-- 1 root root 8.1K Jun 23 11:08 mysql_trouble-variables
-rw-r--r-- 1 root root  992 Jun 23 11:08 mysql_trouble-vmstat
-rw-r--r-- 1 root root  245 Jun 23 11:08 mysql_trouble-vmstat-overall

B) Collect data when at least 20 queries running in the 'copying to tmp table' State.

pt-stalk  --function processlist --variable State \
--match Copying --threshold 20 --prefix=mysql_trouble --cycles=2 \
-- -uroot -pxxxxx
..
2012_06_23_11_54_19 Check results: State=21, matched=yes, cycles_true=1
2012_06_23_11_54_20 Check results: State=21, matched=yes, cycles_true=2
2012_06_23_11_54_20 Collect triggered
2012_06_23_11_54_21 Collector PID 9154
2012_06_23_11_54_21 Sleeping 300 seconds after collect

You will probably like to run this tool as daemon - Here’s a sample configuration file for triggering when there are more than 40 threads running at once:

# Config for pt-stalk
variable=Threads_running
cycles=2  # trigger if problem seen twice in a row
dest=/var/log/mysql_diagnostics
threshold=40
daemonize
--
--user=monitoring
--password=xxxx

Syntax to run pt-stalk as daemon:
# pt-stalk --config /etc/pt-stalk.cnf
 For more information about configuration files click here

Database Administration

Saturday, 23 June 2012

MySQL - capturing diagnostic data

Quick Examples:

A) Collect diagnostic data when at least 40 threads are running,

B) Collect data when at least 20 queries running in the 'copying to tmp table' State.

You will probably like to run this tool as daemon - Here’s a sample configuration file for triggering when there are more than 40 threads running at once:

No comments:

Post a Comment

About Me

Blog Archive