Database Administration: myisam

Showing posts with label myisam. Show all posts

Saturday, 23 June 2012

MySQL - capturing diagnostic data

Sporadic problems such as occasional server stalls can be difficult to diagnose. It is important to capture good diagnostic data when the problem is clearly happening. And it gets frustrating when you don't know when next it will hit your database. So, we need to find a way to know when problem happens,Let's call it 'diagnostic trigger'. We need to find good indicator e.g. sudden sever's load spike and/or spike in status varaibles such as Threads_connected, Threads_running.
Thanks to perconna Toolkit, pt-stalk - Watches for a trigger condition to become true, and then collects data to help in diagnosing problems. I performed few quick test using pt-stalk:
Installation:
At the time of writing this article percona-toolki2.1.2 is the latest release:
wget http://www.percona.com/downloads/percona-toolkit/2.1.2/percona-toolkit-2.1.2-1.noarch.rpm

You might have to install following required by percona-toolkit:
yum install perl-Time-HiRes.x86_64
yum install perl-TermReadKey.x86_64

Quick Examples:

A) Collect diagnostic data when at least 40 threads are running,


# pt-stalk  --function status --variable Threads_running \
--threshold 40 --prefix=mysql_trouble --prefix=mysql_trouble --cycles=2-- \
-uroot -pxxxx
....
2012_06_23_10_13_11 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_12 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_14 Check results: Threads_running=1, matched=no, cycles_true=0
2012_06_23_10_13_15 Check results: Threads_running=40, matched=yes, cycles_true=1
2012_06_23_10_13_16 Check results: Threads_running=41, matched=yes, cycles_true=2
2012_06_23_10_13_16 Collect triggered
2012_06_23_10_13_16 Collector PID 16113
2012_06_23_10_13_16 Sleeping 300 seconds after collect

And the stored diagnostic data is available in its default location:

# ls -lh /var/lib/pt-stalk/
total 572K
-rw-r--r-- 1 root root 4.5K Jun 23 11:08 mysql_trouble-df
-rw-r--r-- 1 root root  153 Jun 23 11:08 mysql_trouble-disk-space
-rw-r--r-- 1 root root  16K Jun 23 11:08 mysql_trouble-diskstats
...
-rw-r--r-- 1 root root  15K Jun 23 11:08 mysql_trouble-top
-rw-r--r-- 1 root root  386 Jun 23 11:08 mysql_trouble-trigger
-rw-r--r-- 1 root root 8.1K Jun 23 11:08 mysql_trouble-variables
-rw-r--r-- 1 root root  992 Jun 23 11:08 mysql_trouble-vmstat
-rw-r--r-- 1 root root  245 Jun 23 11:08 mysql_trouble-vmstat-overall

B) Collect data when at least 20 queries running in the 'copying to tmp table' State.

pt-stalk  --function processlist --variable State \
--match Copying --threshold 20 --prefix=mysql_trouble --cycles=2 \
-- -uroot -pxxxxx
..
2012_06_23_11_54_19 Check results: State=21, matched=yes, cycles_true=1
2012_06_23_11_54_20 Check results: State=21, matched=yes, cycles_true=2
2012_06_23_11_54_20 Collect triggered
2012_06_23_11_54_21 Collector PID 9154
2012_06_23_11_54_21 Sleeping 300 seconds after collect

You will probably like to run this tool as daemon - Here’s a sample configuration file for triggering when there are more than 40 threads running at once:

# Config for pt-stalk
variable=Threads_running
cycles=2  # trigger if problem seen twice in a row
dest=/var/log/mysql_diagnostics
threshold=40
daemonize
--
--user=monitoring
--password=xxxx

Syntax to run pt-stalk as daemon:
# pt-stalk --config /etc/pt-stalk.cnf
 For more information about configuration files click here

Wednesday, 18 April 2012

Binary log Synchronization

MySQL sync_binlog option variable is used to control how often binary log may be synchronized to disk. By default, it is not synchronized to disk at each write. So if the operating system or machine (not only the MySQL server) crashes, there is a chance that the last statements of the binary log are lost. To prevent this, you can make the binary log be synchronized to disk after every N writes to the binary log, with the sync_binlog system variable. With value of 1 - every single statement is written to disk, it means that the performance suffers, usually a lot unless the disk has a battery-backed cache enabled, which makes synchronization fast.

I performed some load tests to see how it may hurt database performance when the battery-backed disk cache is missing:

Total 40,000 records inserted into table using 4 parallel db connections:

sync_binlog	time
0	0m17.972s
1	1m4.599s
2	0m44.364s
3	0m34.197s
4	0m18.693s

sync_binlog with value of 3 simply means: only fsync binlog every third transaction.
The higher you increase the value of sync_binlog, the closer the results will be to sync_binlog=0 (no fsyncing).

Value of 1 is safest of course, but as you see, without decent battery backed write caching controller, performance is very bad.

Test case :

Tools used:

-- MySQL 5.1.56
-- mysqlslap
-- platform -
Red Hat 4.1.2-50 x86_64
, RAID 5 (8 disks, 67.75 GB each, Vendor DELL)
, Intel(R) Xeon(R) CPU 5148 @ 2.33GHz, 4 CPUs
, 8G RAM

# Connect to MySQL client and perform:

drop database if exists test_case_9;
create database test_case_9;
use test_case_9;
CREATE TABLE t (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`content` blob,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=800001 DEFAULT CHARSET=utf8;

drop function if exists insert_data;
delimiter //
CREATE DEFINER=`root`@`localhost` FUNCTION `insert_data`() RETURNS int(11)
DETERMINISTIC
BEGIN
INSERT INTO t(id,content) values (null,REPEAT('bla bla',5000));
RETURN ROW_COUNT();
END
//
delimiter ;

-- Change sync_binlog to value 0
mysql> set global sync_binlog=0;

mysql> quit

# Run following steps from a shell prompt

-- load data using mysqlslap tool
$ time mysqlslap --user=root --query="select test_case_9.insert_data();" --concurrency=4 --iterations=10000;
Benchmark
        Average number of seconds to run all queries: 0.001 seconds
        Minimum number of seconds to run all queries: 0.001 seconds
        Maximum number of seconds to run all queries: 0.229 seconds
        Number of clients running queries: 4
        Average number of queries per client: 1

real    0m17.972s
user    0m1.306s
sys     0m3.096s

A) -- Change sync_binlog to value 1 and empty test table 't'

$ mysql -uroot -S /mnt/database/mysql_5.1/mysql.sock test_case_9 -e"truncate table t; set global sync_binlog=1;"

B) -- load data using mysqlslap tool

$ time mysqlslap --user=root --query="select test_case_9.insert_data();" --concurrency=4 --iterations=10000;
Benchmark
        Average number of seconds to run all queries: 0.006 seconds
        Minimum number of seconds to run all queries: 0.003 seconds
        Maximum number of seconds to run all queries: 0.018 seconds
        Number of clients running queries: 4
        Average number of queries per client: 1

real    1m4.239s
user    0m1.672s
sys     0m4.203s

Repeat steps A and B to run load tests using Value 2, 3, and 4 with synch_binlog

Tuesday, 9 November 2010

Short index length (good or bad)

If you find the work load on DB server is disk bound and you do not have enough memory to increase innodb buffer pool size. We can help improve the performance by reducing the length of indexes, this will result more indexes fit into memory and would increase write operations. It may impact SELECT queries but not necessarly for bad. Allow me to show some example. I will lower the index length of my test table from 40 char to just 8 char:
Note: You may consider trying different lengths for your index and check which best fit your workload (both writing and reading speed).

ALTER TABLE csc52021 DROP INDEX value_40, ADD INDEX value_8(value(8));

To look all the rows that have a value between 'aaa' and 'b':

mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE value&gt;'aaa%' AND
mysql> <'b';
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1  | SIMPLE      | csc52021 | range | value_8   | value_8     | 11      | NULL | 9996 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM csc52021 WHERE value&gt;'aaa%' AND value&lt;'b';
+----------+
| COUNT(*) |
+----------+
| 5533     |
+----------+
1 row in set (0.02 sec)

As you can see, the index is used.
What if look for a specific value?

mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE
mysql> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table    | type | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| 1  | SIMPLE      | csc52021 | ref  | value_8       | value_8 | 11      | const | 1    | Using where |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM csc52021 WHERE
mysql> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
+----------+
| COUNT(*) |
+----------+
| 1        |
+----------+
1 row in set (0.00 sec)

SELECT statements does a lookup for a value that have very low cardinality (for example, if we had millions of rows that starts with the same first 16 char "a816d6ce93c2aa99"), then a short index is not very efficient.
On the other hand, a short index length allows to fit more indexes value in memory, increasing the lookup speed.

In my test table, I can decrease the index length to 4 and still have good performance. For example:

mysql> ALTER TABLE csc52021 DROP INDEX value_8, ADD INDEX
mysql> value_4(value(4));

mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE
mysql> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table    | type | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| 1  | SIMPLE      | csc52021 | ref  | value_4       | value_4 | 7       | const | 2    | Using where |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)

mysql&gt; EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE value LIKE 'a816%';
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1  | SIMPLE      | csc52021 | range | value_4       | value_4 | 7       | NULL | 2    | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql&gt; SELECT COUNT(*) FROM csc52021 WHERE value LIKE 'a816%';
+----------+
| COUNT(*) |
+----------+
| 2        |
+----------+
1 row in set (0.00 sec)

To summarize:
if you have good cardinality, a shorter index length will increase both writes and reads operations, especially if you are IO bound.

Friday, 23 July 2010

Key cache hit ratio

When we talk about key cache performance we usually look at these two status variables.

1. Key_read_requests
The number of requests to read a key block from the cache.

2. Key_reads
The number of physical reads of a key block from disk.

There is a good reason to examine Key_reads, because we know that disks are very slow relative to RAM but the Key_reads aren't always physical disk reads at all. If the requested block of data isn't in the operating system's cache, then a Key_read is a disk read, you will be lucky if it is cached, then it's just a system call. Having said that it is always good to minimize key_reads which may cause randon disk I/O.

The optimum solution is to keep the ratio Key_reads : Key_read_requests should be 1:100 and Key_writes / Key_write_requests should always be less than 1.

Finally I'll like to show you something partially useful you can do with Key_reads:

[xxxxxx ~]$ mysqladmin -uroot -p ext -ri10 | grep Key_reads
Enter password:
| Key_reads | 44605148 |
| Key_reads | 4 |
| Key_reads | 4 |
| Key_reads | 13 |
| Key_reads | 9 |
| Key_reads | 6 |
| Key_reads | 20 |
| Key_reads | 6 |
| Key_reads | 11 |

This server is doing approximately 7 Key_reads every ten seconds but it is hard to say that how many of them are random I/O and how many are just system calls (i.e. read from operrating system's cache)

Friday, 2 July 2010

Repair by keycache vs Repair by sorting

It is advised to disable keys on MYISAM/ARCHIVE tables before performing bulk insert operations should give a considerable speedup, especially when you have many indexes. And then enable keys to to re-create missing indexes, this task might take much longer than bulk insert operation, sometimes it may take days to finish.

One approach would be not to disable keys at first place but would this help at all? the answer is it may or may not. Personally I do not suggest this approach. If you find enable key task is taking longer then start new session with MySQL and check the state of enable keys command. If A SHOW PROCESSLIST reveal, not surprisingly, the dreaded "Repair with keycache", which seems to multiply indexing time by a factor of 20 to 30. If you want to see "repair by sort" to be used then theoretical max size of every single index must fit into myisam_max_sort_file_size.

Solution: increase myisam_max_sort_file_size

mysql> SET GLOBAL myisam_max_sort_file_size=100000 * 1024 * 1024

And restart enable keys

Similarly if your repair takes several hours then either use above solution or

you might think of using the utility 'myisamchk' with the -n (or --sort-recover) option.

Thursday, 25 February 2010

Backup using mysqlhotcopy script

I needed to setup a script to backup myisam tables, so I thought I shall publish it on web too

########################

# Create Backup Folder #

########################

NOW=$(date +"%m_%d_%Y")

BACKUP_LOCATION="/u02/backup/dev/$NOW/"

mkdir -p `echo $BACKUP_LOCATION`

cd /u02/backup/dev/$NOW

########################################

# Array of databases we want to backup #

########################################

ARRAY="bugs my_company knowledge_base mysql wikidb"

MYSQL_HOTCOPY="/usr/bin/mysqlhotcopy -p xxxxx -q "

echo "----BACKUP STARTED @`date`"

#################################################

# Iterate through Array and Backup each of them #

################################################

for DATABASE in `echo $ARRAY`

echo "Taking backup of $DATABASE.."

$MYSQL_HOTCOPY $DATABASE $BACKUP_LOCATION

echo "Done.."

echo "Compressing backup directory using tar"

/bin/tar cfP $DATABASE.tar $BACKUP_LOCATION/$DATABASE

echo "Done.."

echo "Delete backup directory after tar command"

rm -fR $DATABASE

echo "Done.."

echo "Compressing $DATABASE.tar file using gzip"

/bin/gzip $DATABASE.tar

echo "done"

done

echo " ----- END -----@`date` "

##################################

Tuesday, 31 March 2009

Obtain reads/writes information for the non-default key caches

While there isn't a simple SQL command to pass to the mysql client and obtain this information, you can use the following command to write the information to the mysql error log:

shell> mysqladmin debug

Everytime you execute this command a new entry will be added to the log file that includes information on the individual key caches. Here is a example of how to create a indepedent cache and the output of the debugging entry:

mysql> SET GLOBAL tcache.key_buffer_size=2048*1024;
Query OK, 0 rows affected (0.00 sec)

mysql> CACHE INDEX t IN tcache;
+--------+--------------------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------+--------------------+----------+----------+
| test.t | assign_to_keycache | status | OK |
+--------+--------------------+----------+----------+
1 row in set (0.01 sec)

======= error log =============
Key caches:
default
Buffer_size: 8384512
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 3
not flushed: 0
w_requests: 5
writes: 1
r_requests: 15
reads: 3

tcache
Buffer_size: 2097152
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 0
not flushed: 0
w_requests: 0
writes: 0
r_requests: 0
reads: 0

===============================

Now if we insert some entries into the table and read then and execute "mysqladmin debug" again, we can see the updated stats:

======= error log =============
mysql> INSERT INTO t (id) VALUES (0),(1),(2),(3),(4);
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
Key caches:
default
Buffer_size: 8384512
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 3
not flushed: 0
w_requests: 5
writes: 1
r_requests: 15
reads: 3

tcache
Buffer_size: 2097152
Block_size: 1024
Division_limit: 100
Age_limit: 300
blocks used: 1
not flushed: 0
w_requests: 5
writes: 1
r_requests: 4
reads: 0
===============================

Database Administration

Saturday, 23 June 2012

MySQL - capturing diagnostic data

Quick Examples:

A) Collect diagnostic data when at least 40 threads are running,

B) Collect data when at least 20 queries running in the 'copying to tmp table' State.

You will probably like to run this tool as daemon - Here’s a sample configuration file for triggering when there are more than 40 threads running at once:

Wednesday, 18 April 2012

Binary log Synchronization

Tuesday, 9 November 2010

Short index length (good or bad)

Friday, 23 July 2010

Key cache hit ratio

Friday, 2 July 2010

Repair by keycache vs Repair by sorting

Thursday, 25 February 2010

Backup using mysqlhotcopy script

Tuesday, 31 March 2009

Obtain reads/writes information for the non-default key caches

About Me

Blog Archive