Wednesday, 18 April 2012

Binary log Synchronization

The MySQL sync_binlog system variable controls how often the binary log is synchronized to disk. By default, it is not synchronized at each write, so if the operating system or machine (not only the MySQL server process) crashes, there is a chance that the last statements written to the binary log are lost. To prevent this, you can have the binary log synchronized to disk after every N writes by setting sync_binlog. With a value of 1, every single write is flushed to disk; this is the safest setting, but performance usually suffers a lot unless the disk has a battery-backed write cache, which makes synchronization fast.

I performed some load tests to see how much this can hurt database performance when a battery-backed disk cache is missing:

In total, 40,000 records were inserted into a table using 4 parallel database connections:

sync_binlog time
0 0m17.972s
1 1m4.599s
2 0m44.364s
3 0m34.197s
4 0m18.693s 


A sync_binlog value of 3 simply means: fsync the binary log only every third transaction.
The higher you set sync_binlog, the closer the results get to sync_binlog=0 (no fsyncing).

A value of 1 is the safest, of course, but as you can see, without a decent battery-backed write-caching controller, performance is very poor.
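The timings above can be converted into insert throughput. A quick sketch (the 40,000-row total and wall-clock times come straight from the table above):

```python
# Insert throughput per sync_binlog setting, derived from the benchmark
# above: 40,000 rows inserted, wall-clock seconds per run.
timings = {0: 17.972, 1: 64.599, 2: 44.364, 3: 34.197, 4: 18.693}
rows = 40_000

for setting, seconds in sorted(timings.items()):
    print(f"sync_binlog={setting}: {rows / seconds:,.0f} inserts/s")
```

On this hardware, sync_binlog=1 drops throughput to roughly a quarter of the unsynchronized rate.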

Test case:

Tools used:

-- MySQL 5.1.56
-- mysqlslap
-- Platform: Red Hat 4.1.2-50 x86_64, RAID 5 (8 disks, 67.75 GB each, Dell), Intel(R) Xeon(R) CPU 5148 @ 2.33GHz (4 CPUs), 8 GB RAM



# Connect to MySQL client and perform:

drop database if exists test_case_9;
create database test_case_9;
use test_case_9;
CREATE TABLE t (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`content` blob,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=800001 DEFAULT CHARSET=utf8;


drop function if exists insert_data;
delimiter //
CREATE DEFINER=`root`@`localhost` FUNCTION `insert_data`() RETURNS int(11)
DETERMINISTIC
BEGIN
INSERT INTO t(id,content) values (null,REPEAT('bla bla',5000));
RETURN ROW_COUNT();
END
//
delimiter ;

-- Change sync_binlog to value 0
mysql> set global sync_binlog=0;

mysql> quit

# Run following steps from a shell prompt

-- load data using mysqlslap tool
$ time mysqlslap --user=root  --query="select test_case_9.insert_data();"  --concurrency=4 --iterations=10000;
Benchmark
        Average number of seconds to run all queries: 0.001 seconds
        Minimum number of seconds to run all queries: 0.001 seconds
        Maximum number of seconds to run all queries: 0.229 seconds
        Number of clients running queries: 4
        Average number of queries per client: 1


real    0m17.972s
user    0m1.306s
sys     0m3.096s

A) -- Change sync_binlog to value 1 and empty test table 't'


$ mysql -uroot -S /mnt/database/mysql_5.1/mysql.sock test_case_9 -e"truncate table t; set global sync_binlog=1;"

B) -- load data using mysqlslap tool

$ time mysqlslap --user=root  --query="select test_case_9.insert_data();"  --concurrency=4 --iterations=10000;
Benchmark
        Average number of seconds to run all queries: 0.006 seconds
        Minimum number of seconds to run all queries: 0.003 seconds
        Maximum number of seconds to run all queries: 0.018 seconds
        Number of clients running queries: 4
        Average number of queries per client: 1


real    1m4.239s
user    0m1.672s
sys     0m4.203s

Repeat steps A and B to run load tests with sync_binlog values of 2, 3, and 4.

Tuesday, 6 March 2012

MySQL - Event Scheduler

MySQL Events are tasks that run according to a schedule. Therefore, we sometimes refer to them as scheduled events. Basically an event is a named database object containing one or more SQL statements to be executed at one or more regular intervals, beginning and ending at a specific date and time. Conceptually, this is similar to the idea of the Unix crontab (also known as a “cron job”) or the Windows Task Scheduler.
Here is the summary of Event object features:
- An event is uniquely identified by its name and the schema to which it is assigned
- An event performs a specific action according to a schedule. This action consists of an SQL statement, which can be a compound statement in a BEGIN ... END block;
- An event's timing can be either one-time or recurrent. A one-time event executes one time only. A recurrent event repeats its action at a regular interval.
For further details, visit: http://dev.mysql.com/doc/refman/5.1/en/events-overview.html
Let's create an Event object to increment 'counter' column of event_test table every minute:
mysql> CREATE EVENT my_event
ON SCHEDULE EVERY 1 MINUTE
DO UPDATE event_test set counter=counter+1;
Query OK, 0 rows affected (0.00 sec)
Metadata about events can be obtained as follows:
mysql> SHOW CREATE EVENT my_event\G
*************************** 1. row ***************************
               Event: my_event
            sql_mode:
           time_zone: SYSTEM
        Create Event: CREATE EVENT `my_event` 
ON SCHEDULE EVERY 1 MINUTE STARTS '2012-03-06 16:46:01' 
ON COMPLETION NOT PRESERVE ENABLE 
DO update event_test set counter=counter+1
character_set_client: latin1
collation_connection: latin1_swedish_ci
  Database Collation: latin1_swedish_ci
1 row in set (0.00 sec)

Since this is a repeating event that may not terminate within its scheduling interval, multiple instances of the event may end up executing simultaneously. If this is undesirable, use the GET_LOCK() function or LOCK TABLES to avoid it.
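The GET_LOCK() pattern works like a non-blocking mutex: each run tries to take a named lock with a zero timeout and simply skips the work if a previous instance still holds it. A Python sketch of the same idea (GET_LOCK('my_event', 0) behaves like the non-blocking acquire below; the lock and function names are illustrative, not MySQL API):

```python
import threading

event_lock = threading.Lock()  # stands in for MySQL's named lock 'my_event'

def run_event(work):
    # GET_LOCK('my_event', 0): try to acquire without waiting
    if not event_lock.acquire(blocking=False):
        return False          # previous instance still running; skip this run
    try:
        work()
        return True
    finally:
        event_lock.release()  # RELEASE_LOCK('my_event')
```

In the event body this would be: IF GET_LOCK('my_event', 0) THEN ... do the work ... ; DO RELEASE_LOCK('my_event'); END IF.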
Check whether 'my_event' has actually worked:
mysql> SELECT SLEEP(60); SELECT * FROM event_test;
+-----------+
| SLEEP(60) |
+-----------+
|         0 |
+-----------+
1 row in set (1 min 0.01 sec)

+---------+
| counter |
+---------+
|       0 |
+---------+
1 row in set (0.00 sec)

No, it has not worked, because the event scheduler thread is not running. The global event_scheduler system variable determines whether the Event Scheduler is enabled and running on the server.
mysql> select @@GLOBAL.event_scheduler ;
+--------------------------+
| @@GLOBAL.event_scheduler |
+--------------------------+
| OFF                      |
+--------------------------+
1 row in set (0.00 sec)
To enable the Event Scheduler, use this SQL:
mysql> set GLOBAL event_scheduler =ON;
Query OK, 0 rows affected (0.00 sec)
For more information about event scheduler, see http://dev.mysql.com/doc/refman/5.1/en/events-configuration.html
When the Event Scheduler is ON, the event scheduler thread is listed in the output of SHOW PROCESSLIST as a daemon process, and its state is represented as shown here:
mysql> SHOW PROCESSLIST;
+----+-----------------+-----------+------+---------+------+-----------------------------+------+
| Id | User            | Host      | db   | Command | Time | State                       | Info |
+----+-----------------+-----------+------+---------+------+-----------------------------+------+
| 43 | event_scheduler | localhost | NULL | Daemon  |   35 | Waiting for next activation | NULL |
+----+-----------------+-----------+------+---------+------+-----------------------------+------+
Let's check the status of 'counter' after turning the Event Scheduler ON:
mysql> SELECT SLEEP(60); SELECT * FROM event_test;
+-----------+
| SLEEP(60) |
+-----------+
|         0 |
+-----------+
1 row in set (1 min 0.00 sec)

+---------+
| counter |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

mysql> SELECT SLEEP(60); SELECT * FROM event_test;
+-----------+
| SLEEP(60) |
+-----------+
|         0 |
+-----------+
1 row in set (1 min 0.01 sec)

+---------+
| counter |
+---------+
|       2 |
+---------+
1 row in set (0.00 sec)
To disable my_event, use this ALTER EVENT statement:
ALTER EVENT my_event
    DISABLE;
 
This was just a simple example to show how to automate database tasks inside the MySQL server.

Wednesday, 25 January 2012

How to check if Innodb log files are big enough

InnoDB uses its log files to store changes that must be re-applied to the database after a service interruption (e.g. a power outage or a crash). Thus, it is important for good performance that the InnoDB log files are big enough.

In this example, I will demonstrate how to check the amount of InnoDB log file space in use (check current usage at peak times):
Examine the InnoDB monitor output (i.e. SHOW ENGINE INNODB STATUS\G) and look at these lines (if you are using 5.0, or 5.1 without the InnoDB plugin):
---
LOG
---
Log sequence number 2380 1505869110
Log flushed up to   2380 1505868967
Last checkpoint at  2380 936944426
1 pending log writes, 0 pending chkp writes
532415047 log i/o's done, 544.17 log i/o's/second

Perform the following calculation to find the log space used. The values to use are "Log sequence number" and "Last checkpoint at"; in this format each is printed as a (high, low) pair, where the full LSN is high * 4 GB + low.

select (( ( 2380 *  4 * 1024 * 1024 * 1024) + 1505869110 ) - ( ( 2380 *  4 * 1024 * 1024 * 1024) + 936944426 )) /1024/1024 "Space used in MB";
+------------------+
| Space used in MB |
+------------------+
|     542.56885910 |
+------------------+
1 row in set (0.00 sec)

In this example, 542 MB is more than 75% of the total log space, so the log size (512 MB) is too small. Ensure that the amount of log space used never exceeds 75% of the total log space; if it does, resize the InnoDB log files.
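The arithmetic above can be wrapped in a small helper (the function names are mine; the LSN pairs are the monitor values shown above):

```python
def lsn(high, low):
    # Pre-plugin InnoDB prints the LSN as two numbers, high and low,
    # where the full value is high * 4 GB + low.
    return high * 4 * 1024 ** 3 + low

def log_space_used_mb(seq, checkpoint):
    # seq and checkpoint are ("Log sequence number", "Last checkpoint at")
    # each given as a (high, low) pair from SHOW ENGINE INNODB STATUS.
    return (lsn(*seq) - lsn(*checkpoint)) / 1024 / 1024

used = log_space_used_mb((2380, 1505869110), (2380, 936944426))
print(f"{used:.2f} MB")  # the 542.57 MB computed above
```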


There is a slightly different method to calculate the InnoDB log file space used if you are using MySQL 5.5 or the InnoDB plugin in MySQL 5.1. Examine the InnoDB monitor output (i.e. SHOW ENGINE INNODB STATUS\G) and look at these lines:
---
LOG
---
Log sequence number 2388016708
Log flushed up to   2388016690
Last checkpoint at  2380597012
Perform the following calculation to find the log space used. Here "Log sequence number" and "Last checkpoint at" are printed as single values, so a simple subtraction suffices.

SELECT (2388016708 - 2380597012)/1024/1024 "Space used in MB";
+------------------+
| Space used in MB |
+------------------+
|     7.0759735111 |
+------------------+
1 row in set (0.00 sec)

In this example, 7 MB is well below 75% of the total log space. As of MySQL 5.5, recovery times have been greatly improved and the whole log flushing algorithm has been reworked; in 5.5 you generally want larger log files, since recovery is faster. Thus, large InnoDB log files let you take advantage of the new algorithm.
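In this format the whole check reduces to one subtraction plus the 75% rule. A sketch (the helper name and the example total capacity of 2 x 512 MB log files are assumptions for illustration; in practice the total is innodb_log_file_size * innodb_log_files_in_group):

```python
def log_usage(log_sequence_number, last_checkpoint, total_log_bytes):
    # MySQL 5.5 / InnoDB plugin print the LSN as a single value,
    # so used space is a plain subtraction.
    used = log_sequence_number - last_checkpoint
    return used, used <= 0.75 * total_log_bytes

# Values from the monitor output above; capacity assumed: 2 files x 512 MB.
used, within_limit = log_usage(2388016708, 2380597012, 2 * 512 * 1024 ** 2)
print(f"{used / 1024 / 1024:.2f} MB used, within 75% limit: {within_limit}")
```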

Monday, 23 January 2012

MySQL Server Tuning

MySQL server tuning is important. If you mainly use InnoDB tables, you need to check how well your InnoDB buffer pool is sized; InnoDB uses it to cache data and indexes. The larger you set this value, the less disk I/O is needed to access data in tables.

You can examine InnoDB buffer pool efficiency by looking at these status variables:

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_rea%';
| Innodb_buffer_pool_read_requests | 4519597979 |
| Innodb_buffer_pool_reads         | 55253      |

Innodb_buffer_pool_read_requests is the number of logical read requests against the buffer pool, and Innodb_buffer_pool_reads is the number of times InnoDB had to read data pages from disk to satisfy a request.

So Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests * 100 = 0.001 is the miss ratio (as a percentage). Thus we see that the vast majority of the time InnoDB is able to satisfy requests from memory; it's pretty normal for databases to have hot spots in which you access only a portion of the data the majority of the time.

Let's look at this example,

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_rea%';
| Innodb_buffer_pool_read_requests | 2905072850 |
| Innodb_buffer_pool_reads         | 1073291394 |

Calculate the InnoDB buffer pool miss ratio:
(1073291394 / 2905072850 * 100) ≈ 37

Here InnoDB is doing far more disk reads; a 37% miss ratio means the buffer pool is not big enough!
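Both examples follow the same formula; a small sketch that computes the miss ratio from the two counters (function name is mine, counter values are from the two examples above):

```python
def buffer_pool_miss_pct(reads, read_requests):
    # Innodb_buffer_pool_reads    = pages that had to be read from disk
    # Innodb_buffer_pool_read_requests = logical read requests
    return reads / read_requests * 100

# Example 1: healthy server, almost everything served from memory
print(f"{buffer_pool_miss_pct(55253, 4519597979):.3f}% misses")
# Example 2: undersized buffer pool
print(f"{buffer_pool_miss_pct(1073291394, 2905072850):.1f}% misses")
```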

Another way to examine InnoDB efficiency is the SHOW ENGINE INNODB STATUS output:

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 1838823262; in additional pool allocated 10550784
Buffer pool size   102400
Free buffers 0 

Database pages     101424
Modified db pages  24189
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 2050214912, created 313658031, written 755828632
36.95 reads/s, 12.12 creates/s, 7.05 writes/s

Buffer pool hit rate 993 / 1000

  • If there are numerous free buffers, the buffer pool may be larger than necessary and you could reduce it.
  • A buffer pool hit rate of 1000/1000 indicates a 100% hit rate.
  • Slightly lower hit rate values may be acceptable.
  • If you find the hit rate is less than 950/1000 (95%), you may need to increase the InnoDB buffer pool size.

    • Note: On a dedicated database server, you may set innodb_buffer_pool_size up to 80% of the machine physical memory size. However, do not set it too large because competition for physical memory might cause paging in the operating system.

      Database size is usually much bigger than available RAM, so most of the time it is not feasible to have a buffer pool as large as the database. Luckily, most workloads have hot spots: the most frequently accessed data pages stay in memory. But if your application performs a lot of random disk I/O, your database is too large to fit in memory, and only a small percentage of data pages can be cached in the buffer pool, then adding more RAM is the best solution to a random-read I/O problem.

      If you still get a poor buffer pool hit rate, enable the slow query log to capture bad queries that perform table scans and hence blow out the buffer pool cache.

      Tuesday, 9 August 2011

      InnoDB Locking Regression for INSERT IGNORE



      Our application attempts to INSERT IGNORE the same row of data from many different connections to the same InnoDB table. In our test runs, we noticed that the 5.0 code created S locks while the 5.1 code created X locks for the same set of actions.

      In the code paths for 5.0 and below, INSERT IGNORE processing would take
      S-locks on any duplicate rows. The X-lock was reserved for only new rows
      added to the table.

      In 5.1 the locking logic was rewritten and all rows touched by INSERT IGNORE
      are X-locked instead. This can turn a parallel data merge process into
      essentially a single-threaded process because the connections that are not
      actually adding a row to the table must wait for their X-lock which requires
      the termination of the other locking thread(s).

      The fault can be traced to this logic (5.1+):

      if (allow_duplicates) {

              /* If the SQL-query will update or replace
              duplicate key we will take X-lock for
              duplicates ( REPLACE, LOAD DATAFILE REPLACE,
              INSERT ON DUPLICATE KEY UPDATE). */

              err = row_ins_set_exclusive_rec_lock(
                      LOCK_ORDINARY, rec, index, offsets, thr);

      This means that all INSERT IGNORE will take an X-lock because earlier this
      flag was set:

      case HA_EXTRA_IGNORE_DUP_KEY:
      thd_to_trx(ha_thd())->duplicates |= TRX_DUP_IGNORE;
      break;

      This defect has been reported to MySQL.

      Thursday, 25 November 2010

      Query Tuning (Query optimization) Part-1

      In this article I'll try to explain query tuning techniques. To start, we'll need to capture slow queries, that is, log queries executing longer than the long_query_time server variable (in seconds, with microsecond resolution supported when logging to a file). The slow query log helps identify candidates for query optimization.

      Assumptions:

      1) The reader has basic MySQL/Unix skills.


      1) Enable slow query log:
      Add following lines into MySQL option file (i.e. /etc/my.cnf) under [mysqld] section

      log-slow-queries=mysql_slow.log
      long-query-time=2

      Note: If you specify no name for the slow query log file, the default name is host_name-slow.log.

      And restart MySQL server when it's safe to do so e.g. /etc/init.d/mysql restart

      For further details on slow query log, please visit here:
      # if using 5.1
      http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
      # if using 5.0
      http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html

      If you are using MySQL 5.1, you can enable slow log this way:

      mysql> set global long_query_time = 2;
      mysql> set global slow_query_log = 1;
      mysql> set global slow_query_log_file = 'mysql_slow.log';

      2) Process Slow Query Log:
      You can process the whole slow log file and find most frequent slow queries using mysqldumpslow utility

      # Find top 10 slowest queries

      $ mysqldumpslow -s t -n 10 /path/to/mysql_slow.log > mysql_slow.log.clog
      Reading mysql slow query log from /path/to/mysql_slow.log

      Count: 1 Time=1148.99s (1148s) Lock=0.00s (0s) Rows=0.0 (0),
      insert into a select * from b
      Count: 37 Time=2.28s (84s) Lock=0.11s (4s) Rows=0.0 (0),
      Update a set CONTENT_BINARY = 'S' where ID = 3874
      Count: 1 Time=29.31s (29s) Lock=0.00s (0s) Rows=0.0 (0), select max(LOCK_VERSION) from b
      ...

      For further details on using mysqldumpslow please visit here:
      http://mysqlopt.blogspot.com/search?q=mysqldumpslow
      http://dev.mysql.com/doc/refman/5.1/en/mysqldumpslow.html
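The summary lines can also be post-processed programmatically, for example to total the aggregate time per query pattern. A parser sketch (the line layout is assumed from the sample output above; function and variable names are mine):

```python
import re

# Match "Count: 37 Time=2.28s (84s) ..." lines from mysqldumpslow output.
# Group 1: execution count, group 2: average time, group 3: aggregate time.
LINE = re.compile(r"Count:\s*(\d+)\s+Time=([\d.]+)s\s+\((\d+)s\)")

def total_time(summary_text):
    # Sum the aggregate (parenthesised) seconds across all query patterns.
    return sum(int(m.group(3)) for m in LINE.finditer(summary_text))

sample = """\
Count: 1 Time=1148.99s (1148s) Lock=0.00s (0s) Rows=0.0 (0),
Count: 37 Time=2.28s (84s) Lock=0.11s (4s) Rows=0.0 (0),
Count: 1 Time=29.31s (29s) Lock=0.00s (0s) Rows=0.0 (0),
"""
print(total_time(sample))  # 1148 + 84 + 29
```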

      3) Using EXPLAIN to analyze slow queries:
      This helps us find:
      a) whether the optimizer is using existing indexes
      b) whether a query rewrite would help
      c) where new indexes are needed

      4) Main query performance issues:
      a) Full table scans
      b) Temporary tables
      c) Filesort

      5) Let's start with basics, that is, 'Full table scans' situations

      mysql> EXPLAIN select * from tblmetricshort where timestamp between '2010-08-17 00:00:00' and '2010-08-17 01:20:00'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblmetricshort
      type: ALL
      possible_keys: NULL
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 400336
      Extra: Using where
      1 row in set (0.00 sec)

      The EXPLAIN output shows that the optimizer will do a FULL TABLE SCAN, as indicated by type: ALL.

      The solution is to add an index on `timestamp` column, as we are filtering rows using this column.

      mysql> alter table tblmetricshort add index (`timestamp`);
      Query OK, 0 rows affected (1.84 sec)
      Records: 0 Duplicates: 0 Warnings: 0

      Let's re-run EXPLAIN on the same query

      mysql> EXPLAIN select * from tblmetricshort where timestamp between '2010-08-17 00:00:00' and '2010-08-17 01:20:00'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblmetricshort
      type: range
      possible_keys: timestamp
      key: timestamp
      key_len: 4
      ref: NULL
      rows: 1
      Extra: Using where
      1 row in set (0.00 sec)

      Now let's remove this index and look at another example:

      mysql> alter table tblmetricshort drop index `timestamp`;
      Query OK, 0 rows affected (0.03 sec)
      Records: 0 Duplicates: 0 Warnings: 0

      mysql> EXPLAIN SELECT * FROM tblmetricshort
      INNER JOIN tblMetric
      ON (tblmetricshort.fkMetric_ID=tblMetric.pkMetric_ID)
      WHERE timestamp between '2010-08-17 00:00:00' and '2010-08-17 01:20:00'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblmetricshort
      type: ALL
      possible_keys: NULL
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 400336
      Extra: Using where
      *************************** 2. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblMetric
      type: eq_ref
      possible_keys: PRIMARY
      key: PRIMARY
      key_len: 3
      ref: metrics.tblmetricshort.fkMetric_ID
      rows: 1
      Extra: Using where
      2 rows in set (0.00 sec)

      In the above example, the optimizer performs a full table scan on tblmetricshort first; then, when MySQL goes looking for rows in tblMetric, instead of table scanning like before, it uses the value of fkMetric_ID against the PRIMARY KEY of tblMetric to fetch the matching rows directly. Thus this SQL is partially optimized: it still scans all rows of tblmetricshort, but it uses an index for the join.

      The solution is again to add an index on the `timestamp` column and re-run EXPLAIN on the same query:

      mysql> EXPLAIN SELECT * FROM tblmetricshort INNER JOIN tblMetric ON (tblmetricshort.fkMetric_ID=tblMetric.pkMetric_ID) WHERE timestamp between '2010-08-17 00:00:00' and '2010-08-17 01:20:00'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblmetricshort
      type: range
      possible_keys: timestamp
      key: timestamp
      key_len: 4
      ref: NULL
      rows: 1
      Extra: Using where
      *************************** 2. row ***************************
      id: 1
      select_type: SIMPLE
      table: tblMetric
      type: eq_ref
      possible_keys: PRIMARY
      key: PRIMARY
      key_len: 3
      ref: metrics.tblmetricshort.fkMetric_ID
      rows: 1
      Extra: Using where
      2 rows in set (0.00 sec)


      Continue....


      Tuesday, 9 November 2010

      Short index length (good or bad)

      If the workload on the DB server is disk bound and you do not have enough memory to increase the InnoDB buffer pool size, you can help performance by reducing the length of your indexes: more index entries fit into memory, which speeds up write operations. It may affect SELECT queries, but not necessarily for the worse. Allow me to show an example; I will lower the index length on my test table from 40 characters to just 8:
      Note: You may consider trying different lengths for your index and checking which best fits your workload (both write and read speed).

      ALTER TABLE csc52021 DROP INDEX value_40, ADD INDEX value_8(value(8));
      
      To look at all the rows that have a value between 'aaa%' and 'b':
      mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE value>'aaa%' AND value<'b';
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      | 1  | SIMPLE      | csc52021 | range | value_8       | value_8 | 11      | NULL | 9996 | Using where |
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      1 row in set (0.00 sec)
      
      mysql> SELECT COUNT(*) FROM csc52021 WHERE value>'aaa%' AND value<'b';
      +----------+
      | COUNT(*) |
      +----------+
      | 5533     |
      +----------+
      1 row in set (0.02 sec)
      
      As you can see, the index is used.
      What if we look for a specific value?
      mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE
          -> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      | id | select_type | table    | type | possible_keys | key     | key_len | ref   | rows | Extra       |
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      | 1  | SIMPLE      | csc52021 | ref  | value_8       | value_8 | 11      | const | 1    | Using where |
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      1 row in set (0.00 sec)
      
      mysql> SELECT COUNT(*) FROM csc52021 WHERE
          -> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
      +----------+
      | COUNT(*) |
      +----------+
      | 1        |
      +----------+
      1 row in set (0.00 sec)
      

      If SELECT statements look up values with very low prefix cardinality (for example, if we had millions of rows starting with the same first 16 characters, "a816d6ce93c2aa99"), then a short index is not very efficient.
      On the other hand, a short index length allows more index values to fit in memory, increasing lookup speed.

      In my test table, I can decrease the index length to 4 and still have good performance. For example:

      mysql> ALTER TABLE csc52021 DROP INDEX value_8, ADD INDEX value_4(value(4));
      
      mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE
          -> value='a816d6ce93c2aa992829f7d0d9357db896d5e7de';
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      | id | select_type | table    | type | possible_keys | key     | key_len | ref   | rows | Extra       |
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      | 1  | SIMPLE      | csc52021 | ref  | value_4       | value_4 | 7       | const | 2    | Using where |
      +----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
      1 row in set (0.00 sec)
      
      mysql> EXPLAIN SELECT COUNT(*) FROM csc52021 WHERE value LIKE 'a816%';
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      | 1  | SIMPLE      | csc52021 | range | value_4       | value_4 | 7       | NULL | 2    | Using where |
      +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
      1 row in set (0.00 sec)
      
      mysql> SELECT COUNT(*) FROM csc52021 WHERE value LIKE 'a816%';
      +----------+
      | COUNT(*) |
      +----------+
      | 2        |
      +----------+
      1 row in set (0.00 sec)
      

      To summarize:
      if your data has good prefix cardinality, a shorter index length will speed up both write and read operations, especially if you are I/O bound.
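A quick way to pick a prefix length is to compare the number of distinct prefixes against the number of distinct full values (in SQL: COUNT(DISTINCT LEFT(col, n)) / COUNT(DISTINCT col)). A sketch of the same calculation (the sample values are made up for illustration):

```python
def prefix_selectivity(values, length):
    # Ratio of distinct prefixes to distinct full values; 1.0 means the
    # prefix discriminates as well as the whole column.
    full = len(set(values))
    return len({v[:length] for v in values}) / full

values = ["a816d6ce", "a816d6cf", "b23c1a00", "c9f0aa11"]
for n in (2, 4, 8):
    print(f"prefix({n}): selectivity {prefix_selectivity(values, n):.2f}")
```

Pick the shortest length whose selectivity is close enough to 1.0 for your workload.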