Friday, 2 July 2010

Repair by keycache vs Repair by sorting

It is advised to disable keys on MYISAM/ARCHIVE tables before performing bulk insert operations should give a considerable speedup, especially when you have many indexes. And then enable keys to to re-create missing indexes, this task might take much longer than bulk insert operation, sometimes it may take days to finish.

One approach would be not to disable keys at first place but would this help at all? the answer is it may or may not. Personally I do not suggest this approach. If you find enable key task is taking longer then start new session with MySQL and check the state of enable keys command. If A SHOW PROCESSLIST reveal, not surprisingly, the dreaded "Repair with keycache", which seems to multiply indexing time by a factor of 20 to 30. If you want to see "repair by sort" to be used then theoretical max size of every single index must fit into myisam_max_sort_file_size.

Solution: increase myisam_max_sort_file_size

mysql> SET GLOBAL myisam_max_sort_file_size=100000 * 1024 * 1024

And restart enable keys

Similarly if your repair takes several hours then either use above solution or
you might think of using the utility 'myisamchk' with the -n (or --sort-recover) option.

1 comment:

  1. One thing that isn't clear from this post... can I just kill the job that's in the "Repair with keycache" phase and just enable keys again after bumping myisam_max_sort_file_size? I'm doing a "TRUNCATE TABLE ... / LOAD DATA LOCAL INFILE ..." rather than an explicit DISABLE/ENABLE KEYS.

    Here is what the process list looks like:

    | 569712 | ens | ebi5-232.asdf:52625 | my_db | Query | 16690 | Repair with keycache | LOAD DATA LOCAL INFILE
    "MySQL/tmp_sample_genotype_single_bp.tsv" INTO TABLE
    (variation_id, allele_1, allele_2, sample_id) |

    i.e. if I kill this job, will it just kill the enable keys part or will it try to roll back the data it already loaded?