Wednesday, 1 April 2009

Concurrency, transactions and isolation levels

Concurrency, transactions and isolation levels are a complicated matter, so I understand that the concepts are difficult to grasp at first.

Choosing an isolation level is basically a trade-off between the amount of concurrency and obtaining "the correct and expected" data. Higher isolation levels (like SERIALIZABLE) lower the amount of concurrency because one transaction may depend on the outcome of another and therefore cannot run at the same time (more locking is needed). However, this ensures that results are "more correct", as other transactions can't modify the data a transaction is working with.

It's difficult to explain these concepts in the abstract, so let me try with an example. Say you are at an ATM and want to withdraw some money. You ask the machine how much money you have and it says $500. So you say, OK, continue, and ask to withdraw $500. The ATM will do:

withdraw($500) if balance >= $500

Just before you ask the ATM to withdraw (but after you've seen your balance), your telephone company tries to withdraw $100 to pay for your phone bill.

With READ COMMITTED, the phone bill transaction goes through and commits. The ATM will then read your balance again, which is now $500 - $100 = $400, and refuse to give you the $500 (it read the committed value and now knows you only have $400), even though it just told you that you had $500.
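
To make this concrete, here is a rough Python sketch of the READ COMMITTED case. It is a toy in-memory model, not a real database: the `Account` class, `atm_read_committed` and `phone_bill` are names I made up for illustration, and "committing" is modelled simply as writing to `account.balance`.

```python
class Account:
    """Toy account: 'balance' always holds the latest committed value."""
    def __init__(self, balance):
        self.balance = balance


def phone_bill(account):
    # The phone company's transaction: debit $100 and commit immediately.
    account.balance -= 100


def atm_read_committed(account, amount, interleaved=None):
    # First read: this is what the ATM screen shows you.
    shown = account.balance
    # Another transaction may commit between the two reads.
    if interleaved is not None:
        interleaved(account)
    # Second read: under READ COMMITTED it sees the newly committed value.
    current = account.balance
    if current >= amount:
        account.balance = current - amount
        return shown, True
    return shown, False


account = Account(500)
shown, ok = atm_read_committed(account, 500, interleaved=phone_bill)
print(shown, ok, account.balance)  # 500 False 400
```

The screen showed $500, but the withdrawal is refused because the second read already sees $400.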

With REPEATABLE READ, the ATM will read your balance, but it is guaranteed to read the same value it read before, i.e., $500, and will give you the $500. You may ask, what happens to the $100 withdrawal from the phone company? It depends on the implementation, but usually the phone bill transaction will try to do the update, find a lock and wait until the ATM has finished. That is why higher isolation levels have lower concurrency but more "accurate" data, as they make transactions more serializable.
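
The lock-and-wait behaviour can be sketched in Python too. Again, this is only a toy model under my own assumptions: a `threading.Lock` stands in for the lock the database would hold on your account row, the names (`Account`, `run_scenario`, `atm`, `phone_bill`) are hypothetical, and I assume the phone bill is declined if there is not enough money left once it finally gets its turn. Real databases may implement REPEATABLE READ quite differently.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()  # hypothetical stand-in for a row lock


def run_scenario():
    account = Account(500)
    events = []
    atm_has_read = threading.Event()
    customer_confirms = threading.Event()

    def atm(amount):
        # The ATM holds the lock for its whole transaction,
        # so the value it read cannot change underneath it.
        with account.lock:
            events.append(f"ATM reads {account.balance}")
            atm_has_read.set()
            customer_confirms.wait()  # you look at the screen and press OK
            if account.balance >= amount:
                account.balance -= amount
                events.append(f"ATM pays out {amount}")

    def phone_bill(amount):
        # Blocks here until the ATM transaction releases the lock.
        with account.lock:
            if account.balance >= amount:
                account.balance -= amount
                events.append(f"phone bill takes {amount}")
            else:
                events.append("phone bill declined")

    atm_thread = threading.Thread(target=atm, args=(500,))
    bill_thread = threading.Thread(target=phone_bill, args=(100,))
    atm_thread.start()
    atm_has_read.wait()       # make sure the ATM has read the balance first
    bill_thread.start()       # the phone bill arrives and has to wait
    customer_confirms.set()
    atm_thread.join()
    bill_thread.join()
    return events, account.balance


events, balance = run_scenario()
print(events, balance)
```

The two transactions end up running one after the other: the ATM reads $500 and pays out $500, and only then does the phone bill get to look at the account.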
