Tuesday, April 27, 2010

Mysql Replication Slave Stopped Due to “Lock Wait Timeout Exceeded” Error


MySQL Replication: [ERROR] Slave: : Error 'Lock wait timeout exceeded; try restarting transaction' on query.

In a deadlock situation, the Mysql Replication server slave would let the transaction to retry up to the number of time which was set for the slave_transaction_retries system variable before stopping the slave and logging a message like the one bellow in the error log:

100419 0:03:47 [ERROR] Slave: : Error 'Lock wait timeout exceeded; try restarting transaction' on query.   Default database: 'mysqldb'. Query: 'UPDATE a_table SET colum_1 =a_value_colum + 1000,…,… WHERE id='12345'', Error_code: 1205
100419 0:03:47 [Note] Slave SQL thread exiting, replication stopped in log 'slave-1-log-bin.006034' at position 10470968

Refering to the MySQL manual; the 1205 error has the following description:

Error: 1205 SQLSTATE: HY000 (ER_LOCK_WAIT_TIMEOUT)
Message: Lock wait timeout exceeded; try restarting transaction.

Since the tables involved in the locks were InnoDB tables, we could check the InnoDB status by issuing the following command:
Mysql> show engine innodb status\G;
---------------------------------------------
LATEST DETECTED DEADLOCK
---------------------------------------------
*** (1) TRANSACTION:
TRANSACTION 0 2513168731, ACTIVE 0 sec, process no 17876, OS thread id 685890464 starting index read
mysql tables in use 2, locked 2
LOCK WAIT 5 lock struct(s), heap size 320, undo log entries 1
MySQL thread id 8018, query id 317267495 user_id Updating
...
There were some useful information regarding the lock from the above output under the LATEST DETECTED DEADLOCK section which details the causes of the deadlock such as what kind of locks were used, and the query statements that involved in the lock.

To resolve the deadlock issues, we may need to debug the query statements and make code changes. In the mean time, we can check innodb_lock_wait_timeout and slave_transaction_retries to see if we can increase it a bit to avoid stopping the slave. To do so, issue the following statement from the mysql client:

Mysql> show variables;
innodb_lock_wait_timeout | 150

slave_transaction_retries | 10

The innodb_lock_wait_timeout variable set the length of time a transaction will wait for a row lock resource before “giving up”. The default value is 50 seconds. Please note that this variable applies to InnoDB row locks only. A MySQL table lock does not happen inside InnoDB and this timeout does not apply to waits for table locks. InnoDB does detect transaction deadlocks in its own lock table immediately and rolls back one transaction. The lock wait timeout value does not apply to such a wait.

The slave_transaction_retries set the number of time the replication slave thread to retry before stopping with an error due to any of the following situation (the default for this variable is 10):

· an InnoDB deadlock
· the transaction's execution time exceeded InnoDB's innodb_lock_wait_timeout
· the transaction’s execution time exceeded NDBCLUSTER’s TransactionDeadlockDetectionTimeout
· the transaction’s execution time exceeded NDBCLUSTER's TransactionInactiveTimeout,

Please also notice that, if the slave_transaction_retries server variable is set, the transaction will hold the lock for (slave_transaction_retries * innodb_lock_wait_timeout) seconds before giving up. In the example above, it means that a lock will be held for proximately 1,500 seconds (10*150). Therefore, try to find out what is causing this lock and fix it maybe a better solution.

In any case, please remember to restart the slave because deadlock error stopped the slave as it had indicated in the error log.

Reference:

1 comment:

Anonymous said...

I would like to exchange links with your site www.blogger.com
Is this possible?