Avoid SQL Deadlocks -- Break Up Large Updates

Deadlocks in SQL occur when two queries each hold locks on certain rows, frequently for updates, and each needs rows the other has locked before it can proceed. Since neither can continue until the other releases its locks, the database resolves the standoff by killing one of the queries with a deadlock error. One of the surefire ways to invite this situation is to have a single update statement that modifies a large number of rows at once: the longer a statement holds its locks, the wider the window for a second query to collide with it.

I have seen a table with over 1.5 million rows that frequently hit deadlocks on update statements. The apparent culprit was an update statement that modified over 250,000 rows at once, taking 30 seconds or more to execute and running at least once an hour. While the schema of this table is in need of a major overhaul, it must keep working in its current state without opening such large windows for deadlocks to occur.

The update query, unfortunately, must run regularly to refresh some constantly changing data. It joins to several other tables, and the version of MySQL it is hosted on does not support limiting an update statement that uses joins. To break the update into chunks as small as 1,000 rows at a time, you must therefore first stage the data in a temporary storage location. Once the updated data is consolidated into the staging table, you can apply it to the large table one small chunk of rows at a time. Preparing the staging table still takes about 30 seconds, but that work does not lock the live table and so does not expose the server to deadlocks. Applying the staged rows all at once takes only about 4 seconds. Broken into chunks of 1,000 rows, the overall update still takes around 4 seconds, but each individual statement completes in a couple hundredths of a second, making it extremely unlikely that a deadlock will occur.
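The staging-then-chunking approach can be sketched as follows. This is a minimal illustration, not the original post's code: the table names, the `price` column, the chunk size, and the use of Python with an in-memory SQLite database are all assumptions standing in for the real MySQL setup. The key idea carries over: do the expensive join work once into a staging table, then apply the staged values in short-lived transactions keyed on the primary key, committing after every chunk so no statement holds locks for long.

```python
import sqlite3

CHUNK_SIZE = 1000  # rows per batch; tune for your workload

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A large live table and a staging table holding the freshly computed
# values. Names and columns here are illustrative.
cur.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
cur.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, price REAL)")
cur.executemany("INSERT INTO products VALUES (?, ?)",
                [(i, 1.0) for i in range(5000)])

# Step 1: stage the recalculated values. In the real system this is
# where the slow, join-heavy query runs -- against the staging table,
# so it never locks rows in the live table.
cur.executemany("INSERT INTO staging VALUES (?, ?)",
                [(i, 2.0) for i in range(5000)])
conn.commit()

# Step 2: apply the staged values in small, short-lived transactions,
# walking the primary key so each chunk is cheap to find.
last_id = -1
while True:
    cur.execute("SELECT id, price FROM staging WHERE id > ? "
                "ORDER BY id LIMIT ?", (last_id, CHUNK_SIZE))
    batch = cur.fetchall()
    if not batch:
        break
    cur.executemany("UPDATE products SET price = ? WHERE id = ?",
                    [(price, row_id) for row_id, price in batch])
    conn.commit()  # release locks after every chunk
    last_id = batch[-1][0]
```

Each chunk is its own transaction, so the locks taken by any single `UPDATE` are held only for hundredths of a second rather than for the full run.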

In the end, I was able to update over 250,000 rows in about 4 seconds, with little chance of triggering the deadlocks that were previously almost guaranteed to occur.
