Avoid SQL Deadlocks -- Break Up Large Updates

Deadlocks in SQL occur when two transactions each hold row locks that the other needs: the first query locks a set of rows, frequently for an update, a second query locks a different set, and then each one tries to update rows the other is holding. Neither can proceed, so the database kills one of them with an error. One of the surefire ways to create this situation is to have a single update statement that modifies a large number of rows at once, holding a large number of locks for a long time.
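As a sketch of how this plays out, here is an illustrative timeline of two sessions deadlocking. The table and column names are hypothetical, and the two sessions' statements are interleaved in time:

```sql
-- Session A:
BEGIN;
UPDATE orders SET status = 'shipped' WHERE id BETWEEN 1 AND 100;    -- A locks rows 1..100

-- Session B:
BEGIN;
UPDATE orders SET status = 'billed' WHERE id BETWEEN 200 AND 300;   -- B locks rows 200..300

-- Session A:
UPDATE orders SET status = 'shipped' WHERE id BETWEEN 200 AND 300;  -- A blocks, waiting on B

-- Session B:
UPDATE orders SET status = 'billed' WHERE id BETWEEN 1 AND 100;     -- B blocks on A: deadlock,
                                                                    -- one session is killed with an error
```

The larger each update, the more rows it locks and the longer it holds them, which is exactly the window the rest of this post tries to shrink.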

I have seen a table with over 1.5 million rows that frequently hit deadlocks on update statements. The apparent culprit was an update statement that modified over 250,000 rows all at once, taking 30 seconds or more to execute and running at least once an hour. While the schema of this table is in need of a major overhaul, it must keep working in its current state without opening these large windows for deadlocks to occur.

The update query, unfortunately, must run regularly to refresh some constantly changing data. It joins to a few other tables, and the version of MySQL it is hosted on does not support limiting an update statement that uses joins. As a result, in order to break the update into chunks as small as 1,000 rows at a time, you must first stage the data in a temporary storage location. Once the updated data is consolidated into the staging table, you can then apply it to the large table a small chunk of rows at a time.

Preparing the staging table still takes about 30 seconds, but it does not expose the server to deadlocks. Applying the staged rows all at once takes only about 4 seconds. Broken into chunks of 1,000 rows, the overall update still finishes in around 4 seconds, but each individual update completes in a couple hundredths of a second, making it extremely unlikely that a deadlock will occur.
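The staging-then-chunking approach above can be sketched as follows. This is a minimal illustration, not the original query: the table, column names, and row counts are all hypothetical, and Python's built-in sqlite3 stands in for the MySQL server so the sketch is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Large target table (scaled down for the example).
cur.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
cur.executemany("INSERT INTO products VALUES (?, ?)",
                [(i, 10.0) for i in range(5000)])

# The staging table holds the fully joined, consolidated new values,
# so the chunked updates that follow need no joins at all.
cur.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, new_price REAL)")
cur.executemany("INSERT INTO staging VALUES (?, ?)",
                [(i, 12.5) for i in range(0, 5000, 2)])
conn.commit()

CHUNK = 1000
cur.execute("SELECT MIN(id), MAX(id) FROM staging")
lo, hi = cur.fetchone()

# Walk the staging table in id ranges; each UPDATE touches at most
# CHUNK ids and commits immediately, keeping every lock window tiny.
for start in range(lo, hi + 1, CHUNK):
    cur.execute(
        """UPDATE products
           SET price = (SELECT new_price FROM staging
                        WHERE staging.id = products.id)
           WHERE id IN (SELECT id FROM staging
                        WHERE id >= ? AND id < ?)""",
        (start, start + CHUNK),
    )
    conn.commit()

cur.execute("SELECT COUNT(*) FROM products WHERE price = 12.5")
print(cur.fetchone()[0])  # → 2500 rows updated from staging
```

The key design point is the commit inside the loop: each chunk is its own short transaction, so no single statement ever holds locks on more than CHUNK rows at a time.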

In the end, I was able to update over 250,000 rows in about 4 seconds with little chance of triggering a deadlock that previously was almost guaranteed to occur.
