Solving PAGELATCH_EX and PAGELATCH_SH for SQL Server
(in Database, Interesting, SQL Server on Monday, February 21st, 2011 by Andrew)Chances are if you’re experiencing one of these two lock wait types you have a very high transaction application. In our case it’s true to the extent that we have greater then 3x the amount of connection resets on our connection pool then any other application using SQL Server in the firm (on just one instance about 178k connections per minute).
In our few extremely high transaction tables we wind up running into an issue with disposing of stale data. To solve this we partition the tables by day of year (and in some cases hour of day). Every day (or hour) we swap out the partition to a blank table and truncate that swap table for a sub second data cleanup. And because of this we’ve both partitioned and sorted our cluster index by this field.
A PAGELATCH_EX or PAGELATCH_SH alert is actually not related to IO yet you’ll run into it with the scenario I explained above. What winds up happening is that while we have a pretty beefy data server, we have contention on the last page in the table only allowing us to process a few inserts at a time. This is because our clustered index is sorting on day of year ascending which creates a hot spot at the end of the table.
In order to solve this, all we had to do is force our inserts to hit multiple parts of the table at once to prevent a single page lock from slowing down the entire system. In this part of our application we were collecting metrics data from all workstations in the firm and we had a column MachineName for every row in the table. We decided to add MachineName to the clustered index which then forced each insert to each hit a unique page in our table.
From then on, you’re sitting on the speed of your disk IO. Provided you have a decent storage setup you’ll be leaning on your log volume. The rest is pretty well documented in Google, I just didn’t find anything that easy for PAGELATCH issues so I figured I would share.
