How one small change caused the CPU to go crazy
Finding the cause of very high CPU usage on your SQL Server may not be as easy as running scripts against a few DMV's, looking at perfmon or the activity on your server. We found this out recently when a SQL Server had its CPU at 100% over an extended period of time. Usually the CPU's average 30% utilisation at best. We asked what had changed and got told "Nothing". We looked at the usual suspects to try and identify the root cause of the CPU usage. The usual queries against DMVs showed nothing exceptional, perfmon did not highlight a particular SPID, who2 showed nothing and the job activity monitor showed no long running or even running jobs.
It looked as though it was a DDOS attack but there was nothing to indicate this. The only clue was when we ran a profiler trace we noticed a stored procedure that was being executed a very large amount of times and each execution was taking approx. 6000 ms. When we traced this back we found that this stored procedure was started by a SQL Server Agent job that called the command line windows script host (cscript) to execute the stored procedure. So each execution of the stored procedure was within its own SPID.
When we questioned about this stored procedure, we found that the stored procedure had been changed and it was just missing an index. When the index was added, the CPU dropped.
So the root cause was a change to a very heavily used stored procedure (which we weren't told about), that was invoked in an uncommon method (so it was difficult to trace) and a missing index that caused the CPU to go to 100%.