Thursday 24 July 2014

SSIS: A deadlock was detected while trying to lock variable

Occasionally, especially when using an SSIS framework with an orchestration table of some kind and event handling, you might come across “Error: A deadlock was detected while trying to lock variable "User::xxxxxxx" for read access. A lock could not be acquired after 16 attempts and timed out.”

What this means is that two processes are attempting to read/write to the same SSIS variable at the same time.

The usual workaround is to explicitly lock/release your variables in your script task or transformation as described here: http://toddmcdermid.blogspot.com/2010/08/use-readonlyvariables-and.html, but what if it’s the out of the box tasks (like the Execute SQL Task that uses parameters and/or saves a resultset to variables) that are locking? Does this mean you need to code your own version in .NET as a script task and handle the variable locking better?

Execute SQL Task and Event Handlers

Imagine setting up a metadata or queue driven ETL framework similar to this:
image
The “Get Id” task populates some package level variables for this specific ETL package. The work is done in the “Do Work” container, then the ETL run is flagged as complete in some orchestration table.

Now, imagine the “Get Id” task fails and your custom OnError package level event handler uses one of the variables referenced in the “Get Id” task .
image
You’ll be greeted with the following error after 30-120 seconds (depending on your timeout settings).
image
Put simply this happens because the variable is locked in the “Get Id” task and the OnError event is fired asynchronously and bubbled up to the package level event handler before the “Get Id” task is finished. The “Log Failure” task in the event handler cannot get access to the variable, and it fires an error.

While it is a problem, it’s not the root cause of the failure, and it could leave some poor application support guy chasing dead ends for a package that locks up and dies intermittently. The actual issue is that the “Get Id” task expects a single row result but the query isn’t returning anything. Wouldn’t it be nice if we could ignore this error from showing up and just focus on the root cause?

Break the Execution Chain

There is a way to break the execution chain. That is, to separate the “Get Id” task and prevent it from getting to the package level OnError handler while still causing your package to fail. This will allow the “Get Id” task to complete before the package level OnError event handler is fired.
  1. Create an OnError event handler on the deadlocking task (“Get Id”) and set the system Propagate flag to False
    image
    You can leave the OnError event handler empty or have a custom event for an error that occurs at the initiation step of your package. The Propagate flag set to false means an error on the “Get Id” task doesn’t get bubbled up.
  2. Set the FailParentOnFailure and FailPackageOnFailure to true for the “Get Id” task. Not doing this will result in your package either continuing to execute after the Get Id failure or reporting execution success at the package level.
    image
That’s it! Now when the package executes we get no deadlocks
image

Disclaimer

The above example is just a quick and simple mock up that shows the deadlocking in action, and probably breaks quite a few design and best practise rules. There are multiple ways to code defensively to prevent a null result where a row is expected from breaking your package, but that’s a story for another day!

edit 31 July 2014: The above method won't fire the Start container's and package level OnError event handlers (since it's not being propagated). One way around this is to place a script task in the Get Id's OnError handler and just change the Dts.TaskResult = (int)ScriptResults.Success to Dts.TaskResult = (int)ScriptResults.Failure in the script. This script task will fail and propagate up to your higher level OnError handlers.

No comments:

Post a Comment