SQL Server Maintenance Plans (Part 2)
25 October 2012, by Mike McLean
Welcome to the second part of my post on SQL Server Maintenance Plans.
Last time, we looked at the Control Flow UI, and wrapped our heads around how to get the Plan to do the actions in the sequence that we want. This time, I’ll show you a GOTCHA relating to making those actions do what you want. Specifically, about …
Executing Agent Jobs from Maintenance Plans
Disclaimer: The following comes from SQL Server 2005 and may not apply to later versions, although I looked briefly at SQL 2008 and it appeared to be set up the same way.
As you may have noticed in the first post, one of the tasks that can be run is to execute an agent job. Great!
The UI seems pretty clear: you just select a Job to run and it will act just like all the other tasks – in that you plug it into the rest of your plan by assigning other tasks dependencies on whether the Job succeeds, and then SQL Server will run the Agent and run subsequent tasks according to the outcome.
If you experiment, you’ll start to think that the task is treated as always succeeding:
Running this, you’ll always end up hitting A!
What’s more … you’ll discover that the email gets sent immediately, not after the 5 minutes is up! Strange, since when the first task is a backup task you don’t get the Notification until the Backup completes.
The cause (as those of you who noticed the “View T-SQL” button in the UI may have guessed) is msdb.dbo.sp_Start_Job.
Microsoft helpfully provides a built in sproc to start Agent Jobs. Unfortunately, it does exactly that: it starts the job … and then turns around and boasts about how successful it was at starting the job. sp_Start_Job executes the job asynchronously – it returns immediately and the return value is “Did the job start successfully.” (Pretty much equivalent to “Did I find a job with that name, that wasn’t already running”) It doesn’t wait for the Job to finish and it doesn’t report the success value of the Job itself!
So the Maintenance Plan Task does the same: it will immediately return a success as long as the job started at all. Whilst I concede that knowing whether the job was started is valuable, it’s not much good if you’re trying to report on whether the complete backup plan as a whole has succeeded!
I couldn’t find a way around this at all, and was eventually forced to use a custom Synchronous Job Execution sproc mainly copied from Pavel Pawlowski and then tweaked for our particular use cases. All credit, and much thanks, goes to him.
The sproc actually comprises three parts:
- A function (fnGetJobStatus) to poll a job’s status “Running”, “Succeded”, “Failed”, plus a couple of obscure cases. It works by examining msdb.dbo.SysJobActivity and msdb.dbo.SysJobHistory
- A sproc (spWaitForJob) to sit in a lazy loop waiting for a job to finish.
- And finally a sproc (spSynchControlledRunJob) to handle running agent jobs in a controllable manner (also includes a “wait for it to finish, then start a new one” option).
(The details of these three SQL scripts can be found on Pavel’s site as linked above):
Finally, the sproc could then be executed directly in the Backup Plan, since one of the other options for tasks in a Maintenance Plan is “Execute custom SQL”.
And finally, we have reliable reporting of our Backup tasks. Phew!
Hopefully these posts will save someone a load of time as they discover and probe their way through these minefields. Or alternatively, it will give you a chance for a good laugh before telling me that there’s a trivial, built-in way of doing all this stuff (I spent a reasonable amount of time looking for such a solution at the time, but perhaps my Google-fu was weak that day).
Comments on doing this sort of thing in later SQL Server versions would be particularly welcome: I looked very briefly and confirmed that the “Start Agent Jobs” still uses sp_Start_Job, but perhaps there’s an obscure, better, new way around this issue?