I may only be an expert of sorts in computer information systems as opposed to a computer scientist, and one specializing in algorithms at that, but I’d like to think I know the difference between iterating on row set that diminishes with each iteration, and set constrained self calling operation. I am also pretty sure that iterating on a row set that diminishes with each iteration is parent form of a for each type iteration. I do know that SQL Server 2012 introduces for us a deployment model that makes the business of building recursive SSIS packages wholly more tenable.
The project based deployment model allows us to reference our packages in development. This is a welcome change as it allows us to make better use of a more modular approach to package development. One of the ways that this is manifested is in the Execute package task. Instead of creating a connection manager of either a SQL Server or File System type, you can use an internal reference. In SQL Server 2012 an execute package task can be used to call a package that is either internally referenced in the same project as the parent package, or externally referenced. An externally referenced package is essentially the only reference type in SQL 2008 R2. This requires the use of a connection manager which will need to change if we are developing anywhere other than our production environment, which if you do, you really shouldn’t.
Beyond being able to call packages stored in the project we are working in the internally referenced package call allows us to do something that is simply amazing. We can call the package we are building. That means that we can have a child package that can modify the parent package’s variable values as they are the same variables. In the example I have created a package that reads a list of packages and executes them asynchronously. I included annotations to help keep this relatively simple package conceptually simple.
The package begins with an Execute SQL Task. I use an expression(which we can see on the design space, new to SSDT) to set the disable parameter of this task. This is simply a test of the initialization parameter. If we are making the initial call to this package the parameter defaults to 1 which tells the populate Args SQL Task whether or not to run. For fun I pass the actual SQL to run to it through another parameter. Actually, it is quite a bit more than just fun but we will have to discuss environments, parameters and programmatic execution in package stores in later posts.
The next task is a scripting task in which I manipulate the array (I need to be mum on the specifics of this for various legal reasons I err on the side of caution for). I achieve two primary things by this task. The first and more obvious item accomplished is popping first argument off of the Args stack. The second is checking that there is at least one row left in the set (there are a lot of ways to skin this cat). If there isn’t we change our initialization value to -1. Otherwise we set it to continue.
Next we use a sequence container to assure that the previous steps have executed before we execute the remaining two in parallel. If we had an array length of zero for our Args set the -1 initialization parameter would disable this container and prevent the execution of its children members. Once it knows it’s disabled, the package completes its execution.
The next task is an execute package task. This simply executes the package we are in. We have access to and set the same variables with our actions. Since we have enforced precedence outside of the container and set our next Args and already disabled our Args population SQL Task, this task will inevitably lap our work task, allowing us to start all of our work without waiting for any of it to complete. If any of our work items are such that they require precedence constraints between them, we can engineer some sort of messaging service for them, or build parent packages that enforce the precedence between those packages and call them instead of their children.
Our final task is the one that performs the actual work. In this case I chose an execute process task that calls DTEXEC. Our Args set member retrieved in our script task is one of the parameters for the DTEXEC calls to packages that may have a different deployment model or project than our recursion. We could of just as easily used this approaches to spawn workers to operate on separate folders or servers.