Executing asynchronous processes in a system can enhance performance and reduce waiting times. Why not apply this concept to SQL?
A classic example is a bank branch that runs a nightly check across all its databases, a process that is usually very time consuming. Running the checks asynchronously and in parallel, thread-style, can cut the processing time dramatically.
Most asynchronous processes are side operations that can be deferred and run in parallel; these include writing to log files, reporting, and so on.
Here are some of the common use cases for using parallel processes:
- Distributing workload by processing multiple logical tiers and/or multiple databases.
- Reducing the duration of multiple writes inside the scope of a procedure.
- Writing asynchronously to a log, audit, event, or any other large table with no wait time.
- Performing “offline non-transactional writes” inside the scope of a transaction.
- Communicating with a distant database with minimal network lag.
When working with asynchronous processes, there are some constraints to take into account:
Output parameters, query output, procedure return codes, and function results are not supported, since there is no actual receiver.
You can work around this by separating the actual back-end work from the process. For example,
all processes can write to a shared temporary table, and later a single union query gathers the results.
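One way to sketch this pattern in T-SQL, with hypothetical table and value names: each worker inserts its result into a shared results table instead of returning output parameters, and a single query collects everything afterward.

```sql
-- Hypothetical sketch: workers write results to a shared table instead of
-- returning output parameters, since asynchronous calls have no receiver.
CREATE TABLE dbo.AsyncResults
(
    WorkerName  sysname        NOT NULL,
    ResultValue nvarchar(max)  NULL,
    CompletedAt datetime2      NOT NULL DEFAULT SYSDATETIME()
);
GO
-- Inside each worker procedure, an INSERT replaces the OUTPUT parameter:
INSERT INTO dbo.AsyncResults (WorkerName, ResultValue)
VALUES (N'TestProc1', N'done');
GO
-- Once all workers have finished, a single query gathers the results:
SELECT WorkerName, ResultValue, CompletedAt
FROM dbo.AsyncResults
ORDER BY CompletedAt;
```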
Complex data types, including user-defined read-only table types, text, image, xml, and the spatial types, are usually not recommended for procedure input, since the parameters are combined into XML.
You must enable Service Broker for the queue to process the messages. Also consider “where” the queue will live and how many “readers” it will support, so that multiple parallel runs are possible rather than only a single execution.
To enable runs of procedures on different databases, instances, or servers, you must enable the TRUSTWORTHY property on the queue database.
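Both settings can be applied with ALTER DATABASE; the database name here is illustrative:

```sql
-- Enable Service Broker on the database hosting the queues.
-- ROLLBACK IMMEDIATE disconnects open sessions so the change can be applied.
ALTER DATABASE AsyncDemo SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;

-- Needed when activated procedures must reach other databases or servers.
ALTER DATABASE AsyncDemo SET TRUSTWORTHY ON;
```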
Unless you put the messages in the same queue, there is no way of knowing which procedure will run first.
You should also consider how the receiver will communicate errors, warnings, or any other responses back to the executing procedure.
Let’s give it a try
Below is a simple example of regular processing versus parallel processing. Regular processing will be done first.
In this example:
- A database is created.
- The broker service is enabled.
- A table and two procedures are created. The two procedures add data to the table.
- We run the two procedures.
The two procedures simply add a value to the table, but before doing so, each waits for one second. This delay gives us a better way to measure the procedures' execution time.
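The steps above can be sketched as follows; all object names are illustrative:

```sql
CREATE DATABASE AsyncDemo;
GO
ALTER DATABASE AsyncDemo SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;
GO
USE AsyncDemo;
GO
CREATE TABLE dbo.TestTable
(
    Id         int IDENTITY(1,1) PRIMARY KEY,
    Val        nvarchar(50) NOT NULL,
    InsertedAt datetime2    NOT NULL DEFAULT SYSDATETIME()
);
GO
CREATE PROCEDURE dbo.TestProc1
AS
BEGIN
    WAITFOR DELAY '00:00:01';   -- simulate one second of work
    INSERT INTO dbo.TestTable (Val) VALUES (N'Proc1');
END;
GO
CREATE PROCEDURE dbo.TestProc2
AS
BEGIN
    WAITFOR DELAY '00:00:01';
    INSERT INTO dbo.TestTable (Val) VALUES (N'Proc2');
END;
GO
-- Regular (sequential) processing: roughly two seconds in total.
EXEC dbo.TestProc1;
EXEC dbo.TestProc2;
```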
To run the two procedures in parallel, we need to create a dedicated queue, service, and receiver procedure for each one.
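For TestProc1, the Service Broker plumbing might look like the sketch below (TestProc2 gets an analogous queue, service, and receiver); the message type, contract, and object names are illustrative:

```sql
-- Illustrative Service Broker plumbing for TestProc1.
CREATE MESSAGE TYPE RunProcMsg VALIDATION = NONE;
CREATE CONTRACT RunProcContract (RunProcMsg SENT BY INITIATOR);
GO
-- Create the queue first (without activation), then the receiver that reads it.
CREATE QUEUE dbo.Queue1;
GO
CREATE PROCEDURE dbo.ReceiverProc1
AS
BEGIN
    DECLARE @h uniqueidentifier, @msgType sysname;

    RECEIVE TOP (1) @h = conversation_handle,
                    @msgType = message_type_name
    FROM dbo.Queue1;

    IF @msgType = N'RunProcMsg'
    BEGIN
        EXEC dbo.TestProc1;          -- the actual work
        END CONVERSATION @h;         -- tell the initiator we are done
    END
    -- Error and EndDialog handling omitted for brevity.
END;
GO
-- Attach the receiver as an activation procedure.
ALTER QUEUE dbo.Queue1
    WITH ACTIVATION (STATUS = ON,
                     PROCEDURE_NAME = dbo.ReceiverProc1,
                     MAX_QUEUE_READERS = 1,
                     EXECUTE AS OWNER);
GO
CREATE SERVICE Service1 ON QUEUE dbo.Queue1 (RunProcContract);
```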
Finally, an executing procedure, ProcWrapper, is created. It sends a conversation into each of the two queues and waits until all conversations are over. This creates two processes that work like threads:
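A minimal ProcWrapper might be sketched as below. It assumes target services Service1 and Service2 with a contract RunProcContract already exist, and it also needs an initiator service of its own; all of these names are illustrative.

```sql
-- Initiator side: a service with no contract can only start conversations.
CREATE QUEUE dbo.InitQueue;
CREATE SERVICE InitService ON QUEUE dbo.InitQueue;
GO
CREATE PROCEDURE dbo.ProcWrapper
AS
BEGIN
    DECLARE @h1 uniqueidentifier, @h2 uniqueidentifier;

    -- Fire one conversation per queue; the activation procedures do the work.
    BEGIN DIALOG CONVERSATION @h1
        FROM SERVICE InitService TO SERVICE 'Service1'
        ON CONTRACT RunProcContract WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @h1 MESSAGE TYPE RunProcMsg;

    BEGIN DIALOG CONVERSATION @h2
        FROM SERVICE InitService TO SERVICE 'Service2'
        ON CONTRACT RunProcContract WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @h2 MESSAGE TYPE RunProcMsg;

    -- Poll until the receivers have ended both conversations
    -- ('DI' = disconnected inbound, 'CD' = closed).
    WHILE EXISTS (SELECT 1 FROM sys.conversation_endpoints
                  WHERE conversation_handle IN (@h1, @h2)
                    AND state NOT IN (N'DI', N'CD'))
        WAITFOR DELAY '00:00:00.100';

    -- Close our side of both dialogs; production code should instead
    -- RECEIVE the EndDialog messages from dbo.InitQueue.
    END CONVERSATION @h1;
    END CONVERSATION @h2;
END;
```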
It’s time to run the procedures and see how parallel execution makes things faster:
Running TestProc1 and TestProc2 one after the other obviously takes two seconds (each procedure has a one-second delay). Running the wrapper procedure, which executes TestProc1 and TestProc2 in parallel in a thread-like manner, yields a total time of one second. In other words, we cut the time in half!
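Both runs can be timed with DATEDIFF; this assumes the procedures live in the dbo schema:

```sql
DECLARE @t0 datetime2 = SYSDATETIME();

-- Sequential: each call blocks for its full one-second delay.
EXEC dbo.TestProc1;
EXEC dbo.TestProc2;
SELECT DATEDIFF(MILLISECOND, @t0, SYSDATETIME()) AS SequentialMs;  -- ~2000

SET @t0 = SYSDATETIME();
-- Parallel: both conversations run concurrently.
EXEC dbo.ProcWrapper;
SELECT DATEDIFF(MILLISECOND, @t0, SYSDATETIME()) AS ParallelMs;    -- ~1000
```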
This example made asynchronous processing happen. However, it involves many steps and code that needs to be maintained. The procedures in this example are extremely simple; in the real world we all deal with much more complicated logic. Furthermore, when working on multiple servers, the solution becomes even more complex.
Now, with PartitionDB, you just call PdbsetASynchronizedObject, pass the database and the procedure, and the parallel processing is set up for you:
The procedure TestAsyncSP wraps TestProc1 and TestProc2. Running it takes a total of only one second!
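An illustrative call is sketched below. Only the parameter names are taken from the reference table at the end of this article; the call shape itself is an assumption, and the exact syntax may differ in your PartitionDB version.

```sql
-- Hypothetical invocation; parameter names come from the PartitionDB
-- parameter list, but the call shape is assumed.
EXEC PdbsetASynchronizedObject
    @SourceDatabaseName   = N'AsyncDemo',      -- database containing the procedure
    @SourceObjectName     = N'TestProc1',      -- repeat the call for TestProc2
    @TargetObjectName     = N'TestAsyncSP',    -- name of the generated wrapper
    @DropAllProcedureName = N'PdbDrop_TestAsyncSP',
    @WaitForDelay         = 1;                 -- poll for completion every second
```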
We do understand that many objects are involved. Therefore, we provide these options to make things easier:
- A drop procedure – running it erases all the created objects. You can set its name by passing it in the DropAllProcedureName parameter, or keep the default, which is PdbDrop_TestAsyncSP in our case.
- An option to store all these objects in a separate database – if you wish to keep your database clean, pass a different database name in the TargetDatabaseName parameter.
With PartitionDB functions you can avoid all the hassle and let PartitionDB's async mechanism manage the parallel commands, while staying remarkably simple to use.
For your reference, here is a list of configuration parameters to pass to PdbsetASynchronizedObject:
| Parameter | Description | Required |
|---|---|---|
| SourceDatabaseName | Database name that contains the procedure | Yes |
| SourceLinkedServerName | Linked server name for the source database | No |
| SourceSchemaName | Schema name of the stored procedure | No |
| SourceObjectName | Stored procedure name | Yes |
| TargetDatabaseName | Since many objects are created, it is recommended to use a dedicated database to contain them all (services, queues, the wrapper stored procedure, etc.) | No |
| TargetLinkedServerName | Linked server name for RunAsDatabaseName | No |
| TargetSchemaName | Schema name for the wrapper procedure | No |
| TargetObjectName | Name of the wrapper procedure | No |
| DropAllProcedureName | Since many objects are created, we provide the option to clear them all at once by running a drop procedure. This parameter sets the name of that procedure | No |
| ConversationTableName | Name of the table that will contain the conversations | No |
| ServiceName | Name of the service that will be created | No |
| QueueName | Name of the queue that will be created | No |
| WaitForDelay | Pass 0 to return immediately; otherwise, pass the sampling interval (in seconds) at which to check whether execution has finished | No |
| IsOutputScript | Whether to print the generated command or run it | No |