Word processing and document generation are complex tasks, and TX Text Control is built to process a single document as fast as possible. TX Text Control implements critical sections that access shared resources and must be executed atomically. This means that multiple threads using TX Text Control might wait for each other under very specific circumstances. This implementation makes TX Text Control thread-safe, but slower for certain applications.
To merge hundreds or thousands of documents in batch processes, a true multi-process implementation is recommended, which can increase the overall merge performance by up to 300%.
The sample shows how to merge all templates in a folder with data and export the results as Adobe PDF to another folder in a batch process.
The solution consists of two parts:
- A HostApplication that reads the files from a folder and saves the results to another folder.
- A ProcessingApplication that uses TX Text Control to process each document in a new process.
The HostApplication calls the MergeDocument method for each file inside a .NET Parallel.ForEach loop.
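The host loop can be sketched as follows. This is a minimal illustration, not the sample's actual code: the `HostSketch` class, the `MergeFolder` helper, and the body of `MergeDocument` are assumptions, and the stub simply returns the template bytes where the real sample hands the work to a separate process.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class HostSketch
{
    // Hypothetical stand-in for the sample's MergeDocument method.
    // The real method passes the template to a separate process;
    // this stub returns the input unchanged so the loop can be run.
    public static byte[] MergeDocument(byte[] template, string jsonData)
    {
        return template;
    }

    // Processes every file in inputDir in parallel and writes one
    // ".pdf" file per template to outputDir. Returns the file count.
    public static int MergeFolder(string inputDir, string outputDir, string jsonData)
    {
        Directory.CreateDirectory(outputDir);
        int processed = 0;

        Parallel.ForEach(Directory.EnumerateFiles(inputDir), templatePath =>
        {
            byte[] result = MergeDocument(File.ReadAllBytes(templatePath), jsonData);
            string name = Path.GetFileNameWithoutExtension(templatePath) + ".pdf";
            File.WriteAllBytes(Path.Combine(outputDir, name), result);

            // The counter is shared between worker threads.
            Interlocked.Increment(ref processed);
        });

        return processed;
    }

    public static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: HostSketch <inputDir> <outputDir>");
            return;
        }
        int count = MergeFolder(args[0], args[1], "{}");
        Console.WriteLine("Processed " + count + " templates.");
    }
}
```

Because each iteration only reads one file and writes one file, the loop body needs no locking beyond the shared counter.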
In this method, a transportation object (PassingObject) is created that holds the document as a byte array and the merge data. The CallProcessingApp method is then called with this transportation object; it creates a new process and communicates with it using anonymous pipes.
Essentially, the CallProcessingApp method creates a new process, synchronizes the pipe stream, sends the PassingObject, and waits for the synchronized return object from the process. The ReturningObject then contains the created PDF document as a byte array.
The ProcessingApplication is referenced by the HostApplication and contains the CallProcessingApp method and the transportation data models. The application itself is also a console application that launches a copy of itself as a new process in order to process the documents. Its Main method synchronizes the pipe stream in order to retrieve and return the transportation object, and merges the template with the given JSON data using TX Text Control.
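The round trip between CallProcessingApp and Main might look roughly like the self-spawning program below. It is a sketch under several assumptions rather than the sample's code: the members of `PassingObject` and `ReturningObject` are guessed, messages are framed as a length prefix plus JSON via `System.Text.Json` (the sample may serialize and synchronize differently), and the merge step is a placeholder that echoes the bytes back where TX Text Control would merge the JSON data and export the PDF. It also assumes the program runs as its own executable rather than through `dotnet app.dll`.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Pipes;
using System.Text;
using System.Text.Json;

// Hypothetical transportation models; the sample's real members may differ.
public class PassingObject
{
    public byte[] Document { get; set; }
    public string Data { get; set; }
}

public class ReturningObject
{
    public byte[] Document { get; set; }
}

public static class Program
{
    // Writes one message as a 4-byte length followed by UTF-8 JSON.
    public static void Send<T>(Stream stream, T message)
    {
        byte[] payload = JsonSerializer.SerializeToUtf8Bytes(message);
        var writer = new BinaryWriter(stream, Encoding.UTF8, true);
        writer.Write(payload.Length);
        writer.Write(payload);
        writer.Flush();
    }

    // Blocks until one complete length-prefixed message has been read.
    public static T Receive<T>(Stream stream)
    {
        var reader = new BinaryReader(stream, Encoding.UTF8, true);
        int length = reader.ReadInt32();
        return JsonSerializer.Deserialize<T>(reader.ReadBytes(length));
    }

    // Host side: start a copy of this executable and exchange one message pair.
    public static ReturningObject CallProcessingApp(PassingObject request)
    {
        using (var toChild = new AnonymousPipeServerStream(PipeDirection.Out, HandleInheritability.Inheritable))
        using (var fromChild = new AnonymousPipeServerStream(PipeDirection.In, HandleInheritability.Inheritable))
        {
            var startInfo = new ProcessStartInfo
            {
                FileName = Process.GetCurrentProcess().MainModule.FileName,
                Arguments = toChild.GetClientHandleAsString() + " " + fromChild.GetClientHandleAsString(),
                UseShellExecute = false // required so the pipe handles are inherited
            };

            using (Process child = Process.Start(startInfo))
            {
                // The child holds the client ends now; release the host's copies.
                toChild.DisposeLocalCopyOfClientHandle();
                fromChild.DisposeLocalCopyOfClientHandle();

                Send(toChild, request);
                ReturningObject result = Receive<ReturningObject>(fromChild);
                child.WaitForExit();
                return result;
            }
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length == 2)
        {
            // Child mode: the two arguments are the inherited pipe handles.
            using (var input = new AnonymousPipeClientStream(PipeDirection.In, args[0]))
            using (var output = new AnonymousPipeClientStream(PipeDirection.Out, args[1]))
            {
                PassingObject request = Receive<PassingObject>(input);
                // Placeholder for the merge and PDF export: echo the bytes back.
                Send(output, new ReturningObject { Document = request.Document });
            }
            return;
        }

        // Host mode: send a dummy document through a child process.
        var reply = CallProcessingApp(new PassingObject { Document = new byte[] { 1, 2, 3 }, Data = "{}" });
        Console.WriteLine("Received " + reply.Document.Length + " bytes from the child process.");
    }
}
```

The length prefix lets each side know when a message is complete without waiting for the pipe to close, which matters when several child processes are started concurrently.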
Based on the sample templates in this demo, sequential processing of 100 templates takes about 45 seconds on a 16-core CPU, while parallel processing takes about 14 seconds, which is roughly 3 times faster.
The concept is modular and flexible. By adding members to the transportation object, you could pass settings such as a return file format other than PDF or specific merge settings.
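As an illustration of such an extension, the sketch below adds an output format member to the transportation object. The `ReturnFormat` enum, the `Format` property, and the `ExtensionFor` helper are all hypothetical names, not part of the sample; the processing side would still have to map the value to the corresponding TX Text Control save format.

```csharp
using System;

// Hypothetical output formats; the names are illustrative only.
public enum ReturnFormat { Pdf, Docx, Html }

public class PassingObject
{
    public byte[] Document { get; set; }
    public string Data { get; set; }

    // New member: defaults to PDF so existing callers keep their behavior.
    public ReturnFormat Format { get; set; } = ReturnFormat.Pdf;
}

public static class FormatSketch
{
    // Chooses the file extension the host uses when saving the result.
    public static string ExtensionFor(ReturnFormat format)
    {
        switch (format)
        {
            case ReturnFormat.Docx: return ".docx";
            case ReturnFormat.Html: return ".html";
            default: return ".pdf";
        }
    }
}
```

Giving the new member a default value means the host and the processing application can be updated independently of callers that never set it.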
You can download this sample from our GitHub repository and try it yourself. This sample uses our Windows Forms version, but the concept is valid for all types of applications, including WPF and ASP.NET.