Document Processing Web API
Implementing a Web API to create documents is a popular and useful way to make document processing functionality available to a variety of applications and clients. Depending on the template size and data structure, generating documents can be a complex, time-consuming task. For documents with many nested merge blocks and hundreds or thousands of pages, this task can take several seconds or even minutes. A typical HTTP request, however, should return within milliseconds.
A typical document such as an invoice (which takes approx. 300 ms) can be created on-the-fly and returned by the HTTP request right after generation. Longer requests (in the range of seconds) should be handled in a different way.
Asynchronous Document Processing
One way to solve this problem is a RESTful Web API that accepts a WebHook URL with each request. When the document has been created successfully, the API sends a notification containing a download link to that URL. The following sequence diagram shows this process in detail:
The client sends an HTTP POST request to an endpoint that immediately returns a positive response if the request is acceptable. The endpoint stores the request in a database and starts an asynchronous task that generates the document. Because the task runs asynchronously, the client doesn't have to wait for the document to be generated.
    [Route("api/[controller]/merge")]
    [HttpPost]
    public ProcessingRequest Post(string webHookUrl) {
        ProcessingRequest request = new ProcessingRequest() {
            WebHookUrl = webHookUrl
        };
        request.Create(Url.ActionLink("Get", "DocumentProcessing", new { id = "1" }));
        // start the document processing
        Task.Run(() => TextControlProcessing.Merge(request));
        // return the status immediately
        return request;
    }
The ProcessingRequest object is the data structure that is stored in the database. It holds the WebHook URL and implements methods to store the created document in the database and to retrieve it again.
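The implementation of ProcessingRequest is not shown in this article. As a rough sketch only, the class could look like the following, where in-memory dictionaries stand in for the database. The member names are taken from the snippets in this article; the method bodies, the Id generation and the way the download URL is derived from the placeholder id are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical sketch: in-memory dictionaries replace the real database.
public class ProcessingRequest {
    private static readonly ConcurrentDictionary<string, ProcessingRequest> Requests =
        new ConcurrentDictionary<string, ProcessingRequest>();
    private static readonly ConcurrentDictionary<string, byte[]> Documents =
        new ConcurrentDictionary<string, byte[]>();

    public string Id { get; set; }
    public string WebHookUrl { get; set; }
    public string RetrieveDocumentUrl { get; set; }
    public bool Processed { get; set; }

    public ProcessingRequest() { }

    // Loads a previously stored request by its id (assumed behavior).
    public ProcessingRequest(string id) {
        if (!Requests.TryGetValue(id, out ProcessingRequest stored))
            throw new KeyNotFoundException("Unknown request id: " + id);
        Id = stored.Id;
        WebHookUrl = stored.WebHookUrl;
        RetrieveDocumentUrl = stored.RetrieveDocumentUrl;
        Processed = stored.Processed;
    }

    // Assigns a new id and stores the request.
    // Assumption: the last URL segment is a placeholder id that gets replaced.
    public void Create(string retrieveUrlTemplate) {
        Id = Guid.NewGuid().ToString("N");
        int lastSlash = retrieveUrlTemplate.LastIndexOf('/');
        RetrieveDocumentUrl = retrieveUrlTemplate.Substring(0, lastSlash + 1) + Id;
        Requests[Id] = this;
    }

    // Persists changed properties such as Processed.
    public void Update() => Requests[Id] = this;

    // Stores the generated document bytes for this request.
    public void StoreDocument(byte[] data) => Documents[Id] = data;

    // Returns the stored document as a Base64-encoded string.
    public string RetrieveDocument() => Convert.ToBase64String(Documents[Id]);
}
```

A real implementation would replace the two dictionaries with database access, so that requests and documents survive an application restart.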
Calling the WebHook
The Merge method uses the ServerTextControl class (TXTextControl namespace), a component that provides high-level text processing features for server-based applications, to create a document. In a real-world scenario, you would use the MailMerge class (DocumentServer namespace) to merge data into templates. For demo purposes, a Thread.Sleep() call simulates a longer process. When the document has been created, the resulting file is stored in a database and a request is sent to the WebHookUrl endpoint provided in the client's original request. This request tells the consumer where to download the created document.
    public class TextControlProcessing {
        // shared instance: creating a new HttpClient per call can exhaust sockets
        private static readonly HttpClient client = new HttpClient();

        public static void Merge(ProcessingRequest request) {
            using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
                tx.Create();
                // simulate long process
                Thread.Sleep(3000);
                // create document or use MailMerge to generate larger document
                tx.Text = "My created document.";
                byte[] data;
                tx.Save(out data, TXTextControl.BinaryStreamType.AdobePDF);
                request.StoreDocument(data);
            }
            // mark the request as processed and persist the new state
            request.Processed = true;
            request.Update();
            // notify the consumer without waiting for its response
            Task.Run(() => FireAndForgetWebHook(request));
        }

        private static async Task FireAndForgetWebHook(ProcessingRequest request) {
            var json = JsonConvert.SerializeObject(request);
            var data = new StringContent(json, Encoding.UTF8, "application/json");
            await client.PostAsync(request.WebHookUrl, data);
        }
    }
Retrieving the Document
The WebHook endpoint must be implemented by the calling client application. It receives a URL from which the finished document can be downloaded. In the sample code, the document is downloaded and saved to the file system as a PDF document.
    [HttpPost]
    public bool WebHook([FromBody] object request) {
        dynamic ProcessingRequest = JObject.Parse(request.ToString());
        HttpClient client = new HttpClient();
        // download the document from the URL given in the notification
        HttpResponseMessage responseMessage =
            client.GetAsync(ProcessingRequest.RetrieveDocumentUrl.Value).Result;
        if (responseMessage.IsSuccessStatusCode) {
            // the Web API returns the document as a Base64-encoded string
            string data = responseMessage.Content.ReadAsStringAsync().Result;
            System.IO.File.WriteAllBytes(
                "App_Data/" + ProcessingRequest.Id.Value + ".pdf", Convert.FromBase64String(data));
            return true;
        }
        else {
            // the download failed
            return false;
        }
    }
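The download step of the WebHook can also be written without blocking on .Result. The following is a minimal sketch of that idea, not the sample project's code: the WebHookDownloader class and its SaveDocumentAsync method are hypothetical names, System.Text.Json is used instead of Json.NET to keep the snippet dependency-free, and the payload is assumed to carry the Id and RetrieveDocumentUrl properties used above.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper: downloads the document named in a WebHook notification.
public static class WebHookDownloader {
    // Returns true if the document was downloaded and written to the folder.
    public static async Task<bool> SaveDocumentAsync(
        HttpClient client, string notificationJson, string folder) {

        using JsonDocument doc = JsonDocument.Parse(notificationJson);
        // ToString() works for string and numeric ids alike
        string id = doc.RootElement.GetProperty("Id").ToString();
        string url = doc.RootElement.GetProperty("RetrieveDocumentUrl").GetString();

        using HttpResponseMessage response = await client.GetAsync(url);
        if (!response.IsSuccessStatusCode)
            return false;

        // Assumption: the body is the plain Base64 string returned by the Web API
        string base64 = await response.Content.ReadAsStringAsync();
        byte[] pdf = Convert.FromBase64String(base64);

        // GetFileName strips directory parts so the id cannot escape the folder
        Directory.CreateDirectory(folder);
        string path = Path.Combine(folder, Path.GetFileName(id) + ".pdf");
        await File.WriteAllBytesAsync(path, pdf);
        return true;
    }
}
```

A controller action could then be declared as async, read the request body as a string and return the result of SaveDocumentAsync, which keeps the request thread free while the download is in progress.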
The following Web API endpoint retrieves the document from the database based on the given id and returns it as a Base64-encoded string:
    [Route("api/[controller]/{id}")]
    [HttpGet]
    public string Get([FromRoute] string id) {
        ProcessingRequest request = new ProcessingRequest(id);
        return request.RetrieveDocument();
    }
The Sample Project
To demonstrate this concept, the sample solution consists of two separate projects:
- tx_wp_api
  The asynchronous Web API itself.
- tx_api_consumer
  A consuming application that calls the Web API and implements the WebHook.
For demo purposes, the solution has two startup projects. After compiling and starting the application, two browser windows are opened. The first uses Swagger to show and test the exposed Web API:
The second window shows the consuming application that only consists of a button that calls the Web API:
After clicking the button Send Request, the consuming application calls the Web API to generate a document:
    public IActionResult RequestWebAPI() {
        var webHookUrl = Request.Scheme + "://" + Request.Host + "/Home/WebHook";
        // escape the WebHook URL so it is passed safely as a query string value
        var requestUrl =
            "https://localhost:7210/api/DocumentProcessing/merge?webHookUrl=" +
            Uri.EscapeDataString(webHookUrl);
        HttpClient client = new HttpClient();
        HttpResponseMessage responseMessage = client.PostAsync(requestUrl, null).Result;
        return View();
    }
After the document has been generated, the WebHook implemented by the consuming application stores the created document in the App_Data folder:
Conclusion
For time-consuming document generation processes, an asynchronous concept using WebHooks is a smart way to generate documents. Depending on the number of requests, this concept can be extended. A typical extension to the above sample is to queue requests in order to limit CPU and memory usage: all requests are queued, prioritized and processed sequentially. Whenever a document has been created, the WebHook is called to inform the consumer that processing succeeded.
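The queuing extension can be sketched with a channel that a single worker drains, so that only one document is generated at a time. This is one possible approach under stated assumptions, not part of the sample project: SequentialQueue is a hypothetical class name, the queue is a plain first-in, first-out queue without prioritization, and it lives in memory only.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical sketch: processes queued items one after another.
public class SequentialQueue<T> {
    private readonly Channel<T> channel = Channel.CreateUnbounded<T>();
    private readonly Task worker;

    public SequentialQueue(Func<T, Task> handler) {
        // A single consumer guarantees that items are handled sequentially.
        worker = Task.Run(async () => {
            await foreach (T item in channel.Reader.ReadAllAsync()) {
                try {
                    await handler(item);
                }
                catch (Exception ex) {
                    // One failed item must not stop the worker.
                    Console.Error.WriteLine(ex.Message);
                }
            }
        });
    }

    // Adds an item to the end of the queue and returns immediately.
    public void Enqueue(T item) {
        if (!channel.Writer.TryWrite(item))
            throw new InvalidOperationException("The queue has been completed.");
    }

    // Stops accepting items and waits until all queued items are processed.
    public Task CompleteAsync() {
        channel.Writer.Complete();
        return worker;
    }
}
```

In the Web API, a single shared SequentialQueue<ProcessingRequest> could be created with a handler that calls TextControlProcessing.Merge, and the Post action would call Enqueue(request) instead of Task.Run. Prioritization and a persistent queue would need additional work.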