Document Processing Web API

Implementing a Web API to create documents is a popular and useful way to make document processing functionality available to a variety of applications and clients. Depending on the template size and data structure, generating documents can be a complex, time-consuming task. For documents with many nested merge blocks and hundreds or thousands of pages, this task can take several seconds or even minutes. A typical HTTP request, however, is expected to return within milliseconds.

A typical document such as an invoice, which takes approximately 300 ms to generate, can be created on the fly and returned directly in the HTTP response. Longer-running requests (in the range of seconds) should be handled differently.

Asynchronous Document Processing

One way to solve this problem is a RESTful Web API that is called with a WebHook URL. When the document has been created successfully, that URL receives a notification with a download link. The following sequence diagram shows this process in detail:

Sequence diagram: Asynchronous Document Processing

The client sends an HTTP POST request to an endpoint that returns a positive response immediately if the request is acceptable. The endpoint stores the request in a database and starts an asynchronous task that generates the document, so the client doesn't have to wait until the document has been created.

[Route("api/[controller]/merge")]
[HttpPost]
public ProcessingRequest Post(string webHookUrl) {
    ProcessingRequest request = new ProcessingRequest() {
        WebHookUrl = webHookUrl
    };

    request.Create(Url.ActionLink("Get", "DocumentProcessing", new { id = "1" }));

    // start the document processing
    Task.Run(() => TextControlProcessing.Merge(request));

    // return the status immediately
    return request;
}

The ProcessingRequest object provides the data structure that is stored in the database. It holds the WebHook URL and implements methods to store the created document in the database and to retrieve it from there.

Class Diagram: ProcessingRequest
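
Since the class diagram only names the members, the following is a minimal sketch of what such a class could look like. It is not the implementation from the sample project: the static in-memory dictionaries are an assumption that stands in for the database, and the Id generation in Create is hypothetical. The member names (WebHookUrl, RetrieveDocumentUrl, Processed, Create, Update, StoreDocument, RetrieveDocument) are taken from the code in this article.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical sketch of ProcessingRequest. The static dictionaries are an
// assumption and stand in for the database used by the sample project.
public class ProcessingRequest {
    private static readonly ConcurrentDictionary<string, ProcessingRequest> Requests = new();
    private static readonly ConcurrentDictionary<string, byte[]> Documents = new();

    public string Id { get; set; }
    public string WebHookUrl { get; set; }
    public string RetrieveDocumentUrl { get; set; }
    public bool Processed { get; set; }

    public ProcessingRequest() { }

    // Loads a previously stored request by its id.
    public ProcessingRequest(string id) {
        if (!Requests.TryGetValue(id, out var stored))
            throw new KeyNotFoundException($"Unknown request id '{id}'.");

        Id = stored.Id;
        WebHookUrl = stored.WebHookUrl;
        RetrieveDocumentUrl = stored.RetrieveDocumentUrl;
        Processed = stored.Processed;
    }

    // Assigns a new id and stores the request. How the sample maps the
    // download URL to the new id is not shown in the article, so the URL
    // is stored exactly as passed in.
    public void Create(string retrieveDocumentUrl) {
        Id = Guid.NewGuid().ToString();
        RetrieveDocumentUrl = retrieveDocumentUrl;
        Requests[Id] = this;
    }

    // Writes the current state back to the store.
    public void Update() => Requests[Id] = this;

    // Stores the generated document for this request.
    public void StoreDocument(byte[] data) => Documents[Id] = data;

    // Returns the document as a Base64 string, or null if none exists yet.
    public string RetrieveDocument() =>
        Documents.TryGetValue(Id, out var data) ? Convert.ToBase64String(data) : null;
}
```

Returning the document as a Base64 string matches the consuming WebHook in this article, which decodes the response with Convert.FromBase64String before writing the PDF file.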

Calling the WebHook

The Merge method uses the ServerTextControl class (TXTextControl namespace) to create a document. In a real-world scenario, you would use the MailMerge class (DocumentServer namespace) to merge data into templates. For demo purposes, we simply added a Thread.Sleep() call to simulate a longer process. When the document has been created, the resulting file is stored in the database and a request is sent to the WebHookUrl endpoint that the client provided in its original request. This request tells the consumer where to download the created document.

public class TextControlProcessing {

    public static void Merge(ProcessingRequest request) {
        using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
            tx.Create();

            // simulate long process
            Thread.Sleep(3000);

            // create document or use MailMerge to generate larger document
            tx.Text = "My created document.";

            // export the document as PDF and store it with the request
            byte[] data;
            tx.Save(out data, TXTextControl.BinaryStreamType.AdobePDF);
            request.StoreDocument(data);
        }

        // mark the request as processed and persist the new state
        request.Processed = true;
        request.Update();

        // notify the consumer without waiting for its response
        Task.Run(() => FireAndForgetWebHook(request));
    }

    private static void FireAndForgetWebHook(ProcessingRequest request) {
        HttpClient client = new HttpClient();

        // send the serialized request, including the download URL, as JSON
        var json = JsonConvert.SerializeObject(request);
        var data = new StringContent(json, Encoding.UTF8, "application/json");

        // intentionally not awaited: the result of the call is ignored
        client.PostAsync(request.WebHookUrl, data);
    }
}

Retrieving the Document

The WebHook endpoint must be implemented by the calling client application. It receives a URL from which the finished document can be downloaded. In the sample code, the document is downloaded and saved to the file system as a PDF document.

[HttpPost]
public bool WebHook([FromBody] object request) {
    // parse the JSON payload sent by the Web API
    dynamic processingRequest = JObject.Parse(request.ToString());

    // download the document from the URL given in the notification
    HttpClient client = new HttpClient();
    HttpResponseMessage responseMessage =
        client.GetAsync(processingRequest.RetrieveDocumentUrl.Value).Result;

    if (responseMessage.IsSuccessStatusCode) {
        // the document is returned as a Base64 encoded string
        string data = responseMessage.Content.ReadAsStringAsync().Result;
        System.IO.File.WriteAllBytes(
            "App_Data/" + processingRequest.Id.Value + ".pdf", Convert.FromBase64String(data));
        return true;
    }
    else
        return false; // the download failed
}

The Get endpoint of the Web API retrieves the document from the database based on the given id and returns it:

[Route("api/[controller]/{id}")]
[HttpGet]
public string Get([FromRoute] string id) {
    ProcessingRequest request = new ProcessingRequest(id);
    return request.RetrieveDocument();
}

The Sample Project

To demonstrate this concept, the sample solution consists of two separate projects:

  • tx_wp_api
    The asynchronous Web API itself.
  • tx_api_consumer
    A consuming application that calls the Web API and implements the WebHook.

For demo purposes, the solution has two startup projects. After compiling and starting the solution, two browser windows are opened. The first uses Swagger to show and test the exposed Web API:

Swagger

The second window shows the consuming application, which consists only of a button that calls the Web API:

Consuming app

After clicking the Send Request button, the consuming application calls the Web API to generate a document:

public IActionResult RequestWebAPI() {
    // URL of the WebHook endpoint implemented by this application
    var webHookUrl = Request.Scheme + "://" + Request.Host + "/Home/WebHook";

    // escape the WebHook URL so that it is a valid query string value
    var requestUrl =
        "https://localhost:7210/api/DocumentProcessing/merge?webHookUrl=" +
        Uri.EscapeDataString(webHookUrl);

    HttpClient client = new HttpClient();
    HttpResponseMessage responseMessage = client.PostAsync(requestUrl, null).Result;

    return View();
}

After the document has been generated, the WebHook implemented by the consuming application stores the created document in the App_Data folder:

Consuming app

Conclusion

For time-consuming document generation processes, an asynchronous concept using WebHooks is a smart way to generate documents. Depending on the number of requests, this concept can be extended. A typical extension to the above sample is to queue the requests in order to limit CPU and memory usage. In that case, all requests are queued, prioritized and processed sequentially. When a document has been created, the WebHook is called to inform the consumer that processing was successful.
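
The queuing idea could be sketched with a bounded channel from System.Threading.Channels and a single consumer loop. The DocumentQueue class and its member names are hypothetical and not part of the sample project. The sketch processes jobs strictly in the order they arrive and leaves out prioritization.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical sequential work queue; not part of the sample project.
public class DocumentQueue {
    private readonly Channel<Func<Task>> _channel;

    public DocumentQueue(int capacity) {
        // A bounded channel limits the number of waiting jobs; writers
        // wait asynchronously when the queue is full.
        _channel = Channel.CreateBounded<Func<Task>>(new BoundedChannelOptions(capacity) {
            SingleReader = true,
            FullMode = BoundedChannelFullMode.Wait
        });
    }

    // Adds a job to the queue, for example from the controller action.
    public ValueTask EnqueueAsync(Func<Task> job) => _channel.Writer.WriteAsync(job);

    // Signals that no further jobs will be added.
    public void Complete() => _channel.Writer.Complete();

    // Single consumer loop: runs one job at a time until the queue is
    // completed and drained.
    public async Task RunAsync() {
        await foreach (var job in _channel.Reader.ReadAllAsync()) {
            try {
                await job();
            }
            catch (Exception ex) {
                // A failing job must not stop the queue.
                Console.Error.WriteLine($"Job failed: {ex.Message}");
            }
        }
    }
}
```

Under these assumptions, the Post action would enqueue a job such as () => Task.Run(() => TextControlProcessing.Merge(request)) instead of starting it directly, while RunAsync runs once in the background, for example in a hosted service. Because only one job is awaited at a time, CPU and memory usage stay bounded regardless of how many requests arrive.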