When building document processing APIs, it's easy to fall into the trap of cramming too much logic into a single controller method. But as your application grows, the need for maintainability, testability, and clarity becomes critical.

Here's a guide to building a clean, scalable, and testable PDF merging API using TX Text Control, C#, and ASP.NET Core on Linux.

Why Clean Architecture?

Clean architecture emphasizes separation of concerns, keeping your business logic independent of frameworks, UI, and infrastructure. The goal is to:

  • Improve maintainability
  • Enable unit testing
  • Allow for future extensibility

Let's look at how these principles apply to a TX Text Control based document merge API.

Folder Structure

The project is based on the default Visual Studio template for an ASP.NET Core Web API application and is organized as follows:

/Controllers
    DocumentController.cs
/Services
    IDocumentMergeService.cs
    DocumentMergeService.cs
/Utilities
    MailMergeMapper.cs
/Models
    MergeBody.cs
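
The folders above only work together once the service is registered with the dependency injection container in Program.cs. That file is not shown here, so the following is a minimal sketch under assumptions: it relies on the implicit usings of the Web API template, the scoped lifetime is an illustrative choice (the service holds no state, so transient would work as well), and the Swagger setup from the template is left out. The types at the bottom are stand-ins so the sketch compiles on its own; in the real project they live in /Services and /Models.

```csharp
// Program.cs (sketch): assumes the ASP.NET Core Web API template with implicit usings.
var builder = WebApplication.CreateBuilder(args);

// Enables attribute-routed controllers such as DocumentController.
builder.Services.AddControllers();

// Maps the abstraction to its implementation; the lifetime is an assumption.
builder.Services.AddScoped<IDocumentMergeService, DocumentMergeService>();

var app = builder.Build();
app.MapControllers();
app.Run();

// Stand-ins so the sketch compiles on its own; replace with the real project types.
public class MergeBody { public string? Template { get; set; } }

public interface IDocumentMergeService
{
    string MergeDocument(MergeBody mergeBody);
}

public class DocumentMergeService : IDocumentMergeService
{
    public string MergeDocument(MergeBody mergeBody) => throw new NotImplementedException();
}
```

With this registration in place, ASP.NET Core supplies a DocumentMergeService whenever a constructor asks for IDocumentMergeService.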

Core Components

Let's take a look at the core components involved in the merge process:

  • DocumentController

    The controller is responsible for handling incoming requests and delegating the work to the service layer.

    [ApiController]
    [Route("[controller]")]
    public class DocumentController : ControllerBase
    {
        private readonly IDocumentMergeService _documentMergeService;

        // The service is injected, so the controller never touches TX Text Control directly.
        public DocumentController(IDocumentMergeService documentMergeService)
        {
            _documentMergeService = documentMergeService;
        }

        [HttpPost("merge")]
        public IActionResult Merge([FromBody] MergeBody mergeBody)
        {
            // Reject requests without a template before calling the service.
            if (mergeBody == null || mergeBody.Template == null)
            {
                return BadRequest("Invalid input. Template is required.");
            }

            try
            {
                var result = _documentMergeService.MergeDocument(mergeBody);
                return Ok(result);
            }
            catch (Exception ex)
            {
                // Any failure in the service is reported as 400 with the exception message.
                return BadRequest(ex.Message);
            }
        }
    }
  • IDocumentMergeService

    The service layer handles the business logic. It is independent of the controller and can be reused in other parts of the application. The interface abstracts the merge logic so that callers, including tests, can substitute another implementation.

    public interface IDocumentMergeService
    {
        string MergeDocument(MergeBody mergeBody);
    }
  • DocumentMergeService

    The service implementation is where the actual merge logic resides. The service is responsible for merging the documents using TX Text Control and returning the result.

    using TXTextControl;

    public class DocumentMergeService : IDocumentMergeService
    {
        public string MergeDocument(MergeBody mergeBody)
        {
            using var tx = new ServerTextControl();
            tx.Create();

            // The template arrives as a Base64 string in the internal Unicode format.
            tx.Load(Convert.FromBase64String(mergeBody.Template), BinaryStreamType.InternalUnicodeFormat);

            // Merge the JSON data into the template.
            var mailMerge = MailMergeMapper.ToMailMerge(mergeBody, tx);
            mailMerge.MergeJsonData(mergeBody.MergeData, true);

            // Export the merged document as PDF and return it Base64-encoded.
            var saveSettings = MailMergeMapper.ToSaveSettings(mergeBody.MergeSettings);
            tx.Save(out byte[] result, BinaryStreamType.AdobePDF, saveSettings);
            return Convert.ToBase64String(result);
        }
    }
  • MailMergeMapper

    The MailMergeMapper class maps the incoming JSON payload to TX Text Control objects: ToMailMerge returns a configured MailMerge object, and ToSaveSettings returns the SaveSettings used when the PDF is saved.

    using System.Globalization;
    using TXTextControl.DocumentServer;

    public static class MailMergeMapper
    {
        // Note: MergeSettings, both culture names, and FormFieldMergeType must be present;
        // a missing value throws here.
        public static MailMerge ToMailMerge(MergeBody body, TXTextControl.ServerTextControl tx)
        {
            return new MailMerge
            {
                TextComponent = tx,
                DataSourceCulture = new CultureInfo(body.MergeSettings.DataSourceCulture),
                FormFieldMergeType = (FormFieldMergeType)body.MergeSettings.FormFieldMergeType,
                MergeCulture = new CultureInfo(body.MergeSettings.MergeCulture),
                RemoveEmptyBlocks = body.MergeSettings.RemoveEmptyBlocks ?? false,
                RemoveEmptyFields = body.MergeSettings.RemoveEmptyFields ?? false,
                RemoveEmptyImages = body.MergeSettings.RemoveEmptyImages ?? false,
                RemoveEmptyLines = body.MergeSettings.RemoveEmptyLines ?? false,
                RemoveTrailingWhitespace = body.MergeSettings.RemoveTrailingWhitespace ?? false
            };
        }

        // Note: the casts throw InvalidOperationException if either date is missing.
        public static TXTextControl.SaveSettings ToSaveSettings(MergeSettings settings)
        {
            return new TXTextControl.SaveSettings
            {
                Author = settings.Author,
                CreationDate = (DateTime)settings.CreationDate,
                CreatorApplication = settings.CreatorApplication,
                DocumentSubject = settings.DocumentSubject,
                DocumentTitle = settings.DocumentTitle,
                LastModificationDate = (DateTime)settings.LastModificationDate
            };
        }
    }
  • MergeBody

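    The MergeBody class defines the incoming data contract and is used to deserialize the incoming JSON payload. MergeSettings extends DocumentSettings, so the document metadata and the merge options arrive together in one mergeSettings object.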

    public class MergeBody
    {
        public string MergeData { get; set; }
        public string Template { get; set; }
        public MergeSettings MergeSettings { get; set; }
    }

    public class DocumentSettings
    {
        public string Author { get; set; }
        public DateTime? CreationDate { get; set; }
        public string CreatorApplication { get; set; }
        public string DocumentSubject { get; set; }
        public string DocumentTitle { get; set; }
        public DateTime? LastModificationDate { get; set; }
    }

    public class MergeSettings : DocumentSettings
    {
        public bool? RemoveEmptyFields { get; set; }
        public bool? RemoveEmptyBlocks { get; set; }
        public bool? RemoveEmptyImages { get; set; }
        public bool? RemoveTrailingWhitespace { get; set; }
        public bool? RemoveEmptyLines { get; set; }
        public int? FormFieldMergeType { get; set; }
        public string Culture { get; set; }
        public string DataSourceCulture { get; set; }
        public string MergeCulture { get; set; }
    }
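
The mapper shown above casts nullable values directly and passes culture names straight to CultureInfo, so a request that omits those mergeSettings fields fails before the merge starts. One way to soften that is a pair of small helpers, sketched below with .NET types only. The names SettingsDefaults, CultureOrInvariant, and DateOrNow are hypothetical, and the chosen fallbacks (invariant culture, current time) are assumptions that may not suit every application.

```csharp
using System;
using System.Globalization;

public static class SettingsDefaults
{
    // Returns the named culture, or the invariant culture when the name is
    // missing or the runtime rejects it.
    public static CultureInfo CultureOrInvariant(string? name)
    {
        if (string.IsNullOrWhiteSpace(name))
        {
            return CultureInfo.InvariantCulture;
        }

        try
        {
            return CultureInfo.GetCultureInfo(name);
        }
        catch (CultureNotFoundException)
        {
            return CultureInfo.InvariantCulture;
        }
    }

    // Returns the supplied timestamp, or the current time when none was sent.
    public static DateTime DateOrNow(DateTime? value) => value ?? DateTime.Now;
}
```

In ToMailMerge, an expression such as new CultureInfo(body.MergeSettings.MergeCulture) could then become SettingsDefaults.CultureOrInvariant(body.MergeSettings?.MergeCulture), and the date casts in ToSaveSettings could become SettingsDefaults.DateOrNow(settings.CreationDate).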

Single Responsibility Principle (SRP)

Each class has a single responsibility. The controller is responsible for handling HTTP requests, the service is responsible for the merge logic, and the mapper is responsible for mapping the incoming JSON payload to a MailMerge object.

Class                  Responsibility
DocumentController     Handles HTTP requests and responses.
DocumentMergeService   Manages document merging logic.
MailMergeMapper        Converts DTOs (MergeBody) into MailMerge objects.
MergeBody              Defines the structure of the request body.
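
Because the controller depends only on IDocumentMergeService, it can be exercised without TX Text Control installed. The following sketch illustrates the idea and assumes a reference to ASP.NET Core. FakeMergeService and DocumentControllerChecks are hypothetical names, the plain exception-based checks stand in for a real test framework, and the types at the bottom are trimmed copies of the ones shown earlier so the sketch compiles on its own.

```csharp
using System;
using Microsoft.AspNetCore.Mvc;

// Hypothetical fake: echoes the template instead of rendering a PDF.
public class FakeMergeService : IDocumentMergeService
{
    public string MergeDocument(MergeBody mergeBody) => "merged:" + mergeBody.Template;
}

public static class DocumentControllerChecks
{
    public static void Run()
    {
        var controller = new DocumentController(new FakeMergeService());

        // A request with a template is delegated to the service and wrapped in 200 OK.
        var ok = controller.Merge(new MergeBody { Template = "dGVtcGxhdGU=" }) as OkObjectResult;
        Check((ok?.Value as string) == "merged:dGVtcGxhdGU=", "valid request returns the service result");

        // A request without a template is rejected before the service is called.
        var bad = controller.Merge(new MergeBody());
        Check(bad is BadRequestObjectResult, "missing template returns 400");
    }

    private static void Check(bool condition, string description)
    {
        if (!condition) throw new InvalidOperationException("Failed: " + description);
    }
}

// Trimmed copies of the types shown earlier.
public class MergeBody { public string? Template { get; set; } }

public interface IDocumentMergeService
{
    string MergeDocument(MergeBody mergeBody);
}

[ApiController]
[Route("[controller]")]
public class DocumentController : ControllerBase
{
    private readonly IDocumentMergeService _documentMergeService;

    public DocumentController(IDocumentMergeService documentMergeService)
    {
        _documentMergeService = documentMergeService;
    }

    [HttpPost("merge")]
    public IActionResult Merge([FromBody] MergeBody mergeBody)
    {
        if (mergeBody == null || mergeBody.Template == null)
        {
            return BadRequest("Invalid input. Template is required.");
        }

        try { return Ok(_documentMergeService.MergeDocument(mergeBody)); }
        catch (Exception ex) { return BadRequest(ex.Message); }
    }
}
```

Calling DocumentControllerChecks.Run() throws on the first failed check and returns normally otherwise.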

Consuming the API

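The API is consumed with a single HTTP POST request to /document/merge. The request body is a JSON object containing the merge data (as a JSON string), the template (a Base64-encoded document in the TX Text Control internal Unicode format), and the merge settings. The response is a Base64-encoded string of the merged PDF document.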
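
A client for this request could look like the following sketch, which uses HttpClient and System.Net.Http.Json. MergeClient and its method names are hypothetical. The relative route document/merge follows from the [Route("[controller]")] and [HttpPost("merge")] attributes on the controller, and the settings values are sample values.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public static class MergeClient
{
    // Builds the request body; property names match the JSON contract of MergeBody.
    public static object BuildPayload(byte[] template, string mergeDataJson) => new
    {
        mergeData = mergeDataJson,
        template = Convert.ToBase64String(template),
        mergeSettings = new
        {
            // The mapper reads these values without null checks, so always send them.
            formFieldMergeType = 1,
            dataSourceCulture = "en-US",
            mergeCulture = "en-US",
            creationDate = DateTime.Now,
            lastModificationDate = DateTime.Now
        }
    };

    // Posts the payload and returns the Base64-encoded PDF from the response body.
    public static async Task<string> MergeAsync(HttpClient client, byte[] template, string mergeDataJson)
    {
        using var response = await client.PostAsJsonAsync("document/merge", BuildPayload(template, mergeDataJson));
        response.EnsureSuccessStatusCode();

        // Strip surrounding quotes in case the string was serialized as JSON.
        return (await response.Content.ReadAsStringAsync()).Trim().Trim('"');
    }
}
```

To use it, create an HttpClient whose BaseAddress points at the running API (for example https://localhost:5001, a hypothetical address) and call MergeClient.MergeAsync with the template bytes and the merge data as a JSON string.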

In the example project, we use Swagger to easily test the endpoint. Start the application and navigate to /swagger to test the API.

PDF Merge Engine Web API

  1. Click on the POST operation.
  2. Click on Try it out.
  3. Copy and paste the following JSON payload into the request body:

    {
    "mergeData":"{\"FirstName\":\"John\",\"LastName\":\"Doe\"}",
    "template":"CAcBAA4AAAAAAAAAAAAXAAIAqwBGAGkAcgBzAHQATgBhAG0AZQC7AHQAZQB4AHQAQwBvAG4AdAByAG8AbAAxAAAANgIAAAMAAQABAAEAAAAAAAAAAgCfhwEAAQALAAAAAAAAQAEAkgcAAABQAQAMAAAAAAAAAABAAAAAAAAAAFABAAwADAAAAAAAAEAAAAAAAAAAUDj/AAAAAAAAkAEAAAAAAAACIkFyaWFsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQABABgAAAAAAAAAAAAAAAAAZAAgAg8AAAABbgQB3AgBSg0BuBEBJhYBlBoBAh8BcCMB3icBTCwBujABKDUBljkBBD4BAAAAAAAAAAAAFAAAAEYAaQByAHMAdABOAGEAbQBlAAAAAQAHAAAAAAAAACwAAABNAEUAUgBHAEUARgBJAEUATABEAAAARgBpAHIAcwB0AE4AYQBtAGUAAAAAAAAAAAAAAAAAAAAAAAAAAABBAHIAaQBhAGwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOQEAAABAAEACQYAAQAAAC4AAP//AAAAALcAAQAAAABAAAAAUAEAAgAJBAAAAAA8AABkAAAAAAEAAAAJBAAAAAAAAABkAAAAAAEAAAAJBAAAAAAAAABkAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAEAAQABABgAAAAAAAAAAAAAAAAAAAAAAAEAUwB5AG0AYgBvAGwAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIQBAAAwAAQA0C8AAOA9AACgBaAFoAWgBQAAAEAAAAAAAAAAAAEAAAABAA4AAAAAAAAAAAAkAQAAAQAAAAAAOP8AAAAAAACQAQAAAAAAAAIiQXJpYWwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQQByAGkAYQBsAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAFAAAAAAAAAAAAAAAABkACACDwAAAAFuBAHcCAFKDQG4EQEmFgGUGgECHwFwIwHeJwFMLAG6MAEoNQGWOQEEPgEAAQAJBgABAAAALgAA//8AAAAAtwABAAAAAEAAAABQAAASAAAAAAAAAAAAAAAAAAAAAAAAAAAACQRkAAAAWwBOAG8AcgBtAGEAbABdAAAAUwB5AG0AYgBvAGwAAAAAAABAIAABAAMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQB6AAAAMgAAAABAAAAAADcCNwI3AjcC0C8AAOA9AACgBaAFoAWgBSAKP4AAAAAAAAAAAAEAAAAAAAAAGwEAAAAAAAAAAAAAAAAAAAAAAAAsAAAAAAAAAAAAxgHGAcYBxgEAAABAAAAAQAAAAEAAAABAAAAAAAAAAAAAAA==",
    "mergeSettings":{
        "removeEmptyFields":true,
        "removeEmptyBlocks":true,
        "removeEmptyImages":true,
        "removeTrailingWhitespace":true,
        "removeEmptyLines":true,
        "formFieldMergeType":1,
        "culture":"en-US",
        "dataSourceCulture":"en-US",
        "mergeCulture":"en-US",
        "author":"John Doe",
        "creationDate":"2024-07-12T14:44:27.7222043+02:00",
        "creatorApplication":"TX Text Control",
        "documentSubject":"Subject",
        "documentTitle":"Title",
        "lastModificationDate":"2024-07-12T14:44:27.7244266+02:00"
    }
    }
  4. Click on Execute.

  5. The response is a base64-encoded string of the merged document. You can use this string to save the document to a file or display it in a viewer.
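
Saving the response string to a file could look like the following sketch. PdfResponse and its method names are hypothetical, and the check for the %PDF- signature is an extra sanity test rather than something the API requires.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfResponse
{
    // Decodes the Base64 response body; throws FormatException if it is not valid Base64.
    public static byte[] Decode(string base64) => Convert.FromBase64String(base64.Trim());

    // A PDF file starts with the ASCII signature "%PDF-".
    public static bool LooksLikePdf(byte[] bytes) =>
        bytes.Length >= 5 && Encoding.ASCII.GetString(bytes, 0, 5) == "%PDF-";

    // Decodes the response and writes it to the given path.
    public static void Save(string base64, string path)
    {
        var bytes = Decode(base64);
        if (!LooksLikePdf(bytes))
        {
            throw new InvalidDataException("The response does not look like a PDF.");
        }

        File.WriteAllBytes(path, bytes);
    }
}
```

For example, PdfResponse.Save(responseBody, "merged.pdf") writes the merged document next to the executable, where responseBody is the string copied from the Swagger response.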

Deployment

Deploying the application to a Windows or Linux server is straightforward because the application is built on .NET, which is cross-platform. On Linux, you can run the application in a Docker container or deploy it directly to the server.

Conclusion

Building a clean, scalable, and testable PDF merge API using TX Text Control, C#, and ASP.NET Core on Linux is easy. By following the principles of clean architecture, you can build a maintainable and extensible API that is easy to test and deploy.

Download the complete source code from GitHub.