When you are tasked with creating PDF documents in an application, a little research turns up many different tools for the job. You could, of course, generate the PDF yourself by studying the roughly 1,000 pages of the PDF 2.0 specification (ISO 32000-2:2020), which covers the file format, rendering, encryption, and other features. But let's face reality: that is a huge time commitment, and entire companies specialize in it (like us, Text Control).

The first step is to determine what types of PDF documents you want to create and, most importantly, how many. This decision determines how flexible the PDF creation process needs to be. Several other questions should also be part of the decision-making process:

  • Do you need to create PDFs from scratch or do you have existing documents that you can reuse?
  • How many different types of documents do you need to create (and will you need to create in the future!)?
  • Which role in the organization should be able to make changes to documents or templates, such as logo changes?

These three questions alone already show how modular and easy to maintain the document generation process needs to be.

HTML-to-PDF Conversion

A very common idea among developers is to convert HTML to PDF. The reasoning sounds right: as developers, we know how to write HTML and feel confident with CSS styling. The basic process is that developers create the HTML and CSS, and a converter produces the PDF from that input. Popular tools include wkhtmltopdf, Puppeteer, and headless Chrome.
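To make the two steps concrete, here is a minimal sketch of that workflow in C#. It assumes the wkhtmltopdf command-line tool is installed and on the PATH; the helper names BuildInvoiceHtml and ConvertWithWkhtmltopdf, the file names, and the sample values are hypothetical and only serve as illustration.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Net;
using System.Text;

// Step 1: the developer writes the HTML and CSS (hypothetical helper).
static string BuildInvoiceHtml(string invoiceNumber, DateTime issueDate, DateTime dueDate)
{
    // Encode dynamic values so they cannot break the markup.
    string number = WebUtility.HtmlEncode(invoiceNumber);
    string issued = issueDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    string due = dueDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

    var sb = new StringBuilder();
    sb.AppendLine("<!DOCTYPE html>");
    sb.AppendLine("<html><head><meta charset=\"utf-8\">");
    sb.AppendLine("<style>h1 { color: #1e88e5; font-size: 20pt; } .label { font-weight: 600; }</style>");
    sb.AppendLine("</head><body>");
    sb.AppendLine($"<h1>Invoice #{number}</h1>");
    sb.AppendLine($"<p><span class=\"label\">Issue date: </span>{issued}</p>");
    sb.AppendLine($"<p><span class=\"label\">Due date: </span>{due}</p>");
    sb.AppendLine("</body></html>");
    return sb.ToString();
}

// Step 2: a converter renders the HTML to PDF.
// Assumption: wkhtmltopdf is installed; its basic usage is "wkhtmltopdf input.html output.pdf".
static void ConvertWithWkhtmltopdf(string htmlPath, string pdfPath)
{
    var startInfo = new ProcessStartInfo("wkhtmltopdf") { UseShellExecute = false };
    startInfo.ArgumentList.Add(htmlPath);
    startInfo.ArgumentList.Add(pdfPath);

    using var process = Process.Start(startInfo)
        ?? throw new InvalidOperationException("Could not start wkhtmltopdf.");
    process.WaitForExit();
    if (process.ExitCode != 0)
        throw new InvalidOperationException($"wkhtmltopdf exited with code {process.ExitCode}.");
}

string html = BuildInvoiceHtml("1001", new DateTime(2024, 1, 15), new DateTime(2024, 2, 14));
File.WriteAllText("invoice.html", html);

// Uncomment once wkhtmltopdf is available on the machine:
// ConvertWithWkhtmltopdf("invoice.html", "invoice.pdf");
```

Note that the page layout is decided entirely by the converter's rendering engine, not by this code.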

Pros:

  • Familiar Markup: Easy to create HTML and CSS.
  • Dynamic Content: Easy to create dynamic content.
  • Quick Results: Fast to implement.

Cons:

  • Inconsistent Rendering: HTML-to-PDF converters can struggle with consistent layout and styling due to differences in how rendering engines interpret HTML/CSS.
  • Complex Layouts: HTML is not designed for print precision, so it can be difficult to achieve precise positioning, alignment, or page-specific layouts (such as fixed headers and footers).
  • Performance: Complex or large HTML documents can slow down the PDF creation process.

HTML-to-PDF conversion is a good choice for simple documents or when you need to create a PDF from a web page. For complex, print-oriented documents such as invoices, reports, or contracts, however, HTML-to-PDF converters quickly reach their limits.

Programmatic PDF Generation

After reading about the limitations of the HTML approach, it may seem obvious to use an approach that positions elements more precisely.

Programmatic PDF creation means building PDF documents through a programming interface. This approach is more flexible and allows you to create complex documents with precise layout and styling. In your code, you essentially place each element at specific x and y coordinates on a page.

Pros:

  • Fine Control: Precise positioning of elements on a page.
  • Customization: Full control over the layout and styling of the document.
  • Consistency: Because layouts are coded, there's less risk of inconsistencies between platforms.

Cons:

  • Labor-Intensive: Developers have to manually define and position each element, which can be a time-consuming process for complex documents.
  • Requires Programming Skills: Non-technical users can't create or customize templates, so any change requires developer intervention.
  • Manual Pagination: Developers must explicitly handle page breaks and overflow content, adding complexity to the coding process.

Programmatic PDF generation is a good choice for complex documents that require precise layout and styling. However, it can be time-consuming and requires programming skills to create and maintain templates.
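To illustrate the manual pagination point, the following sketch shows the kind of bookkeeping a developer ends up writing when placing content by coordinates. It is not tied to any PDF library: Paginate is a hypothetical helper, all measurements are assumed to be in points, and the page size (A4 height of 842 points with 50-point margins) is just an example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: assigns content blocks to pages based on their heights.
// Returns, for each page, the indexes of the blocks placed on it.
static List<List<int>> Paginate(IReadOnlyList<double> blockHeights, double pageHeight, double margin)
{
    double usableHeight = pageHeight - 2 * margin;
    var pages = new List<List<int>> { new List<int>() };
    double y = 0; // current vertical offset inside the usable area

    for (int i = 0; i < blockHeights.Count; i++)
    {
        double height = blockHeights[i];
        if (height > usableHeight)
            throw new ArgumentException($"Block {i} does not fit on a single page.");

        // Start a new page when the block would overflow the current one.
        if (y + height > usableHeight)
        {
            pages.Add(new List<int>());
            y = 0;
        }

        pages[^1].Add(i);
        y += height;
    }

    return pages;
}

// Example: four blocks on A4 pages (842 pt high) with 50 pt margins.
var layout = Paginate(new double[] { 300, 300, 300, 100 }, pageHeight: 842, margin: 50);
for (int p = 0; p < layout.Count; p++)
    Console.WriteLine($"Page {p + 1}: blocks {string.Join(", ", layout[p])}");
// Prints:
// Page 1: blocks 0, 1
// Page 2: blocks 2, 3
```

Real documents add further cases on top of this, such as splitting a table across pages or repeating headers, which is where the complexity grows.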

The following code snippet uses QuestPDF, an open-source .NET library for creating PDF documents. The code adds text to a header. It shows that the approach is flexible, but also that every element must be laid out in code and that the static text is baked into the code as well.

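// Method of a QuestPDF document class that exposes a Model property.
// Requires: using QuestPDF.Fluent; using QuestPDF.Helpers; using QuestPDF.Infrastructure;
void ComposeHeader(IContainer container)
{
    var titleStyle = TextStyle.Default.FontSize(20).SemiBold().FontColor(Colors.Blue.Medium);

    container.Row(row =>
    {
        // Left side: title and dates, stacked vertically.
        row.RelativeItem().Column(column =>
        {
            column.Item().Text($"Invoice #{Model.InvoiceNumber}").Style(titleStyle);

            column.Item().Text(text =>
            {
                text.Span("Issue date: ").SemiBold();
                text.Span($"{Model.IssueDate:d}");
            });

            column.Item().Text(text =>
            {
                text.Span("Due date: ").SemiBold();
                text.Span($"{Model.DueDate:d}");
            });
        });

        // Right side: fixed-size placeholder, for example for a logo.
        row.ConstantItem(100).Height(50).Placeholder();
    });
}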

Higher Maintenance and Technical Debt

Because layout logic in programmatic PDFs is embedded in the code, any template adjustment requires a code change, even for minor tweaks. This results in:

  • Longer feedback loops for design changes.
  • Dependence on developers for template adjustments, limiting flexibility.
  • Higher maintenance costs, as updates to templates or business rules require ongoing development time.

Template-Based PDF Generation

Template-based systems allow users to visually design PDF templates, often with a drag-and-drop interface that closely resembles a word processor. This setup allows for WYSIWYG (What You See Is What You Get) design, where the template looks exactly like the final output.

TX Text Control can be used not only to create PDFs programmatically from scratch, as described in the second approach, but also supports the most flexible approach: WYSIWYG templates. TX Text Control comes with a full-featured, customizable, and programmable document editor that can be integrated into a web application to allow non-technical users to create pixel-perfect templates. The SDK also provides a non-UI engine that can be fully embedded in applications; it takes such a template and merges data from various data sources into it.

This concept allows you to provide your users with a very easy-to-use interface for creating templates, but also gives you full flexibility in the merge process.

Pros:

  • User-Friendly Design: WYSIWYG design tools allow non-developers to create templates with ease, making it accessible to a wider range of users.
  • Dynamic and Consistent Layout: Template-based systems automatically manage page breaks and content overflow, ensuring consistent layout without the need for custom code.
  • High Fidelity Output: Users can see exactly how their document will look, reducing the need for testing and revisions.
  • MS Word Compatibility: Templates can be imported from and exported to Microsoft Word, making it easy to reuse existing documents.

Cons:

  • Learning Curve: Developers need to learn how to integrate the template system into their application.

TX Text Control combines powerful dynamic document generation features, such as merge fields and repeating and conditional merge blocks, with easy-to-use template design.

The following code uses TX Text Control to load a pre-designed template and merge JSON data into it to create a pixel-perfect PDF.

using TXTextControl.DocumentServer.Fields;

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    // Interpret MS Word fields in the template as merge fields.
    TXTextControl.LoadSettings ls = new TXTextControl.LoadSettings()
    {
        ApplicationFieldFormat = TXTextControl.ApplicationFieldFormat.MSWord,
        LoadSubTextParts = true
    };

    // Load the pre-designed DOCX template.
    tx.Load("template.docx", TXTextControl.StreamType.WordprocessingML, ls);

    using (TXTextControl.DocumentServer.MailMerge mailMerge =
        new TXTextControl.DocumentServer.MailMerge())
    {
        var jsonData = System.IO.File.ReadAllText("data.json");

        // Merge the JSON data into the loaded template.
        mailMerge.TextComponent = tx;
        mailMerge.MergeJsonData(jsonData);
    }

    // Export the merged document as a PDF.
    tx.Save("output.pdf", TXTextControl.StreamType.AdobePDF);
}
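The snippet above reads its data from data.json. As a rough sketch of what such a file could look like, the following code writes one using System.Text.Json. All property names (InvoiceNumber, IssueDate, DueDate, LineItems, and so on) are hypothetical: they have to match the merge field names used in your own template, and the nested LineItems array only makes sense if the template is assumed to contain a repeating merge block with that name.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical invoice data; the names must match the merge fields in the template.
var invoice = new
{
    InvoiceNumber = "1001",
    IssueDate = "2024-01-15",
    DueDate = "2024-02-14",
    // Assumption: the template contains a repeating merge block named "LineItems".
    LineItems = new[]
    {
        new { Description = "Consulting", Quantity = 2, UnitPrice = 150.00m },
        new { Description = "Support plan", Quantity = 1, UnitPrice = 49.50m }
    }
};

// Wrap the record in an array so that more records can be added later.
string json = JsonSerializer.Serialize(
    new[] { invoice },
    new JsonSerializerOptions { WriteIndented = true });

File.WriteAllText("data.json", json);
Console.WriteLine(json);
```

Because the data is plain JSON, the same template can be fed from any source that can produce this structure, without touching the layout.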

Learn More

A very efficient way to create pixel-perfect, error-free documents in an automated process is to create PDFs from MS Word templates. The following article shows how to use the TX Text Control libraries to merge MS Word templates with JSON data to generate PDFs in an ASP.NET Core application.

Generate PDF Documents from MS Word DOCX Templates in ASP.NET Core C#

Conclusion

Each of these methods serves a different purpose, and the choice often depends on the type of document, the complexity of the layout, and the technical expertise available. Solutions such as TX Text Control, with its WYSIWYG design and flexible data merging, provide an ideal middle ground, combining the visual simplicity of template design with robust PDF generation capabilities.