ZUGFeRD / Factur-X

The ZUGFeRD / Factur-X standard is a hybrid electronic invoice format that consists of two parts:

  • A visual, human-readable PDF representation of the invoice.
  • An XML file that contains invoice data in a structured form that can be processed automatically.

TX Text Control X19 supports not only embedding attachments, such as the XML representation, in PDF documents, but also extracting the XML attachment again. Another new feature in TX Text Control X19 is the ability to search the text lines of a PDF document.

Search within PDF Documents

The namespace TXTextControl.DocumentServer.PDF.Contents contains the new class Lines that can be used to import text coordinates from a PDF document.


The Find method can be used to search for strings and to get information about the location of that string:
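As a library-independent illustration (this is not the TX Text Control API), the idea of searching imported text lines and reporting where each match is located can be sketched in Python. The TextLine class, its fields, and the functions find_text and find_pattern are hypothetical names chosen for this sketch, and the sample lines are made up.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TextLine:
    """Hypothetical stand-in for one imported PDF text line."""
    text: str
    x: float       # left edge, in points
    y: float       # top edge, in points
    width: float   # in points
    height: float  # in points


def find_text(lines, needle):
    """Return every line whose text contains the literal string `needle`."""
    return [line for line in lines if needle in line.text]


def find_pattern(lines, pattern):
    """Return every line whose text matches the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line.text)]


# Made-up sample data standing in for lines imported from a PDF.
lines = [
    TextLine("Invoice No. 471102", 72, 90, 160, 12),
    TextLine("Lorem ipsum dolor sit amet", 72, 120, 240, 12),
    TextLine("Total: 529.87 EUR", 72, 150, 130, 12),
]

# Each hit carries both the matched text and its location on the page.
for hit in find_text(lines, "ipsum"):
    print(hit.text, (hit.x, hit.y, hit.width, hit.height))
```

Returning the whole line object rather than only the text keeps the coordinates available, which is what makes a later location-based search possible.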


Other overloads of the Find method allow you to search for a regular expression or for lines within a specific area, such as a rectangle or a radius. The following code returns all lines within a radius of 200 points around a specific location and includes lines that only partially overlap the given radius:
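Independently of the TX Text Control API, the difference between fully contained and partially overlapping lines can be illustrated with a small geometric sketch in Python. It assumes that each line is described by an axis-aligned rectangle in points; TextLine, find_in_radius, and the include_partial flag are hypothetical names, and the coordinates are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextLine:
    """Hypothetical stand-in for one PDF text line and its bounding box."""
    text: str
    x: float       # left edge, in points
    y: float       # top edge, in points
    width: float
    height: float


def overlaps_circle(line, cx, cy, radius):
    """True if any part of the line's rectangle lies within the circle."""
    # Clamp the circle centre to the rectangle to get its nearest point.
    nearest_x = min(max(cx, line.x), line.x + line.width)
    nearest_y = min(max(cy, line.y), line.y + line.height)
    return (nearest_x - cx) ** 2 + (nearest_y - cy) ** 2 <= radius ** 2


def inside_circle(line, cx, cy, radius):
    """True only if all four corners of the rectangle lie within the circle."""
    corners = [
        (line.x, line.y),
        (line.x + line.width, line.y),
        (line.x, line.y + line.height),
        (line.x + line.width, line.y + line.height),
    ]
    return all((px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
               for px, py in corners)


def find_in_radius(lines, cx, cy, radius, include_partial=True):
    """Return lines within `radius` of (cx, cy), optionally with partial hits."""
    test = overlaps_circle if include_partial else inside_circle
    return [line for line in lines if test(line, cx, cy, radius)]


lines = [
    TextLine("fully inside", 250, 290, 100, 12),
    TextLine("partially overlapping", 450, 300, 200, 12),
    TextLine("outside", 72, 700, 100, 12),
]

for hit in find_in_radius(lines, 300, 300, 200, include_partial=True):
    print(hit.text)
```

Because a circle is convex, checking the four corners is enough to decide full containment, while the nearest-point test decides whether the rectangle touches the circle at all.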


Figure: Find strings in PDFs

Validating Invoices

Now, let's bring these features together: importing attachments and searching within the visual representation of the electronic invoice. To make our lives easier, we are using a very well-maintained NuGet package that implements the ZUGFeRD / Factur-X invoice object.


The following code uses TX Text Control to extract the XML representation of the invoice in order to load it into an InvoiceDescriptor object:
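To illustrate which values are read from the embedded XML, the following Python sketch parses a minimal invoice with the standard library rather than with TX Text Control and the NuGet package. It assumes the Cross Industry Invoice element names used by Factur-X 1.0 / ZUGFeRD 2.x; other versions may use different element paths. The sample data and the function name read_key_values are made up for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespaces assumed from the Cross Industry Invoice syntax.
NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:"
           "ReusableAggregateBusinessInformationEntity:100",
}

# Made-up, heavily shortened sample; a real invoice has many more elements.
SAMPLE_XML = """\
<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100">
  <rsm:ExchangedDocument>
    <ram:ID>471102</ram:ID>
  </rsm:ExchangedDocument>
  <rsm:SupplyChainTradeTransaction>
    <ram:ApplicableHeaderTradeSettlement>
      <ram:SpecifiedTradeSettlementHeaderMonetarySummation>
        <ram:TaxTotalAmount currencyID="EUR">56.87</ram:TaxTotalAmount>
        <ram:GrandTotalAmount>529.87</ram:GrandTotalAmount>
      </ram:SpecifiedTradeSettlementHeaderMonetarySummation>
    </ram:ApplicableHeaderTradeSettlement>
  </rsm:SupplyChainTradeTransaction>
</rsm:CrossIndustryInvoice>
"""

SUMMATION = ("rsm:SupplyChainTradeTransaction"
             "/ram:ApplicableHeaderTradeSettlement"
             "/ram:SpecifiedTradeSettlementHeaderMonetarySummation")


def read_key_values(xml_text):
    """Extract invoice number, grand total, and tax total from the XML.

    Raises an exception if an amount element is missing or not numeric.
    """
    root = ET.fromstring(xml_text)
    return {
        "invoice_no": root.findtext("rsm:ExchangedDocument/ram:ID",
                                    namespaces=NS),
        "grand_total": Decimal(root.findtext(SUMMATION + "/ram:GrandTotalAmount",
                                             namespaces=NS)),
        "tax_total": Decimal(root.findtext(SUMMATION + "/ram:TaxTotalAmount",
                                           namespaces=NS)),
    }


values = read_key_values(SAMPLE_XML)
print(values["invoice_no"], values["grand_total"], values["tax_total"])
```

Amounts are kept as Decimal values so that they can later be compared and formatted without floating-point rounding.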


The method IsValidZUGFeRD searches for three key values in the XML representation: the total amount, the tax total amount, and the invoice number. These values are then matched against the PDF representation. If they match, it is highly likely that the invoice is valid. In real-world applications, you would probably connect to your ERP system to retrieve specific information about the invoice number and match more values, such as addresses and line item numbers.
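The matching idea described above can be sketched in Python without any PDF library. The sketch assumes that the visible text lines are already available as plain strings and that amounts may be printed with either a decimal point or a decimal comma. The function names is_valid_invoice and amount_spellings are hypothetical, and the sample lines are made up.

```python
from decimal import Decimal


def amount_spellings(amount):
    """Return plausible printed forms of an amount: '529.87' and '529,87'."""
    plain = f"{amount:.2f}"
    return {plain, plain.replace(".", ",")}


def is_valid_invoice(invoice_no, grand_total, tax_total, pdf_lines):
    """Check that all three XML values also appear in the visible PDF text."""
    def printed(candidates):
        # A value counts as found if any spelling occurs in any line.
        return any(candidate in line
                   for candidate in candidates
                   for line in pdf_lines)

    return (printed({invoice_no})
            and printed(amount_spellings(grand_total))
            and printed(amount_spellings(tax_total)))


# Made-up text lines standing in for the visual representation.
pdf_lines = [
    "Rechnung Nr. 471102",
    "Umsatzsteuer 19% 56,87",
    "Gesamtbetrag 529,87 EUR",
]

print(is_valid_invoice("471102", Decimal("529.87"), Decimal("56.87"), pdf_lines))
```

This is deliberately a coarse substring check: it ignores thousands separators and would also accept a value that appears inside a longer number, so a stricter comparison would be needed before relying on it.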


The methods above can be called as shown in the following code:


We are working on more features that help you integrate electronic document processing into your business workflows. Our goal is to provide you with libraries that cover the complete document automation workflow in your applications. Let us know what else you are looking for.

Stay tuned for more document processing features of TX Text Control X19!