X19 Sneak Peek: Validating ZUGFeRD / Factur-X Invoices with TX Text Control
ZUGFeRD / Factur-X documents can be created and extracted using TX Text Control X19. This article shows how to validate these electronic invoices by matching invoice elements against the visual PDF representation of the document.

ZUGFeRD / Factur-X
The ZUGFeRD / Factur-X standard is a hybrid electronic invoice format that consists of two parts:
- A PDF visual, human-readable representation of the invoice.
- An XML file that contains invoice data in a structured form that can be processed automatically.
TX Text Control X19 supports both embedding attachments (such as the XML representation) in PDF documents and extracting them again. Another new feature in TX Text Control X19 is the ability to search the text lines of a PDF document.
Search within PDF Documents
The namespace TXTextControl.DocumentServer.PDF.Contents contains the new class Lines that can be used to import text coordinates from a PDF document.
var clPDFInvoice = new TXTextControl.DocumentServer.PDF.Contents.Lines("ZUGFeRD.pdf");
The Find method can be used to search for strings and to get information about the location of that string:
List<ContentLine> lines = clPDFInvoice.Find("Amount");
var iPageNumber = lines[0].Page;
Other overloads of the Find method allow you to search for a regular expression or for lines in a specific area, such as a rectangle or a radius. The following code returns all lines within a radius of 200 points around a specific location, including lines that only partially overlap the given radius:
List<ContentLine> lines = clPDFInvoice.Find(new PointF(100,100), 200, true);
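To make the difference between "fully inside" and "partially overlapping" concrete, here is a minimal geometric sketch. It is not the TX Text Control implementation; the Box type and the RadiusSearch helper are hypothetical names introduced only for this illustration, and the sketch assumes each text line is described by an axis-aligned bounding box.

```csharp
using System;

// Hypothetical stand-in for a text line's bounding box (not a TX Text Control type).
public struct Box
{
    public float X, Y, Width, Height;

    public Box(float x, float y, float width, float height)
    {
        X = x; Y = y; Width = width; Height = height;
    }
}

public static class RadiusSearch
{
    // True if any part of the box lies within the circle (partial overlap counts).
    public static bool Overlaps(Box box, float cx, float cy, float radius)
    {
        // Nearest point of the box to the circle center.
        float nx = Math.Max(box.X, Math.Min(cx, box.X + box.Width));
        float ny = Math.Max(box.Y, Math.Min(cy, box.Y + box.Height));
        return DistSq(nx, ny, cx, cy) <= radius * radius;
    }

    // True only if all four corners of the box lie within the circle.
    public static bool Contains(Box box, float cx, float cy, float radius)
    {
        float r2 = radius * radius;
        float right = box.X + box.Width;
        float bottom = box.Y + box.Height;
        return DistSq(box.X, box.Y, cx, cy) <= r2
            && DistSq(right, box.Y, cx, cy) <= r2
            && DistSq(box.X, bottom, cx, cy) <= r2
            && DistSq(right, bottom, cx, cy) <= r2;
    }

    private static float DistSq(float x1, float y1, float x2, float y2)
    {
        float dx = x1 - x2;
        float dy = y1 - y2;
        return dx * dx + dy * dy;
    }
}
```

Under this reading, passing true as the last argument of Find would correspond to the Overlaps test, while false would correspond to the stricter Contains test.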
Validating Invoices
Now, let's bring these features together: importing attachments and searching within the visual representation of the electronic invoice. To make life easier, we use a well-maintained NuGet package that implements the ZUGFeRD / Factur-X invoice object.
The following code uses TX Text Control to extract the XML representation of the invoice in order to load it into an InvoiceDescriptor object:
private InvoiceDescriptor ImportZUGFeRD(string filename) {
    // temporary ServerTextControl to load the PDF
    using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
        tx.Create();
        TXTextControl.LoadSettings ls = new TXTextControl.LoadSettings() {
            PDFImportSettings = TXTextControl.PDFImportSettings.LoadEmbeddedFiles
        };
        // load the PDF; its embedded files are returned in ls.EmbeddedFiles
        tx.Load(filename, TXTextControl.StreamType.AdobePDF, ls);
        // the first embedded file is assumed to be the invoice XML
        byte[] byteArray = (byte[])ls.EmbeddedFiles[0].Data;
        // wrap the bytes in a stream and dispose it after parsing
        using (MemoryStream stream = new MemoryStream(byteArray)) {
            // return the invoice object structure
            return InvoiceDescriptor.Load(stream);
        }
    }
}
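ImportZUGFeRD takes the first embedded file and treats it as the invoice XML. A PDF can carry other attachments, so a cheap sanity check before parsing may be worthwhile. The following sketch uses only the .NET base class library; EmbeddedXmlCheck and TryGetRootName are hypothetical names, and the check only confirms that the bytes are well-formed XML and reports the root element name.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class EmbeddedXmlCheck
{
    // Returns the local name of the root element,
    // or null if the bytes are missing or not well-formed XML.
    public static string TryGetRootName(byte[] data)
    {
        if (data == null || data.Length == 0)
            return null;

        try
        {
            using (var stream = new MemoryStream(data))
            {
                XDocument doc = XDocument.Load(stream);
                return doc.Root?.Name.LocalName;
            }
        }
        catch (XmlException)
        {
            // not XML (for example a nested PDF or an image attachment)
            return null;
        }
    }
}
```

The byte array read from the embedded file could be passed to TryGetRootName before InvoiceDescriptor.Load is called. Which root element name to expect depends on the ZUGFeRD / Factur-X version and profile you receive, so verify it against your own sample documents.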
The method IsValidZUGFeRD takes three key values from the XML representation (the total amount, the tax total amount and the invoice number) and searches for them in the visible PDF representation. If all three values are found, it is highly likely that the invoice is valid. In real-world applications, you would probably connect to your ERP system to retrieve specific information about the invoice number and match more values such as addresses and line item numbers.
private bool IsValidZUGFeRD(InvoiceDescriptor invoice, Lines pdfInvoice) {
    // amounts in the visible PDF are expected in German number format
    CultureInfo culture = new CultureInfo("de-DE");
    // add key values to a list for validation
    List<string> validationValues = new List<string>() {
        invoice.TaxTotalAmount.ToString(culture),
        invoice.GrandTotalAmount.ToString(culture),
        invoice.InvoiceNo
    };
    // check if all key values exist in the visible PDF
    foreach (string value in validationValues) {
        if (pdfInvoice.Find(value).Count == 0)
            return false;
    }
    // all good
    return true;
}
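The match in IsValidZUGFeRD depends on the amount being formatted exactly as it is printed in the PDF. As a hedged illustration, the sketch below generates a few plausible renderings of one amount and checks them against plain strings that stand in for PDF text lines. AmountFormats, Candidates and AppearsIn are hypothetical helpers, and the German-style separators are set explicitly (instead of using the de-DE culture) so that the sketch behaves the same on systems without culture data.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class AmountFormats
{
    // German-style separators: comma for decimals, dot for thousands.
    private static readonly NumberFormatInfo German = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Returns distinct renderings of an amount that might appear in the visible PDF.
    public static List<string> Candidates(decimal amount)
    {
        var result = new List<string>();
        string[] all =
        {
            amount.ToString(German),                          // default format, no grouping
            amount.ToString("N2", German),                    // grouping, two decimals
            amount.ToString("F2", German),                    // no grouping, two decimals
            amount.ToString("F2", CultureInfo.InvariantCulture)
        };
        foreach (string s in all)
        {
            if (!result.Contains(s))
                result.Add(s);
        }
        return result;
    }

    // True if any candidate rendering occurs in any of the given text lines.
    public static bool AppearsIn(decimal amount, IEnumerable<string> lines)
    {
        List<string> candidates = Candidates(amount);
        foreach (string line in lines)
        {
            foreach (string candidate in candidates)
            {
                if (line.IndexOf(candidate, StringComparison.Ordinal) >= 0)
                    return true;
            }
        }
        return false;
    }
}
```

For example, an XML amount of 1234.5 formats to "1234,5" by default, which would not be found in a line that prints "1.234,50". Searching for each candidate in turn is one way to make the comparison less sensitive to how the invoice template formats numbers.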
The above methods can be called as follows:
// import the invoice data from the embedded XML
InvoiceDescriptor invoice = ImportZUGFeRD("ZUGFeRD.pdf");
// import the visible PDF
var clPDFInvoice = new TXTextControl.DocumentServer.PDF.Contents.Lines("ZUGFeRD.pdf");
// check validity
var valid = IsValidZUGFeRD(invoice, clPDFInvoice);
We are working on more features that help you integrate electronic document processing into your business workflows. Our goal is to provide libraries that cover the complete document automation workflow in your applications. Let us know what else you are looking for.
Stay tuned for more document processing features of TX Text Control X19!
Jump to the other posts in this series:
- X19 Sneak Peek: Table of Contents
- X19 Sneak Peek: Embedded Files in Adobe PDF Documents
- X19 Sneak Peek: Integrated Barcode Support
- X19 Sneak Peek: Processing AcroForm Fields in Adobe PDF Documents
- X19 Sneak Peek: Storing Document Revisions in PDF/A-3b
- X19 Sneak Peek: Validating ZUGFeRD / Factur-X Invoices with TX Text Control
- X19 Sneak Peek: Changes for Keyboard Layout and Spell Checking
- X19 Sneak Peek: Manipulating MergeBlockInfo Objects