X19 Sneak Peek: Validating ZUGFeRD / Factur-X Invoices with TX Text Control

ZUGFeRD / Factur-X documents can be created and extracted using TX Text Control X19. This article shows how to validate these electronic invoices by matching invoice elements with the visual PDF representation of the document.

ZUGFeRD / Factur-X

The ZUGFeRD / Factur-X standard is a hybrid electronic invoice format that consists of two parts:

  • A visual, human-readable PDF representation of the invoice.
  • An XML file that contains invoice data in a structured form that can be processed automatically.

TX Text Control X19 supports not only the embedding of attachments in PDF documents (such as the XML representation), but also the extraction of the XML attachment. Another new feature in TX Text Control X19 is the ability to search the text lines of a PDF document.

Search within PDF Documents

The namespace TXTextControl.DocumentServer.PDF.Contents contains the new class Lines that can be used to import text coordinates from a PDF document.

var clPDFInvoice = new TXTextControl.DocumentServer.PDF.Contents.Lines("ZUGFeRD.pdf");

The Find method can be used to search for strings and to get information about the location of that string:

List<ContentLine> lines = clPDFInvoice.Find("Amount");
var iPageNumber = lines[0].Page;

Other overloads of the Find method let you search with a regular expression or for lines in a specific area such as a rectangle or a radius. The following code returns all lines within a radius of 200 points around a specific location, including lines that only partially overlap the given radius:

List<ContentLine> lines = clPDFInvoice.Find(new PointF(100,100), 200, true);

Find strings in PDFs
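
To make the last parameter more concrete, the following sketch illustrates the geometry behind a radius search. It is not the Text Control implementation: the class and method names are hypothetical, and it assumes that each text line is described by an axis-aligned bounding box. A box partially overlaps the circle when its closest point to the center lies within the radius, and it is fully contained only when all four corners do.

```csharp
using System;
using System.Drawing;

// Illustration only: hypothetical helper, not part of TX Text Control.
public static class RadiusSearch
{
    // True if any part of the box lies within the circle.
    public static bool Overlaps(RectangleF box, PointF center, float radius)
    {
        // Closest point of the box to the circle center.
        float nearestX = Math.Max(box.Left, Math.Min(center.X, box.Right));
        float nearestY = Math.Max(box.Top, Math.Min(center.Y, box.Bottom));
        return DistanceSquared(nearestX, nearestY, center) <= radius * radius;
    }

    // True only if all four corners of the box lie within the circle.
    public static bool IsContained(RectangleF box, PointF center, float radius)
    {
        float r2 = radius * radius;
        return DistanceSquared(box.Left, box.Top, center) <= r2
            && DistanceSquared(box.Right, box.Top, center) <= r2
            && DistanceSquared(box.Left, box.Bottom, center) <= r2
            && DistanceSquared(box.Right, box.Bottom, center) <= r2;
    }

    private static float DistanceSquared(float x, float y, PointF p)
    {
        float dx = x - p.X, dy = y - p.Y;
        return dx * dx + dy * dy;
    }
}
```

A line box that starts inside the circle but extends beyond it satisfies Overlaps but not IsContained, which is the case the boolean argument in the call above decides.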

Validating Invoices

Now, let's bring these two features together: importing attachments and searching within the visual representation of the electronic invoice. To make our lives easier, we use a well-maintained NuGet package that implements the ZUGFeRD / Factur-X invoice object.

ZUGFeRD-csharp

The following code uses TX Text Control to extract the XML representation of the invoice in order to load it into an InvoiceDescriptor object:

// requires: using System; using System.IO; using s2industries.ZUGFeRD;
private InvoiceDescriptor ImportZUGFeRD(string filename) {

  // temporary ServerTextControl to load the PDF
  using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
    tx.Create();

    TXTextControl.LoadSettings ls = new TXTextControl.LoadSettings() {
      PDFImportSettings = TXTextControl.PDFImportSettings.LoadEmbeddedFiles
    };

    // load the PDF; the embedded files are returned in the LoadSettings
    tx.Load(filename, TXTextControl.StreamType.AdobePDF, ls);

    // fail early if the PDF has no attachment
    if (ls.EmbeddedFiles == null || ls.EmbeddedFiles.Length == 0)
      throw new InvalidOperationException("The PDF contains no embedded files.");

    // wrap the data of the first embedded file in a MemoryStream
    byte[] byteArray = (byte[])ls.EmbeddedFiles[0].Data;

    using (MemoryStream stream = new MemoryStream(byteArray)) {
      // return the invoice object structure
      return InvoiceDescriptor.Load(stream);
    }
  }
}
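
The example above reads the first embedded file. A PDF/A-3 document can carry more than one attachment, so it can be safer to select the invoice XML by its file name. The following sketch is independent of TX Text Control: the Attachment type is a hypothetical stand-in for an embedded file, and the list of names reflects the file names commonly used by ZUGFeRD and Factur-X versions, which is an assumption you should verify against the profiles you need to accept.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an embedded file: a name plus its raw bytes.
public sealed class Attachment
{
    public Attachment(string fileName, byte[] data)
    {
        FileName = fileName;
        Data = data;
    }

    public string FileName { get; }
    public byte[] Data { get; }
}

public static class InvoiceAttachmentSelector
{
    // File names commonly used for the invoice XML (assumption; extend as needed).
    // The comparison below ignores case, so "ZUGFeRD-invoice.xml" matches as well.
    private static readonly string[] KnownNames =
        { "factur-x.xml", "zugferd-invoice.xml", "xrechnung.xml" };

    // Returns the first attachment with a known invoice file name,
    // or null if there is no such attachment.
    public static Attachment FindInvoiceXml(IEnumerable<Attachment> attachments)
    {
        return attachments.FirstOrDefault(a =>
            KnownNames.Contains(a.FileName, StringComparer.OrdinalIgnoreCase));
    }
}
```

The Data of the selected attachment can then be wrapped in a MemoryStream and passed to InvoiceDescriptor.Load in the same way as in the method above.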

The method IsValidZUGFeRD reads three key values from the XML representation: the total amount, the tax total amount, and the invoice number. It then searches for each of these values in the visible PDF representation. If all of them are found, it is highly likely that the invoice is valid. In real-world applications, you would probably connect to your ERP system to retrieve specific information about the invoice number and match more values such as addresses and line item numbers.

private bool IsValidZUGFeRD(InvoiceDescriptor invoice, Lines pdfInvoice) {
  // add key values to a list for validation
  List<string> validationValues = new List<string>();
  validationValues.Add(invoice.TaxTotalAmount.ToString(new CultureInfo("de-DE")));
  validationValues.Add(invoice.GrandTotalAmount.ToString(new CultureInfo("de-DE")));
  validationValues.Add(invoice.InvoiceNo);

  // check, if key values exist in visible PDF
  foreach (string value in validationValues) {
    if (pdfInvoice.Find(value).Count == 0)
      return false;
  }

  // all good
  return true;
}
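
One detail to watch for in this approach is number formatting. Calling ToString on a decimal with only a culture uses that culture's decimal separator, but it adds no group separators and keeps only the decimal places the value carries. If the visible invoice prints 1.234,50, a value of 1234.5 would not be found. The following helper is a minimal sketch that assumes the PDF shows amounts with two decimal places in German notation; the class and method names are illustrative and not part of any library.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class AmountFormats
{
    private static readonly CultureInfo German = CultureInfo.GetCultureInfo("de-DE");

    // Returns the spellings an amount may have on the visible invoice:
    // with group separators ("N2") and without them ("F2").
    public static IReadOnlyList<string> Candidates(decimal amount)
    {
        return new[]
        {
            amount.ToString("N2", German),
            amount.ToString("F2", German)
        };
    }
}
```

For 1234.5m, Candidates returns "1.234,50" and "1234,50". The validation could then treat an amount as present if the Find method returns a hit for any of the candidates.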

The methods above can be called as follows:

// import the invoice data from the embedded XML
InvoiceDescriptor invoice = ImportZUGFeRD("ZUGFeRD.pdf");

// import the visible PDF
var clPDFInvoice = new TXTextControl.DocumentServer.PDF.Contents.Lines("ZUGFeRD.pdf");

// check validity
var valid = IsValidZUGFeRD(invoice, clPDFInvoice);

We are working on more features that help you integrate electronic document processing into your business workflows. Our goal is to provide you with libraries that cover the complete document automation workflow in your applications. Let us know what else you are looking for.

Stay tuned for more document processing features of TX Text Control X19!
