Validating PDF/UA Documents in .NET C#

Creating accessible and compliant PDF documents is becoming an increasingly important requirement across industries. In this blog post, we explore how to validate PDF/UA documents using Text Control's .NET libraries, ensuring your generated PDFs meet accessibility standards with ease.

TX Text Control 34.0 will allow developers to generate PDF/UA and PDF/A-3a documents directly, which is a significant advancement for long-term, compliant document archiving.

What Are PDF/UA and PDF/A-3a?

PDF/UA (Universal Accessibility) establishes standards that ensure PDF documents are accessible to all, including those using assistive technologies, such as screen readers.

PDF/UA requires that a document's structure, reading order, and element descriptions are correctly defined so that all content can be interpreted semantically.

PDF/A-3a, on the other hand, is part of the ISO-standardized family of archival PDF formats. It is designed so that a document can be reproduced in the future exactly as it appears today, including embedded attachments and accessible content (the conformance level "a" stands for "accessible").

Both standards require that documents contain a logical structure, semantic tagging, and metadata that accurately describes the content.

Why Validation Is Essential

During the implementation of PDF/UA generation or when designing templates, some tags or descriptive elements often go missing or are applied incorrectly. Even if a document appears correct visually, it may not meet accessibility or archival standards and therefore fail compliance checks.

For example:

  • A figure might be missing a descriptive alternative text.
  • A table might lack proper header definitions.
  • The reading order or tag hierarchy might be broken.
  • Metadata such as language or document title might not be set.

Without validation, these issues can easily go unnoticed.

PDF/UA Validation

With version 34.0, we will introduce a validation library designed to help developers check the compliance of generated PDF documents.

This library analyzes:

  • The document structure tree and tagging hierarchy
  • Metadata and language settings
  • Descriptive texts for tables, figures, form fields and hyperlinks
  • Table header and data cell relations
  • And other accessibility-related properties required by the PDF/UA specification

It generates detailed reports in a structured JSON format, as well as textual output for console applications. This allows developers to integrate validation directly into automated testing or quality assurance (QA) pipelines.

Example Usage in C#

Here is a simple example of how to use the validation library in a C# application:

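using TXTextControl.PDF.Validation;

// Validate the PDF file and collect the findings in a report.
var report = PdfUaValidator.Validate("documents/hyperlink.pdf");

// Write the findings to the console as plain text.
report.PrintText();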

In this example, PdfUaValidator.Validate is called with the path of the PDF document and returns a report. The results are printed to the console and can be serialized to JSON for further analysis.

[Image: console output of the PDF/UA validation]

The resulting JSON report lists every rule that was checked, whether it passed, and details about the tables, links, figures, and form fields found in the document:

{
  "filePath": "documents/hyperlink.pdf",
  "pdfVersion": "1.7",
  "isPass": true,
  "documentTitle": "This is a sample PDF/UA document",
  "documentLanguage": "en-US",
  "findings": [
    {
      "ruleId": "UA-CONFORMANCE",
      "severity": "Info",
      "passed": true,
      "message": "PDF/UA-1 conformance declaration found in XMP."
    },
    {
      "ruleId": "PDFA-CONFORMANCE",
      "severity": "Info",
      "passed": true,
      "message": "PDF/A-3A declaration found in XMP."
    },
    {
      "ruleId": "PDF-HEADER",
      "severity": "Error",
      "passed": true,
      "message": "Found PDF header %PDF-1.7."
    },
    {
      "ruleId": "PDF-XREF",
      "severity": "Warning",
      "passed": true,
      "message": "Cross-reference table/stream appears present."
    },
    {
      "ruleId": "UA-CATALOG",
      "severity": "Error",
      "passed": true,
      "message": "Catalog dictionary present."
    },
    {
      "ruleId": "UA-MARKED",
      "severity": "Error",
      "passed": true,
      "message": "/MarkInfo \u003C\u003C /Marked true \u003E\u003E found (Tagged PDF)."
    },
    {
      "ruleId": "UA-STRUCT",
      "severity": "Error",
      "passed": true,
      "message": "/StructTreeRoot present."
    },
    {
      "ruleId": "UA-MCID-ANCHOR",
      "severity": "Info",
      "passed": true,
      "message": "Marked content (/MCID) present and at least one page has /StructParents anchors."
    },
    {
      "ruleId": "UA-TEXT-MAPPING",
      "severity": "Info",
      "passed": true,
      "message": "Font ToUnicode maps present (text is likely accessible)."
    },
    {
      "ruleId": "UA-LANG",
      "severity": "Error",
      "passed": true,
      "message": "/Lang present at document/page level."
    },
    {
      "ruleId": "UA-METADATA",
      "severity": "Warning",
      "passed": true,
      "message": "XMP metadata packet detected."
    },
    {
      "ruleId": "UA-TITLE",
      "severity": "Error",
      "passed": true,
      "message": "Document title found (Info or XMP dc:title)."
    },
    {
      "ruleId": "UA-TABS",
      "severity": "Warning",
      "passed": true,
      "message": "Page /Tabs setting present."
    },
    {
      "ruleId": "UA-FIG-ALT",
      "severity": "Info",
      "passed": true,
      "message": "Figures detected: 3; descriptive text tokens (/Alt or /ActualText): 3."
    },
    {
      "ruleId": "UA-LINK-DESC",
      "severity": "Info",
      "passed": true,
      "message": "Links: 2; all appear to have nearby tooltip/contents/ActualText."
    },
    {
      "ruleId": "UA-FORMS-TU",
      "severity": "Info",
      "passed": true,
      "message": "AcroForm detected; tooltips (/TU) count: 3."
    },
    {
      "ruleId": "UA-TABLE-A-SUMMARY",
      "severity": "Info",
      "passed": true,
      "message": "Tables: 3; all have /A with /Summary."
    },
    {
      "ruleId": "UA-TABLE-HEADERS",
      "severity": "Info",
      "passed": true,
      "message": "Tables with headers: OK=1, missing/invalid=0."
    }
  ],
  "tableSummaries": [
    {
      "index": 1,
      "summaryText": "Table description",
      "summaryRaw": "(\u00FE\u00FF\u0000T\u0000a\u0000b\u0000l\u0000e\u0000 \u0000d\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000o\u0000n)",
      "hasOTable": true,
      "source": "Obj 58: A 74 0 R",
      "thTotal": 3,
      "thWithScope": 3,
      "tdWithHeaders": 0,
      "headersOk": true,
      "headersApplicable": true
    },
    {
      "index": 2,
      "summaryText": "Inner table",
      "summaryRaw": "(\u00FE\u00FF\u0000I\u0000n\u0000n\u0000e\u0000r\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)",
      "hasOTable": true,
      "source": "Obj 59: A 96 0 R",
      "thTotal": 0,
      "thWithScope": 0,
      "tdWithHeaders": 0,
      "headersOk": true,
      "headersApplicable": false
    },
    {
      "index": 3,
      "summaryText": "Third table",
      "summaryRaw": "(\u00FE\u00FF\u0000T\u0000h\u0000i\u0000r\u0000d\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)",
      "hasOTable": true,
      "source": "Obj 60: A 122 0 R",
      "thTotal": 0,
      "thWithScope": 0,
      "tdWithHeaders": 0,
      "headersOk": true,
      "headersApplicable": false
    }
  ],
  "links": [
    {
      "index": 1,
      "linkText": "Descriptive Text",
      "linkTextRaw": "(\u00FE\u00FF\u0000D\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000v\u0000e\u0000 \u0000T\u0000e\u0000x\u0000t)",
      "targetType": "URI",
      "targetValue": "http://www.textcontrol.com",
      "targetRaw": "(http://www.textcontrol.com)",
      "source": "Annot window"
    },
    {
      "index": 2,
      "linkText": "Descriptive Text",
      "linkTextRaw": "(\u00FE\u00FF\u0000D\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000v\u0000e\u0000 \u0000T\u0000e\u0000x\u0000t)",
      "targetType": "URI",
      "targetValue": "http://www.textcontrol.com",
      "targetRaw": "(http://www.textcontrol.com)",
      "source": "Annot window"
    }
  ],
  "figures": [
    {
      "index": 1,
      "altText": "image in  table",
      "altRaw": "(\u00FE\u00FF\u0000i\u0000m\u0000a\u0000g\u0000e\u0000 \u0000i\u0000n\u0000 \u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)",
      "source": "Figure obj 55"
    },
    {
      "index": 2,
      "altText": "Barcode not in table",
      "altRaw": "(\u00FE\u00FF\u0000B\u0000a\u0000r\u0000c\u0000o\u0000d\u0000e\u0000 \u0000n\u0000o\u0000t\u0000 \u0000i\u0000n\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)",
      "source": "Figure obj 56"
    },
    {
      "index": 3,
      "altText": "Image description",
      "altRaw": "(\u00FE\u00FF\u0000I\u0000m\u0000a\u0000g\u0000e\u0000 \u0000d\u0000e\u0000s\u0000c\u0000r\u0000i\u0000p\u0000t\u0000i\u0000o\u0000n)",
      "source": "Figure obj 57"
    }
  ],
  "forms": [
    {
      "index": 1,
      "fieldName": "list item",
      "fieldNameRaw": "(list item)",
      "fieldType": "Ch",
      "tooltip": "list item",
      "tooltipRaw": "(list item)",
      "source": "Obj 10"
    },
    {
      "index": 2,
      "fieldName": "company_name",
      "fieldNameRaw": "(company_name)",
      "fieldType": "Tx",
      "tooltip": "company_name",
      "tooltipRaw": "(company_name)",
      "source": "Obj 13"
    },
    {
      "index": 3,
      "fieldName": "is_client",
      "fieldNameRaw": "(is_client)",
      "fieldType": "Btn",
      "tooltip": "is_client",
      "tooltipRaw": "(is_client)",
      "source": "Obj 15"
    }
  ],
  "standards": [
    {
      "standard": "PDF/UA",
      "part": "1",
      "conformance": null,
      "source": "XMP"
    },
    {
      "standard": "PDF/A",
      "part": "3",
      "conformance": "A",
      "source": "XMP"
    }
  ]
}
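
The "Raw" fields in the report (summaryRaw, linkTextRaw, altRaw) appear to hold the PDF literal string as stored in the file, where a leading FE FF byte order mark indicates UTF-16BE text. The following is a minimal sketch of how such a value could be decoded to its readable form. DecodePdfTextString is a hypothetical helper, not part of the validation library, and it does not handle escaped parentheses or other PDF escape sequences. It assumes .NET 5 or later for Encoding.Latin1.

```csharp
using System;
using System.Linq;
using System.Text;

// Two raw values taken from the sample report above.
Console.WriteLine(DecodePdfTextString(
    "(\u00FE\u00FF\u0000I\u0000n\u0000n\u0000e\u0000r\u0000 \u0000t\u0000a\u0000b\u0000l\u0000e)")); // Inner table
Console.WriteLine(DecodePdfTextString("(list item)")); // list item

// Hypothetical helper: converts a raw PDF literal string to readable text.
static string DecodePdfTextString(string raw)
{
    // Strip the enclosing parentheses of the PDF literal string, if present.
    string inner = raw.Length >= 2 && raw[0] == '(' && raw[^1] == ')'
        ? raw[1..^1]
        : raw;

    // Each character of the raw value stands for one byte of the original string.
    byte[] bytes = inner.Select(c => (byte)c).ToArray();

    // A leading FE FF byte order mark signals UTF-16BE text.
    if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
        return Encoding.BigEndianUnicode.GetString(bytes, 2, bytes.Length - 2);

    // Otherwise treat the bytes as Latin-1, which is adequate for plain ASCII values.
    return Encoding.Latin1.GetString(bytes);
}
```

This explains why "summaryText" reads "Inner table" while "summaryRaw" shows a null character before every letter: the raw value is the same text stored as two bytes per character.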

The returned Report object provides structured access to validation results, making integration into existing workflows easy.
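
For pipelines that only have the saved JSON file available, the report can also be inspected with System.Text.Json. The sketch below assumes a report shaped like the sample above (the property names findings, ruleId, severity, passed, and message are taken from it) and collects every finding that failed with severity "Error". The FailedErrors helper and the failing UA-TITLE entry are hypothetical and exist only for illustration. The raw string literal requires C# 11 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Trimmed stand-in for a saved report. The failing UA-TITLE entry is
// hypothetical and only gives the filter something to find.
string reportJson = """
{
  "filePath": "documents/hyperlink.pdf",
  "isPass": false,
  "findings": [
    { "ruleId": "UA-TITLE", "severity": "Error", "passed": false,
      "message": "Hypothetical failure for illustration." },
    { "ruleId": "UA-TABS", "severity": "Warning", "passed": true,
      "message": "Page /Tabs setting present." }
  ]
}
""";

List<string> failures = FailedErrors(reportJson);
foreach (string failure in failures)
    Console.WriteLine(failure);
Console.WriteLine(failures.Count == 0 ? "PASS" : "FAIL");

// Returns "ruleId: message" for every finding that failed with severity "Error".
static List<string> FailedErrors(string json)
{
    using JsonDocument doc = JsonDocument.Parse(json);
    return doc.RootElement.GetProperty("findings")
        .EnumerateArray()
        .Where(f => !f.GetProperty("passed").GetBoolean()
                 && f.GetProperty("severity").GetString() == "Error")
        .Select(f => $"{f.GetProperty("ruleId").GetString()}: {f.GetProperty("message").GetString()}")
        .ToList();
}
```

A build step could return a non-zero exit code whenever the list is not empty, so that a template change which drops a required tag fails the build rather than reaching production.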

Conclusion

The upcoming TX Text Control 34.0 release will provide developers with powerful tools to create and validate PDF/UA- and PDF/A-3a-compliant documents directly within their .NET applications. The validation library streamlines the process of ensuring accessibility and compliance, enabling developers to confidently meet industry standards.
