Attachments in PDF documents are useful when you need to include additional files (such as spreadsheets, images, or other documents) along with the main content. They ensure that all relevant information is bundled, allowing readers to access related materials without leaving the document. Attachments are ideal for scenarios that require supporting evidence or references, such as technical reports, legal documents, or presentations. Another common use case is electronic invoicing in standardized formats such as ZUGFeRD or XRechnung, where machine-readable data is attached to a human-readable invoice.

PDF/A-3: The Standard

PDF/A-3, part of the ISO 19005 archiving series, is the PDF/A conformance level that allows files of any format to be embedded as attachments in a PDF document. It is widely used in industries where long-term archiving and access to supplemental files are important, such as the financial and legal sectors.

PDF/A-3 documents can be created using document processing libraries such as TX Text Control, which provides a rich API for generating documents dynamically and attaching files to the resulting PDF document. This article explains how to extract attachments from an existing PDF document using TX Text Control.

Extracting Attachments from a PDF Document

Consider a PDF document opened in Acrobat Reader that contains multiple attachments. The attachments are listed in the Attachments panel.

PDF/A-3 document with attachments

Getting Started

To get started with extracting attachments from PDF documents using TX Text Control, you need the TX Text Control .NET Server for ASP.NET component installed on your development machine. You can download a free trial version from the TX Text Control website and follow the installation instructions provided.

Prerequisites

The following tutorial requires a trial version of TX Text Control .NET Server for ASP.NET.

  1. In Visual Studio, create a new Console App using .NET 8.

  2. In the Solution Explorer, select your created project and choose Manage NuGet Packages... from the Project main menu.

    Select Text Control Offline Packages from the Package source drop-down.

    Install the latest version of the following package:

    • TXTextControl.TextControl.ASP.SDK


Extracting Attachments

Extracting attachments with TX Text Control is straightforward. After a PDF document is loaded, the attachments are accessible through the EmbeddedFiles property (LoadSaveSettingsBase class, TXTextControl namespace) of the LoadSettings object passed to the Load method. The property returns an array of EmbeddedFile objects, one for each file embedded in the document. Add the following code to the Program.cs file to load the document, loop through any attachments, and save them externally as files.

using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // The load settings object receives the embedded files during loading.
    LoadSettings loadSettings = new LoadSettings();
    tx.Load("acme_agreement.pdf", StreamType.AdobePDF, loadSettings);

    // Write each attachment to disk under its original file name.
    foreach (var embeddedFile in loadSettings.EmbeddedFiles)
    {
        File.WriteAllBytes(embeddedFile.FileName, (byte[])embeddedFile.Data);
        Console.WriteLine($"{embeddedFile.FileName} written.");
    }
}

After running this code, each attachment is saved under its original file name, and the console lists the files that were written:

agreement.docx written.
data.json written.
data.xlsx written.
thumbnail.jpg written.

Because the code passes only the file name to File.WriteAllBytes, the extracted files are written to the application's current working directory.

PDF/A-3 document with attachments
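
If the attachments should go to a dedicated folder instead, the loop can be adapted as in the following minimal sketch. It is an illustration rather than part of the original sample: the "attachments" folder name is hypothetical, the null check assumes that EmbeddedFiles may be null when a PDF contains no attachments, and the type check assumes that Data may hold something other than a byte array. Only the FileName and Data members used in the sample above are relied on.

```csharp
using System;
using System.IO;
using TXTextControl;

// Hypothetical target folder (not part of the original sample).
string outputDir = Path.Combine(Directory.GetCurrentDirectory(), "attachments");
Directory.CreateDirectory(outputDir);

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    LoadSettings loadSettings = new LoadSettings();
    tx.Load("acme_agreement.pdf", StreamType.AdobePDF, loadSettings);

    // Assumption: EmbeddedFiles may be null if the PDF has no attachments.
    EmbeddedFile[] files = loadSettings.EmbeddedFiles ?? Array.Empty<EmbeddedFile>();

    foreach (EmbeddedFile embeddedFile in files)
    {
        // Drop any directory part of the stored attachment name.
        string fileName = Path.GetFileName(embeddedFile.FileName);
        if (string.IsNullOrEmpty(fileName))
            continue;

        // Assumption: skip entries whose Data is not a byte array
        // instead of failing on the cast.
        if (embeddedFile.Data is byte[] bytes)
        {
            string targetPath = Path.Combine(outputDir, fileName);
            File.WriteAllBytes(targetPath, bytes);
            Console.WriteLine($"{fileName} written to {outputDir}.");
        }
    }
}
```

Using Path.GetFileName keeps the output inside the chosen folder when an attachment name happens to include a path, and the pattern match avoids an InvalidCastException for entries that are not stored as bytes.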

Conclusion

PDF/A-3 is the archiving standard that allows files of any format to be embedded as attachments in PDF documents. TX Text Control provides an easy-to-use API for extracting these attachments. This article explained how to extract attachments from a PDF document using TX Text Control in a .NET application.