PDF/A-3 allows files of any type to be embedded in a PDF document. This turns the PDF from electronic paper into an electronic container that holds both the human-readable and the machine-readable version of a document. Applications can extract the machine-readable part of the PDF document in order to process it. A PDF/A-3 document can contain any number of embedded files for different processes.

In this article you will learn how to embed a plain text file into a PDF document and how to extract the attachment from the PDF document.

Adding Attachments

The following sample code shows how TX Text Control can be used to attach the text file to a PDF document:

using System; // for DateTime

// create a non-UI ServerTextControl instance
using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
    tx.Create();

    // set dummy content
    tx.Text = "PDF Document Content";

    // read the content of the attachment
    string sAttachment = System.IO.File.ReadAllText("attachment.txt");

    // create the attachment: file name, data, no additional metadata
    TXTextControl.EmbeddedFile attachment =
        new TXTextControl.EmbeddedFile(
            "attachment.txt",
            sAttachment,
            null) {
            Description = "My Text File",
            Relationship = "Unspecified",
            MIMEType = "text/plain", // registered MIME type for plain text
            CreationDate = DateTime.Now,
        };

    // attach the embedded file
    tx.DocumentSettings.EmbeddedFiles =
        new TXTextControl.EmbeddedFile[] { attachment };

    // save as PDF/A
    tx.Save("document.pdf", TXTextControl.StreamType.AdobePDFA);
}

The EmbeddedFile object represents the attachment. The constructor takes the file name, the data and optional metadata. In addition, the MIME type of the attachment (text/plain is the registered type for plain text), a textual description, a relationship and the creation date are set through properties.
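
When attachments of different formats are embedded, the MIME type has to be chosen per file. The following is a minimal sketch of how that choice could be made from the file extension. The class AttachmentMimeTypes and the method FromFileName are hypothetical helpers written for this illustration and are not part of TX Text Control; the table covers only a few common machine-readable formats.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of TX Text Control): maps a file
// extension to a MIME type for the EmbeddedFile.MIMEType property.
public static class AttachmentMimeTypes
{
    private static readonly Dictionary<string, string> ByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".txt", "text/plain" },
            { ".csv", "text/csv" },
            { ".xml", "application/xml" },
            { ".json", "application/json" },
        };

    // Returns the MIME type for the given file name, or a generic
    // binary type if the extension is missing or not in the table.
    public static string FromFileName(string fileName)
    {
        string extension = Path.GetExtension(fileName) ?? string.Empty;
        return ByExtension.TryGetValue(extension, out string mimeType)
            ? mimeType
            : "application/octet-stream";
    }
}
```

The result could then be assigned in the object initializer, for example MIMEType = AttachmentMimeTypes.FromFileName("attachment.txt").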

The relationship is an optional string describing the relationship between the embedded file and the containing document. It can be a predefined value or a custom value that follows the rules for second-class names (ISO 32000-1, Annex E). Predefined values are "Source", "Data", "Alternative", "Supplement" or "Unspecified".
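
Because the relationship is passed as a plain string, a typo such as "Suplement" would go unnoticed. A small check against the predefined values listed above can catch this early. This is only a sketch: AttachmentRelationships and IsPredefined are hypothetical names, and the check deliberately does not attempt to validate custom second-class names.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of TX Text Control): checks a
// relationship string against the predefined values.
public static class AttachmentRelationships
{
    private static readonly string[] Predefined =
        { "Source", "Data", "Alternative", "Supplement", "Unspecified" };

    // PDF names are case-sensitive, so the comparison is ordinal.
    public static bool IsPredefined(string relationship)
    {
        return Predefined.Contains(relationship, StringComparer.Ordinal);
    }
}
```

A value that fails this check is not necessarily invalid; it may be a custom name, which would need to be verified against Annex E separately.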

When opening the document in Adobe Acrobat Reader, you will find the attachment in the Attachments side-panel.

PDF Attachments

Extracting Attachments

The following code loads the created PDF file and loops through all embedded files to find the attachment. The attachment is then extracted and exported as a text file.

using System.Text; // for Encoding

// create a non-UI ServerTextControl instance
using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
    tx.Create();

    // load the PDF document; embedded files are returned through the LoadSettings
    TXTextControl.LoadSettings ls = new TXTextControl.LoadSettings();
    tx.Load("document.pdf", TXTextControl.StreamType.AdobePDF, ls);

    // read the attachments (guard in case the document contains none)
    TXTextControl.EmbeddedFile[] files =
        ls.EmbeddedFiles ?? new TXTextControl.EmbeddedFile[0];

    // find the specific attachment and save it
    foreach (TXTextControl.EmbeddedFile file in files) {
        if (file.Description == "My Text File") {
            // the data is returned as a byte array; decode it as UTF-8 text
            string sAttachment = Encoding.UTF8.GetString((byte[])file.Data);
            System.IO.File.WriteAllText("attachment_read.txt", sAttachment);
            break;
        }
    }
}
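
The extraction code casts the Data property, which is typed as object, directly to a byte array. A slightly more defensive variant could accept both a string and a byte array. The sketch below assumes that these are the only two representations that occur and that text attachments are UTF-8 encoded; AttachmentData and ToText are hypothetical names, not part of TX Text Control.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of TX Text Control): converts the
// object-typed attachment data to text.
public static class AttachmentData
{
    public static string ToText(object data)
    {
        switch (data)
        {
            case string text:
                // already text, return unchanged
                return text;
            case byte[] bytes:
                // assumption: text attachments are UTF-8 encoded
                return Encoding.UTF8.GetString(bytes);
            case null:
                throw new ArgumentNullException(nameof(data));
            default:
                throw new ArgumentException(
                    "Unsupported attachment data type: " + data.GetType().FullName,
                    nameof(data));
        }
    }
}
```

In the loop above, the decoding line could then read string sAttachment = AttachmentData.ToText(file.Data);.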