Smart Documents: Embed Document Versions in PDF/A-3 Containers

TX Text Control can embed files in PDF documents and extract them again. This makes it possible to create intelligent document containers that carry the original, editable document alongside the human-readable PDF.

PDF/A-3 allows files of any format to be embedded. PDF/A-3 documents enable the transition from electronic paper to an electronic container that contains both human- and machine-readable versions of a document. Applications can extract the machine-readable portion of the PDF document for processing. A PDF/A-3 document can contain an unlimited number of embedded documents for different processes.

Smart Document Container

This standard can also be used to store different versions of a document in the same container. This container is a PDF/A-3 document, where the cover document is always the latest version of the document. This human-readable version can be viewed in any PDF reader, and editable versions are attached to the document. When a new version is created, the editable version is attached and the cover document is replaced.

The following illustration shows the layers of a PDF/A-3 document: the PDF cover document (human-readable) and the attached editable documents. It also shows an annotation layer that represents JSON data added by the TX Text Control Document Viewer.

PDF/A-3 Document

The advantage of this scenario is that the PDF can be sent to anyone outside your infrastructure. The cover document is always the most current version, and recipients can view it with a standard PDF reader such as Adobe Acrobat Reader.
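The versioning scheme described above can be modeled independently of any PDF library. The following is a minimal sketch of the idea, not TX Text Control code: the tuple fields, the AddVersion helper, and the sample byte contents are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical in-memory model of a smart document container:
// one human-readable cover plus a growing list of editable attachments.
var attachments = new List<(string FileName, byte[] Data)>();
byte[] cover = Array.Empty<byte>();

// Adding a version appends the editable source and replaces the cover.
void AddVersion(byte[] editableSource, byte[] renderedCover) {
  attachments.Add(("original.tx", editableSource));
  cover = renderedCover;
}

AddVersion(Encoding.UTF8.GetBytes("draft v1"), Encoding.UTF8.GetBytes("cover v1"));
AddVersion(Encoding.UTF8.GetBytes("draft v2"), Encoding.UTF8.GetBytes("cover v2"));

// The cover reflects the latest version; earlier sources remain attached.
Console.WriteLine(Encoding.UTF8.GetString(cover));                   // cover v2
Console.WriteLine(attachments.Count);                                // 2
Console.WriteLine(Encoding.UTF8.GetString(attachments.Last().Data)); // draft v2
```

In this model the attachment list only grows, so every earlier editable version stays in the container while the cover is overwritten.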

Create the Container

The following code shows how to use the ServerTextControl class to create a new PDF document with an attached document in the internal TX Text Control format for editing. The files to attach are passed as an array through the EmbeddedFiles property of the SaveSettings object.

// Assumes: using System; using TXTextControl;
public string CreateNewDocument(byte[] document) {

  // generate a unique file name for the new container
  var DocumentName = Guid.NewGuid().ToString() + ".pdf";

  using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
    tx.Create();
    tx.Load(document, TXTextControl.BinaryStreamType.InternalUnicodeFormat);

    byte[] dataTx;

    // save the loaded document in the internal TX format
    tx.Save(out dataTx, TXTextControl.BinaryStreamType.InternalUnicodeFormat);

    // create an attachment
    EmbeddedFile embeddedFile = new EmbeddedFile("original.tx", dataTx, null);
    embeddedFile.Relationship = "Source";

    TXTextControl.SaveSettings saveSettings = new TXTextControl.SaveSettings() {
      EmbeddedFiles = new EmbeddedFile[] { embeddedFile }
    };

    // save a PDF with the attached Text Control document embedded
    // (StreamType.AdobePDFA is the PDF/A variant if archival conformance is required)
    tx.Save(DocumentName,
      TXTextControl.StreamType.AdobePDF,
      saveSettings);
  }

  return DocumentName;
}

Extract the Editable Document

The ExtractSmartDocument method loads a PDF/A-3 document and extracts the most recent original TX Text Control document from the attached files, which is returned as a byte array.

// Assumes: using System.Linq; using TXTextControl;
public byte[] ExtractSmartDocument(string DocumentName) {

  using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {

    tx.Create();

    // load the PDF document; the attachments are returned in loadSettings
    TXTextControl.LoadSettings loadSettings = new TXTextControl.LoadSettings();
    tx.Load(DocumentName,
      TXTextControl.StreamType.AdobePDF,
      loadSettings);

    // guard against a PDF without attachments
    if (loadSettings.EmbeddedFiles == null)
      return null;

    // walk the attachments from last to first and return
    // the most recently added original document
    foreach (EmbeddedFile file in loadSettings.EmbeddedFiles.Reverse()) {

      if (file.FileName == "original.tx")
        return (byte[])file.Data;
    }
  }

  return null;
}

Update the Container

The UpdateDocument method loads the smart PDF document, appends a new original document to the list of attachments, and saves the new version as the cover document. Because each new attachment is added to the end of the list (a timestamp is also stored automatically), the ExtractSmartDocument method above only needs to look for the last added document.

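// Assumes: using System.Collections.Generic; using System.Linq; using TXTextControl;
public string UpdateDocument(string DocumentName, byte[] document) {

  using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {
    tx.Create();

    // load the existing PDF container to read its attachments
    TXTextControl.LoadSettings loadSettings = new TXTextControl.LoadSettings();
    tx.Load(DocumentName,
      TXTextControl.StreamType.AdobePDF,
      loadSettings);

    // copy the existing attachments (empty list if there are none)
    List<EmbeddedFile> embeddedFiles =
      loadSettings.EmbeddedFiles?.ToList() ?? new List<EmbeddedFile>();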

    // create an attachment
    EmbeddedFile embeddedFile = new EmbeddedFile("original.tx", document, null);
    embeddedFile.Relationship = "Source";

    embeddedFiles.Add(embeddedFile);

    TXTextControl.SaveSettings saveSettings = new TXTextControl.SaveSettings() {
      EmbeddedFiles = embeddedFiles.ToArray()
    };

    // load the new version so that it becomes the cover document
    tx.Load(document, TXTextControl.BinaryStreamType.InternalUnicodeFormat);

    // save a PDF with the attached Text Control document embedded
    tx.Save(DocumentName,
      TXTextControl.StreamType.AdobePDF,
      saveSettings);
  }

  return DocumentName;
}
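ExtractSmartDocument relies on the order of the attachment list. If that order cannot be relied on, the stored timestamp could be compared instead. The sketch below is a standalone illustration under stated assumptions: it uses a hypothetical tuple (FileName, Created, Data) as a stand-in for the attachment list and does not call TX Text Control. The FindLatest helper and the sample dates are invented for the example, and mapping real EmbeddedFile objects onto this shape is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the attachments read from the container.
var files = new List<(string FileName, DateTime Created, byte[] Data)> {
  ("original.tx", new DateTime(2024, 1, 10), new byte[] { 1 }),
  ("notes.json",  new DateTime(2024, 3, 1),  new byte[] { 9 }),
  ("original.tx", new DateTime(2024, 2, 5),  new byte[] { 2 }),
};

// Return the data of the newest attachment with the given name,
// or null if no attachment with that name exists.
byte[] FindLatest(
    IEnumerable<(string FileName, DateTime Created, byte[] Data)> source,
    string name) {
  return source
    .Where(f => f.FileName == name)
    .OrderByDescending(f => f.Created)
    .Select(f => f.Data)
    .FirstOrDefault();
}

byte[] latest = FindLatest(files, "original.tx");
Console.WriteLine(latest[0]); // 2
```

Filtering by file name before sorting keeps unrelated attachments, such as the JSON entry in the sample data, from being mistaken for the latest editable version.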
