Different Ways to Store Additional Information in Documents

TX Text Control provides several ways to store additional information in documents including document properties, custom properties and complete embedded documents. This article gives an overview of the various options and typical applications.

Besides the actual content, a document can contain additional data: meta information such as the document title, tags and author, custom key-value pairs and complete embedded documents. TX Text Control provides access to all of these settings so that this additional information can be read from and written to documents.

Meta Information

Document properties, often known as meta data, are details about the document that describe or identify it. They can include information such as the document title, author name, modification date, subject and specific keywords to identify or categorize the content.

In TX Text Control, meta data can be added using the SaveSettings object that is passed as a parameter to the Save method.

The following properties can be used:

| Property | Value type | Description |
| --- | --- | --- |
| Author | String | Sets the document's author, which is saved in the document. |
| CreationDate | DateTime | Sets the document's creation date, which is saved in the document. |
| CreatorApplication | String | Sets the application that created the document. |
| DocumentKeywords | String[] | Sets the document's keywords, which are saved in the document. |
| DocumentSubject | String | Sets the document's subject string, which is saved in the document. |
| DocumentTitle | String | Sets the document's title, which is saved in the document. |
| LastModificationDate | DateTime | Sets the date the document was last modified. |

The following code shows how to use the SaveSettings to add meta data to an Office Open XML (DOCX) document:

TXTextControl.SaveSettings saveSettings =
    new TXTextControl.SaveSettings() {
        Author = "Susan Paul",
        CreationDate = DateTime.Now,
        CreatorApplication = "TX Text Control",
        DocumentKeywords = new string[] { "Summary", "Report" },
        DocumentSubject = "Detailed summary report 2022",
        DocumentTitle = "Sales Report 2022"
    };

textControl1.Save("output.docx", 
                  TXTextControl.StreamType.WordprocessingML, saveSettings);

In the Windows document properties dialog, the exported meta data appears in the property grid:

Document Properties

If the document is exported to PDF, the data can be found in the document properties dialog in Adobe Acrobat Reader:

Document Properties
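The exported values can also be checked without opening a dialog. A DOCX file is a ZIP package, and its core properties are conventionally stored in the part docProps/core.xml. The following is a minimal sketch under those assumptions, not part of the TX Text Control API: the class name DocxCoreProperties and its Read method are hypothetical helpers, and the fixed part path is an assumption (formally, the part is located through the package relationships).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

// Hypothetical helper: reads a few core properties from a DOCX package.
public static class DocxCoreProperties
{
    // XML namespaces used by the core properties part.
    private static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";
    private static readonly XNamespace Cp =
        "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";

    // Returns the stored values; a value is null when the element is missing.
    public static (string Title, string Author, string Subject, string Keywords)
        Read(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the core properties live at the conventional path.
            ZipArchiveEntry entry = zip.GetEntry("docProps/core.xml");
            if (entry == null)
                return (null, null, null, null);

            using (Stream part = entry.Open())
            {
                XElement root = XDocument.Load(part).Root;
                return ((string)root.Element(Dc + "title"),
                        (string)root.Element(Dc + "creator"),
                        (string)root.Element(Dc + "subject"),
                        (string)root.Element(Cp + "keywords"));
            }
        }
    }
}
```

A call such as DocxCoreProperties.Read(File.OpenRead("output.docx")) could then be used in an automated test to compare the returned title and author with the values assigned to the SaveSettings object. How the keyword array is serialized into the single keywords string depends on the writing application, so that value is returned as stored.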

Custom Properties

Custom properties can be created to store additional information about the document. These properties remain with a document and can be viewed by all MS Word users who open the document. Several property management servers provide data tracking capabilities to search for, sort, and track documents based on document properties.

In TX Text Control, these properties can be accessed and created using the UserDefinedPropertyDictionary class.

The UserDefinedPropertyDictionary class is used with the LoadSettings.UserDefinedDocumentProperties and SaveSettings.UserDefinedDocumentProperties properties. Each entry in the dictionary is a key/value pair, where the key is the name of the document property and the value is the document property's value.

The following code shows how to add new custom properties to an MS Word document:

TXTextControl.UserDefinedPropertyDictionary userSettings =
    new TXTextControl.UserDefinedPropertyDictionary();

userSettings.Add("CustomPropertyString", "Custom property string");
userSettings.Add("CustomPropertyInt32", 123);
userSettings.Add("CustomPropertyBoolTrue", true);
userSettings.Add("CustomPropertyBoolFalse", false);
userSettings.Add("CustomPropertyDouble", 123.232);

TXTextControl.SaveSettings saveSettings = new TXTextControl.SaveSettings() {
    UserDefinedDocumentProperties = userSettings
};

textControl1.Save("output.docx",
                  TXTextControl.StreamType.WordprocessingML,
                  saveSettings);

The advanced properties dialog in MS Word lists these key/value pairs:

Document Properties
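The reverse direction, reading custom properties back, goes through LoadSettings.UserDefinedDocumentProperties as mentioned above. The following sketch only covers listing the entries once they are available. PropertyDump and Describe are hypothetical names, and the helper assumes the dictionary can be passed as a non-generic System.Collections.IDictionary. That assumption should be checked against the UserDefinedPropertyDictionary reference before relying on it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper: lists custom properties with their runtime types.
public static class PropertyDump
{
    // Formats each entry as "name (TypeName) = value".
    public static List<string> Describe(IDictionary properties)
    {
        var lines = new List<string>();

        // Enumerating a non-generic IDictionary yields DictionaryEntry items.
        foreach (DictionaryEntry entry in properties)
        {
            string typeName = entry.Value?.GetType().Name ?? "null";
            lines.Add($"{entry.Key} ({typeName}) = {entry.Value}");
        }

        return lines;
    }
}
```

With TX Text Control, the dictionary would come from a LoadSettings instance that is passed to the Load method together with the file name and stream type. After the call returns, its UserDefinedDocumentProperties property could be handed to PropertyDump.Describe. Printing the type name makes it easy to see whether a value such as 123 was stored as an integer or as a string.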

Embedding Documents

PDF/A-3 permits the embedding of files in any format. PDF/A-3 documents allow the progression from electronic paper to an electronic container that holds the human and machine-readable versions of a document. Applications can extract the machine-readable portion of the PDF document in order to process it. A PDF/A-3 document can contain an unlimited number of embedded documents for different processes.

Learn more: The article "PDF/A-3: The Better Container for Electronic Documents" gives an overview of the advantages of PDF/A-3 as an electronic container.

The following sample code shows how TX Text Control can be used to attach a text file to a PDF document:

// create a non-UI ServerTextControl instance
using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl()) {

  tx.Create();
  // set dummy content
  tx.Text = "PDF Document Content";

  // read the content of the attachment
  string sAttachment = System.IO.File.ReadAllText("attachment.txt");

  // create the attachment
  TXTextControl.EmbeddedFile attachment =
     new TXTextControl.EmbeddedFile(
        "attachment.txt",
        sAttachment,
        null) {
       Description = "My Text File",
       Relationship = "Unspecified",
       MIMEType = "text/plain",
       CreationDate = DateTime.Now,
     };

  // attach the embedded file
  tx.DocumentSettings.EmbeddedFiles =
     new TXTextControl.EmbeddedFile[] { attachment };

  // save as PDF/A
  tx.Save("document.pdf", TXTextControl.StreamType.AdobePDFA);
}

The EmbeddedFile object represents the attachment that is embedded in the PDF document. Besides the content and the file name, additional meta data can be provided: the MIME type of the attachment (text/plain is the registered type for plain text files), a textual description, a relationship and the creation date.

The relationship is an optional string describing the relationship of the embedded file and the containing document. It can be a predefined value or should follow the rules for second-class names (ISO 32000-1, Annex E). Predefined values are "Source", "Data", "Alternative", "Supplement" or "Unspecified".

When opening the document in Adobe Acrobat Reader, the attachment can be found in the Attachments side-panel.

PDF Attachments
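Because the relationship is a plain string, a typo such as "source" instead of "Source" would not be caught by the compiler. A small guard can check a value against the predefined names listed above before it is assigned. This is a minimal sketch: EmbeddedFileRelationships and IsPredefined are hypothetical names, and the check deliberately does not validate second-class names, which may also be legal values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: checks a relationship string against the predefined values.
public static class EmbeddedFileRelationships
{
    // PDF names are case-sensitive, so an ordinal comparison is used.
    private static readonly HashSet<string> Predefined =
        new HashSet<string>(StringComparer.Ordinal)
        {
            "Source", "Data", "Alternative", "Supplement", "Unspecified"
        };

    // Returns true only for one of the five predefined relationship names.
    // A false result does not mean the value is invalid: it may still be
    // a second-class name, which this sketch does not examine.
    public static bool IsPredefined(string relationship)
    {
        return relationship != null && Predefined.Contains(relationship);
    }
}
```

A caller could log a warning when IsPredefined returns false, so that unintended values are noticed while custom names remain possible.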

Conclusion

A document is more than its content. Additional meta data, custom properties and embedded documents help processes and workflows to find and categorize documents and to process documents automatically by embedding machine-readable content. TX Text Control provides the required functionality to create digital documents for a complete document automation process.
