Converting MS Word documents in the Office Open XML (DOCX), DOC or RTF formats to PDF is a common task in a wide variety of scenarios. For example, DOCX documents used as mail merge templates are merged with data such as JSON to create invoices or other documents, and the results are ultimately converted to PDF for delivery to end users. For archiving purposes, documents can be created as PDF/A, an ISO standard for long-term archiving.

These are some of the typical scenarios in which DOCX files are converted to PDF:

  • Finalized documents that are ready for distribution and that should remain consistent across devices and platforms.
  • Legal documents that need to be secured and protected from unauthorized changes.
  • Documents that are to be archived for long-term storage.
  • Documents that are to be printed.

TX Text Control is a powerful document processing library for creating documents in a variety of formats and converting documents from supported formats to Adobe PDF and PDF/A. When creating the PDF document, it is also very important to be able to provide additional settings such as document access permissions, metadata or encryption. With TX Text Control, you have a tool that can easily create PDFs from MS Word documents and apply all the typical settings and legal requirements, including encryption and digital signatures.

This tutorial shows how to convert a DOCX file to PDF using TX Text Control in a .NET 8 console application.

Preparing the Application

A .NET 8 console application is created for the purposes of this demo.

Prerequisites

The following tutorial requires a trial version of TX Text Control .NET Server for ASP.NET.

  1. In Visual Studio, create a new Console App using .NET 8.

  2. In the Solution Explorer, select your created project and choose Manage NuGet Packages... from the Project main menu.

    Select Text Control Offline Packages from the Package source drop-down.

    Install the latest version of the following package:

    • TXTextControl.TextControl.ASP.SDK


Converting a DOCX File to PDF

After creating the console application and installing the required NuGet package, the following code shows how to convert a DOCX file to PDF using TX Text Control:

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", TXTextControl.StreamType.WordprocessingML);

    // Save the document as PDF
    tx.Save("Documents/result.pdf", TXTextControl.StreamType.AdobePDF);
}
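
The introduction also names DOC and RTF as input formats. The following is a minimal sketch of how the stream type could be chosen from the file extension. The helper GetStreamType and the file Documents/sample.rtf are hypothetical and not part of TX Text Control; the sketch assumes the StreamType members MSWord and RichTextFormat for DOC and RTF input.

```csharp
using System;
using System.IO;
using TXTextControl;

// Hypothetical helper (not part of TX Text Control):
// maps a file extension to the stream type used for loading.
static StreamType GetStreamType(string path) =>
    Path.GetExtension(path).ToLowerInvariant() switch
    {
        ".docx" => StreamType.WordprocessingML,
        ".doc" => StreamType.MSWord,           // assumed member for binary DOC files
        ".rtf" => StreamType.RichTextFormat,   // assumed member for RTF files
        _ => throw new NotSupportedException($"Unsupported input format: {path}")
    };

// Placeholder input file
string input = "Documents/sample.rtf";

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Load the document with the stream type that matches its extension
    tx.Load(input, GetStreamType(input));

    // Save the PDF next to the input file
    tx.Save(Path.ChangeExtension(input, ".pdf"), StreamType.AdobePDF);
}
```

Failing early on an unknown extension keeps unsupported files from reaching the Load call.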

Loading a document from memory works the same way as loading it from a file. The Load method is called with the byte array of the document:

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    byte[] document = File.ReadAllBytes("Documents/sample.docx");
    byte[] pdf = null;

    // Load the document from the byte array
    tx.Load(document, TXTextControl.BinaryStreamType.WordprocessingML);

    // Save the document as PDF into a byte array
    tx.Save(out pdf, TXTextControl.BinaryStreamType.AdobePDF);
}

The PDF document is then created as a byte array using the Save method with the PDF format.
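
In a web application or service, the in-memory variant is often wrapped in a small method. The following sketch shows one way to do that; the helper ConvertDocxToPdf is hypothetical and only uses the Load and Save overloads from the example above.

```csharp
using System.IO;
using TXTextControl;

// Hypothetical helper (not part of TX Text Control):
// converts the bytes of a DOCX document into the bytes of a PDF document.
static byte[] ConvertDocxToPdf(byte[] docx)
{
    using (ServerTextControl tx = new ServerTextControl())
    {
        tx.Create();

        // Load the DOCX document from memory
        tx.Load(docx, BinaryStreamType.WordprocessingML);

        // Save the document as PDF into a byte array
        tx.Save(out byte[] pdf, BinaryStreamType.AdobePDF);
        return pdf;
    }
}

// Usage: read a DOCX file, convert it in memory and write the resulting PDF
byte[] result = ConvertDocxToPdf(File.ReadAllBytes("Documents/sample.docx"));
File.WriteAllBytes("Documents/result.pdf", result);
```

The returned byte array can also be sent directly in an HTTP response or stored in a database without touching the file system.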

Applying PDF Settings

TX Text Control provides a very powerful API to apply additional settings to the PDF document such as document access permissions, metadata or encryption. Using the SaveSettings class, you can define two passwords: the UserPassword for opening the document and the MasterPassword for the document's access permissions. These permissions can be set using the DocumentAccessPermissions property.

These are the possible values:

Member                     Description
AllowAll                   After the document has been opened no further document access is restricted.
AllowAuthoring             Allows comments to be added and interactive form fields (including signature fields) to be filled in.
AllowAuthoringFields       Allows existing interactive form fields (including signature fields) to be filled in.
AllowContentAccessibility  Allows content access for the visually impaired only.
AllowDocumentAssembly      Allows the document to be assembled (insert, rotate or delete pages and create bookmarks or thumbnails).
AllowExtractContents       Allows text and/or graphics to be extracted.
AllowGeneralEditing        Allows the document contents to be modified.
AllowHighLevelPrinting     Allows the document to be printed.
AllowLowLevelPrinting      Allows the document to be printed (low-level).

The following code sets the document access permissions to allow low-level printing and content extraction:

using TXTextControl;

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", TXTextControl.StreamType.WordprocessingML);

    SaveSettings saveSettings = new SaveSettings()
    {
        MasterPassword = "Master",
        UserPassword = "User",
        DocumentAccessPermissions = DocumentAccessPermissions.AllowLowLevelPrinting |
                                    DocumentAccessPermissions.AllowExtractContents
    };

    // Save the document as PDF
    tx.Save("Documents/result.pdf", TXTextControl.StreamType.AdobePDF, saveSettings);
}
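
The same SaveSettings object can carry the metadata mentioned at the beginning of this section. The following is a minimal sketch that assumes SaveSettings exposes the properties Author, DocumentTitle, DocumentSubject, DocumentKeywords and CreatorApplication; verify these names against the SaveSettings class reference of your installed version. All values are placeholders.

```csharp
using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Assumed metadata properties of SaveSettings; the values are placeholders
    SaveSettings saveSettings = new SaveSettings()
    {
        Author = "Jane Doe",
        DocumentTitle = "Invoice 2024-001",
        DocumentSubject = "Invoice",
        DocumentKeywords = new string[] { "invoice", "sample" },
        CreatorApplication = "Sample Console App"
    };

    // Save the document as PDF including the metadata
    tx.Save("Documents/result.pdf", StreamType.AdobePDF, saveSettings);
}
```

Metadata and access permissions are set on the same object, so both can be combined in a single Save call.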

Adding Digital Signatures

TX Text Control can be used to create Adobe PDF and PDF/A documents with digital signatures. These signatures can be created with PFX, DER CER or Base64 CER certificate files. All you need is a valid certificate that is defined in the SaveSettings class.

using System.Security.Cryptography.X509Certificates;
using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Load the certificate that is used to sign the document
    X509Certificate2 cert = new X509Certificate2("test.pfx", "123");

    SaveSettings saveSettings = new SaveSettings()
    {
        DigitalSignature = new DigitalSignature(cert, null)
    };

    // Save the document as PDF
    tx.Save("Documents/result.pdf", StreamType.AdobePDF, saveSettings);
}

After setting the digital signature, the document is saved as a PDF file with the digital signature included.
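
Before a certificate is passed to the DigitalSignature constructor, it can be useful to check that it is actually usable for signing. The following sketch uses only the .NET X509Certificate2 class and assumes that the certificate is loaded from a PFX file as in the example above; the file name and password are the same placeholders.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

// Load the certificate (placeholder file name and password)
X509Certificate2 cert = new X509Certificate2("test.pfx", "123");

// Creating a signature requires the private key of the certificate
if (!cert.HasPrivateKey)
{
    throw new InvalidOperationException(
        "The certificate does not contain a private key and cannot be used for signing.");
}

// NotBefore and NotAfter are returned in local time
DateTime now = DateTime.Now;
if (now < cert.NotBefore || now > cert.NotAfter)
{
    throw new InvalidOperationException(
        $"The certificate is only valid from {cert.NotBefore} to {cert.NotAfter}.");
}

// The certificate can now be used as shown above:
// new DigitalSignature(cert, null)
```

These checks do not validate the certificate chain; they only catch the most common causes of unusable signatures early.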

Creating PDF/A

TX Text Control can be used to create PDF/A documents. PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for use in the archiving and long-term preservation of electronic documents. PDF/A differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking (instead of font embedding) and encryption.

To create a PDF/A document, select the PDF/A format when saving the document:

tx.Save("Documents/result.pdf", StreamType.AdobePDFA);

Like PDF, PDF/A is a platform-independent format that encapsulates all the information needed to render the document. For example, when creating PDF/A documents, only embeddable fonts are allowed because all fonts must be embedded. The following features are not permitted:

  • Audio and video content
  • JavaScript
  • Encryption
  • Embedded files (allowed in PDF/A-3 which is also supported by TX Text Control)
  • Transparency (prohibited in PDF/A-1, permitted in PDF/A-2 and later)
  • Font linking
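
Because encryption is not permitted, a PDF/A document is saved without the UserPassword and MasterPassword settings. Digital signatures, in contrast, can be combined with PDF/A as described in the previous section. The following sketch reuses the calls from the earlier examples; the output file name is a placeholder.

```csharp
using System.Security.Cryptography.X509Certificates;
using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Load the signing certificate (placeholder file name and password)
    X509Certificate2 cert = new X509Certificate2("test.pfx", "123");

    // No passwords are set here because PDF/A does not permit encryption
    SaveSettings saveSettings = new SaveSettings()
    {
        DigitalSignature = new DigitalSignature(cert, null)
    };

    // Save the document as a digitally signed PDF/A
    tx.Save("Documents/result-archive.pdf", StreamType.AdobePDFA, saveSettings);
}
```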

PDF/A requires all fonts to be embedded in the document. So what happens if a font that cannot be embedded is used and the document is exported to PDF/A? TX Text Control automatically replaces it with a similar embeddable font, which ensures that the document looks the same on any device. You can also control this process by handling the AdaptFont event and choosing the replacement font yourself.

The following code looks for the unsupported font name and replaces it with Arial:

using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Use only fonts that can be embedded
    tx.FontSettings.EmbeddableFontsOnly = true;

    // Raise the AdaptFont event whenever a font is replaced
    tx.FontSettings.AdaptFontEvent = true;
    tx.AdaptFont += Tx_AdaptFont;

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Save the document as PDF/A
    tx.Save("Documents/result.pdf", StreamType.AdobePDFA);
}

void Tx_AdaptFont(object sender, TXTextControl.AdaptFontEventArgs e)
{
    // Replace the unsupported font with Arial
    if (e.FontName == "Celtic Garamond the 2nd")
    {
        e.AdaptedFontName = "Arial";
    }
}

When the document is saved, the unsupported font is replaced with Arial in the resulting PDF/A document.
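
If several fonts need a specific replacement, the handler can look them up in a table instead of comparing single names. The following is a minimal sketch under two assumptions: the dictionary and the second font name are hypothetical, and setting FontSettings.AdaptFontEvent to true is assumed to enable the AdaptFont event.

```csharp
using System.Collections.Generic;
using TXTextControl;

// Hypothetical replacement table: unsupported font name -> replacement font
Dictionary<string, string> fontReplacements = new()
{
    ["Celtic Garamond the 2nd"] = "Arial",
    ["Hypothetical Script"] = "Times New Roman"
};

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Use only fonts that can be embedded
    tx.FontSettings.EmbeddableFontsOnly = true;

    // Assumption: true enables the AdaptFont event
    tx.FontSettings.AdaptFontEvent = true;

    tx.AdaptFont += (sender, e) =>
    {
        // Use the configured replacement if one exists;
        // otherwise keep the font chosen by TX Text Control
        if (fontReplacements.TryGetValue(e.FontName, out var replacement))
        {
            e.AdaptedFontName = replacement;
        }
    };

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Save the document as PDF/A
    tx.Save("Documents/result.pdf", StreamType.AdobePDFA);
}
```

Fonts that are not listed in the table fall back to the automatic replacement described above.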

Conclusion

When converting DOCX files to PDF, it is important to consider the requirements of the final document. TX Text Control provides a very powerful API to convert DOCX files to PDF and to apply additional settings such as document access permissions, metadata or encryption. The library can also be used to create PDF/A documents that are suitable for long-term archiving.