Programmatically Convert MS Word DOCX Documents to PDF in .NET C#

This article shows how to convert MS Word DOCX documents to PDF in .NET C# using the ServerTextControl component. The example shows how to load a DOCX document from a file or from a byte array in memory and how to save it as an Adobe PDF file.

Converting MS Word documents in the Office Open XML (DOCX), DOC, or RTF formats to PDF is a common task in a wide variety of scenarios. DOCX documents are often used as mail merge templates that are merged with data such as JSON to create invoices or other documents, and the results are ultimately converted to PDF for delivery to end users. For archiving purposes, documents can also be created as PDF/A, an ISO standard for long-term archiving.

These are some of the typical scenarios in which DOCX files are converted to PDF:

  • Finalized documents that are ready for distribution and that should remain consistent across devices and platforms.
  • Legal documents that need to be secured and protected from unauthorized changes.
  • Documents that are to be archived for long-term storage.
  • Documents that are to be printed.

TX Text Control is a document processing library for creating documents in a variety of formats and converting documents from supported formats to Adobe PDF and PDF/A. When creating a PDF document, it is often important to provide additional settings such as document access permissions, metadata, or encryption. TX Text Control can create PDFs from MS Word documents and apply these settings, including encryption and digital signatures.

The following tutorial shows how to convert a DOCX file to PDF using TX Text Control in a .NET 8 console application.

Preparing the Application

A .NET 8 console application is created for the purposes of this demo.

Prerequisites

The following tutorial requires a trial version of TX Text Control .NET Server.

  1. In Visual Studio, create a new Console App using .NET 8.

  2. In the Solution Explorer, select your created project and choose Manage NuGet Packages... from the Project main menu.

    Select Text Control Offline Packages from the Package source drop-down.

    Install the latest version of the following package:

    • TXTextControl.TextControl.ASP.SDK

Converting a DOCX File to PDF

After creating the console application and installing the required NuGet package, the following code shows how to convert a DOCX file to PDF using TX Text Control:

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", TXTextControl.StreamType.WordprocessingML);
    
    // Save the document as PDF
    tx.Save("Documents/result.pdf", TXTextControl.StreamType.AdobePDF);
}

To load a document from memory, the same Load method is used as for loading a document from a file. In this case, it is called with the byte array of the document and a BinaryStreamType value:

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    byte[] document = File.ReadAllBytes("Documents/sample.docx");
    byte[] pdf = null;
    
    // Load the document from the byte array
    tx.Load(document, TXTextControl.BinaryStreamType.WordprocessingML);
    
    // Save the document as PDF
    tx.Save(out pdf, TXTextControl.BinaryStreamType.AdobePDF);
}

The Save method then returns the PDF document as a byte array through its out parameter.
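
Before a byte array like this is written to disk or returned from a web endpoint, it can be useful to sanity-check that it looks like a PDF. The following is a minimal sketch and not part of the TX Text Control API: the class PdfSanityCheck and its method LooksLikePdf are hypothetical names chosen for this illustration. The check relies only on the fact that PDF files begin with the ASCII signature %PDF-.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of TX Text Control.
public static class PdfSanityCheck
{
    // PDF files start with the ASCII signature "%PDF-".
    private static readonly byte[] Signature = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true if the buffer is non-null and starts with the PDF signature.
    public static bool LooksLikePdf(byte[] data)
    {
        if (data == null || data.Length < Signature.Length)
        {
            return false;
        }

        // Compare only the leading bytes against the signature.
        return data.AsSpan(0, Signature.Length).SequenceEqual(Signature);
    }
}
```

For example, PdfSanityCheck.LooksLikePdf(pdf) could be called right after the Save call above, before the bytes are persisted with File.WriteAllBytes. This only checks the file signature and does not validate the rest of the document.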

Applying PDF Settings

TX Text Control provides an API to apply additional settings to the PDF document, such as document access permissions, metadata, or encryption. Using the SaveSettings class, you can define two passwords: the UserPassword for opening the document and the MasterPassword for the document's access permissions. These permissions are set using the DocumentAccessPermissions property, and several values can be combined.

These are the possible values:

Member                     Description
AllowAll                   After the document has been opened no further document access is restricted.
AllowAuthoring             Allows comments to be added and interactive form fields (including signature fields) to be filled in.
AllowAuthoringFields       Allows existing interactive form fields (including signature fields) to be filled in.
AllowContentAccessibility  Allows content access for the visually impaired only.
AllowDocumentAssembly      Allows the document to be assembled (insert, rotate or delete pages and create bookmarks or thumbnails).
AllowExtractContents       Allows text and/or graphics to be extracted.
AllowGeneralEditing        Allows the document contents to be modified.
AllowHighLevelPrinting     Allows the document to be printed.
AllowLowLevelPrinting      Allows the document to be printed (low-level).

The following code sets both passwords and restricts the document access permissions to low-level printing and content extraction:

using TXTextControl;

using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", TXTextControl.StreamType.WordprocessingML);

    SaveSettings saveSettings = new SaveSettings()
    {
        MasterPassword = "Master",
        UserPassword = "User",
        DocumentAccessPermissions = DocumentAccessPermissions.AllowLowLevelPrinting |
                                    DocumentAccessPermissions.AllowExtractContents
    };

    // Save the document as PDF
    tx.Save("Documents/result.pdf", TXTextControl.StreamType.AdobePDF, saveSettings);
}
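
The passwords in the sample above are hard-coded for brevity. As a rough sketch of one alternative, the following helper reads them from environment variables instead. The class name PdfSecrets, the method Require, and the variable names PDF_MASTER_PASSWORD and PDF_USER_PASSWORD are assumptions made for this illustration. Only Environment.GetEnvironmentVariable is an actual .NET call, and nothing here is part of TX Text Control.

```csharp
using System;

// Hypothetical helper for illustration; not part of TX Text Control.
public static class PdfSecrets
{
    // Reads a required secret from an environment variable and
    // fails fast when the variable is missing or empty.
    public static string Require(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);

        if (string.IsNullOrEmpty(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }

        return value;
    }
}

// Possible usage inside the SaveSettings initializer from the sample above
// (the variable names are assumptions):
//
//   MasterPassword = PdfSecrets.Require("PDF_MASTER_PASSWORD"),
//   UserPassword   = PdfSecrets.Require("PDF_USER_PASSWORD"),
```

Failing fast on a missing variable avoids silently creating a PDF with an empty password.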

Adding Digital Signatures

TX Text Control can be used to create Adobe PDF and PDF/A documents with digital signatures. These signatures can be created with PFX, DER CER, or Base64 CER certificate files. All you need is a valid certificate that is assigned to the DigitalSignature property of the SaveSettings:

using System.Security.Cryptography.X509Certificates;
using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    X509Certificate2 cert = new X509Certificate2("test.pfx", "123");

    SaveSettings saveSettings = new SaveSettings(){
        DigitalSignature = new DigitalSignature(cert, null)
    };
    
    // Save the document as PDF
    tx.Save("Documents/result.pdf", StreamType.AdobePDF, saveSettings);
}

After setting the digital signature, the document is saved as a PDF file with the digital signature included.
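
Signing requires the certificate's private key, and an expired certificate typically leads to validation warnings in PDF viewers. The following minimal sketch checks both before a certificate is handed to the SaveSettings. The class SigningCertificateCheck and its method IsUsableForSigning are hypothetical names for this illustration; the sketch uses only the standard X509Certificate2 properties HasPrivateKey, NotBefore, and NotAfter.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper for illustration; not part of TX Text Control.
public static class SigningCertificateCheck
{
    // Returns true if the certificate carries a private key and
    // the given point in time lies within its validity period.
    public static bool IsUsableForSigning(X509Certificate2 cert, DateTime now)
    {
        if (cert == null)
        {
            throw new ArgumentNullException(nameof(cert));
        }

        // A signature cannot be created without the private key.
        if (!cert.HasPrivateKey)
        {
            return false;
        }

        // NotBefore and NotAfter are reported in local time, so the
        // comparison value is converted to local time as well.
        DateTime localNow = now.ToLocalTime();
        return localNow >= cert.NotBefore && localNow <= cert.NotAfter;
    }
}
```

In the sample above, SigningCertificateCheck.IsUsableForSigning(cert, DateTime.UtcNow) could be checked right after the certificate is loaded. It does not verify the certificate chain or revocation status.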

Creating PDF/A

TX Text Control can be used to create PDF/A documents. PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for use in the archiving and long-term preservation of electronic documents. PDF/A differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking (instead of font embedding) and encryption.

Creating a PDF/A document only requires selecting the PDF/A format when saving the document:

tx.Save("Documents/result.pdf", StreamType.AdobePDFA);

Like PDF, PDF/A is a platform-independent format that encapsulates all the information needed to render the document. For example, when creating PDF/A documents, only embeddable fonts are allowed because all fonts must be embedded. The following features are not permitted:

  • Audio and video content
  • JavaScript
  • Encryption
  • Embedded files (allowed in PDF/A-3 which is also supported by TX Text Control)
  • Transparency (prohibited in PDF/A-1, permitted from PDF/A-2 onward)
  • Font linking

PDF/A requires all fonts to be embedded in the document. So what happens if a font that cannot be embedded is used and the document is exported to PDF/A? TX Text Control automatically replaces the font with a similar font that can be embedded. This ensures that the document looks the same when opened on any device. This process can also be controlled by explicitly choosing the replacement font.

The following code handles the AdaptFont event, looks for a specific unsupported font name, and replaces it with Arial:

using TXTextControl;

using (ServerTextControl tx = new ServerTextControl())
{
    tx.Create();

    tx.FontSettings.EmbeddableFontsOnly = true;
    // The AdaptFont event is only raised when AdaptFontEvent is enabled
    tx.FontSettings.AdaptFontEvent = true;
    tx.AdaptFont += Tx_AdaptFont;

    // Load the document from a file
    tx.Load("Documents/sample.docx", StreamType.WordprocessingML);

    // Save the document as PDF
    tx.Save("Documents/result.pdf", StreamType.AdobePDFA);
}

// Raised when a font has to be adapted; maps the unsupported font to Arial
void Tx_AdaptFont(object sender, TXTextControl.AdaptFontEventArgs e)
{
    if (e.FontName == "Celtic Garamond the 2nd")
    {
        e.AdaptedFontName = "Arial";
    }
}

When the document is saved as PDF/A, text formatted with the unsupported font uses Arial instead.
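
If more than one font needs an explicit replacement, the comparison in the event handler can be moved into a lookup table. The following is a minimal sketch under stated assumptions: the class FontFallbacks and its method TryResolve are hypothetical names, and the second table entry is a placeholder rather than a real font recommendation. Only the FontName and AdaptedFontName properties mentioned in the comment come from the sample above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of TX Text Control.
public static class FontFallbacks
{
    // Maps unsupported font names to replacements (case-insensitive).
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["Celtic Garamond the 2nd"] = "Arial",
            ["Example Restricted Font"] = "Times New Roman" // placeholder entry
        };

    // Returns true and the replacement name if one is configured.
    public static bool TryResolve(string fontName, out string replacement)
    {
        replacement = string.Empty;

        if (string.IsNullOrEmpty(fontName))
        {
            return false;
        }

        if (Map.TryGetValue(fontName, out var mapped))
        {
            replacement = mapped;
            return true;
        }

        return false;
    }
}

// Possible usage inside the Tx_AdaptFont handler from the sample above:
//
//   if (FontFallbacks.TryResolve(e.FontName, out string replacement))
//   {
//       e.AdaptedFontName = replacement;
//   }
```

Fonts without a table entry are left untouched, so the automatic replacement described earlier still applies to them.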

Conclusion

When converting DOCX files to PDF, it is important to consider the requirements of the final document. TX Text Control provides an API to convert DOCX files to PDF and to apply additional settings such as document access permissions, metadata, or encryption. The library can also be used to create PDF/A documents that are suitable for long-term archiving.
