# Converting PDF Documents to Plain Text TXT in C#

> The extraction of plain text from PDF documents for further processing, analysis or full text search is a common task. This article explains how to extract plain text from PDF documents programmatically with .NET C# and TX Text Control.

- **Author:** Bjoern Meyer
- **Published:** 2023-11-16
- **Modified:** 2025-11-16
- **Description:** The extraction of plain text from PDF documents for further processing, analysis or full text search is a common task. This article explains how to extract plain text from PDF documents programmatically with .NET C# and TX Text Control.
- **2 min read** (342 words)
- **Tags:**
  - ASP.NET
  - PDF
  - Plain Text
  - Conversion
- **LLMs.txt URL:** https://www.textcontrol.com/blog/2023/11/16/converting-pdf-documents-to-plain-text-txt-in-csharp/llms.txt
- **LLMs-full.txt URL:** https://www.textcontrol.com/blog/2023/11/16/converting-pdf-documents-to-plain-text-txt-in-csharp/llms-full.txt
- **Canonical URL:** https://www.textcontrol.com/blog/2023/11/16/converting-pdf-documents-to-plain-text-txt-in-csharp/

---

Extracting plain text from PDF documents for further processing, analyzing or searching is a common task. Typically, a PDF document contains a collection of characters at specific locations, and a filter is required to import the text to be extracted.

With TX Text Control, all typical word processing formats such as DOC, DOCX, RTF and PDF can be loaded for plain text extraction. The following code shows how to create a simple console application that loads a PDF document and then extracts the plain text from it.

### Preparing the Application

A .NET 6 console application is created for the purposes of this demo.

> ### Prerequisites
> 
>  The following tutorial requires a trial version of TX Text Control .NET Server.
> 
> - [Download Trial Version](https://www.textcontrol.com/product/tx-text-control-dotnet-server/download/)

1. In Visual Studio, create a new *Console App* using .NET 6.
2. In the *Solution Explorer*, select your created project and choose *Manage NuGet Packages...* from the *Project* main menu.
    
    Select *Text Control Offline Packages* from the *Package source* drop-down.
    
    Install the latest versions of the following package:
    
    
    - TXTextControl.TextControl.ASP.SDK
    
    ![Create PDF](https://s1-www.textcontrol.com/assets/dist/blog/2023/11/16/b/assets/step1.webp "Create PDF")

### Adding a PDF

3. Create a folder named *App\_Data* in the root of your project. Copy your PDF documents into this folder. In this example, the name of the PDF document that will be loaded is *sample.pdf*.

### Adding the Code

4. Open the *Program.cs* file and add the following code:
    
    ```
    using (TXTextControl.ServerTextControl tx = new TXTextControl.ServerTextControl())
    {
    tx.Create();
    
    TXTextControl.LoadSettings ls = new TXTextControl.LoadSettings()
    {
    PDFImportSettings = TXTextControl.PDFImportSettings.GenerateParagraphs
    };
    
    // load PDF document
    tx.Load("App_Data/sample.pdf", TXTextControl.StreamType.AdobePDF, ls);
    
    // retrieve plain text
    var text = tx.Text;
    
    Console.WriteLine(text);
    }
    ```

Alternatively, if the document is stored in a database or by some other method, you can load the document from a byte array.

```
byte[] document = File.ReadAllBytes("App_Data/sample.pdf");
tx.Load(document, TXTextControl.BinaryStreamType.AdobePDF, ls);
```

TX Text Control recognizes matching words to create paragraphs and returns plain text written to the console.

---

## About Bjoern Meyer

As CEO, Bjoern is the visionary behind our strategic direction and business development, bridging the gap between our customers and engineering teams. His deep passion for coding and web technologies drives the creation of innovative products. If you're at a tech conference, be sure to stop by our booth - you'll most likely meet Bjoern in person. With an advanced graduate degree (Dipl. Inf.) in Computer Science, specializing in AI, from the University of Bremen, Bjoern brings significant expertise to his role. In his spare time, Bjoern enjoys running, paragliding, mountain biking, and playing the piano.

- [LinkedIn](https://www.linkedin.com/in/bjoernmeyer/)
- [X](https://x.com/txbjoern)
- [GitHub](https://github.com/bjoerntx)

---

## Related Posts

- [PDF Conversion in .NET: Convert DOCX, HTML and more with C#](https://www.textcontrol.com/blog/2025/08/05/pdf-conversion-in-dotnet-convert-docx-html-and-more-with-csharp/llms.txt)
- [Why Structured E-Invoices Still Need Tamper Protection using C# and .NET](https://www.textcontrol.com/blog/2026/03/24/why-structured-e-invoices-still-need-tamper-protection-using-csharp-and-dotnet/llms.txt)
- [Create Fillable PDFs from HTML Forms in C# ASP.NET Core Using a WYSIWYG Template](https://www.textcontrol.com/blog/2026/03/17/create-fillable-pdfs-from-html-forms-in-csharp-aspnet-core-using-a-wysiwyg-template/llms.txt)
- [Why HTML to PDF Conversion is Often the Wrong Choice for Business Documents in C# .NET](https://www.textcontrol.com/blog/2026/03/13/why-html-to-pdf-conversion-is-often-the-wrong-choice-for-business-documents-in-csharp-dot-net/llms.txt)
- [A Complete Guide to Converting Markdown to PDF in .NET C#](https://www.textcontrol.com/blog/2026/01/07/a-complete-guide-to-converting-markdown-to-pdf-in-dotnet-csharp/llms.txt)
- [Why PDF Creation Belongs at the End of the Business Process](https://www.textcontrol.com/blog/2026/01/02/why-pdf-creation-belongs-at-the-end-of-the-business-process/llms.txt)
- [Designing the Perfect PDF Form with TX Text Control in .NET C#](https://www.textcontrol.com/blog/2025/12/16/designing-the-perfect-pdf-form-with-tx-text-control-in-dotnet-csharp/llms.txt)
- [Why Defining MIME Types for PDF/A Attachments Is Essential](https://www.textcontrol.com/blog/2025/12/10/why-defining-mime-types-for-pdfa-attachments-is-essential/llms.txt)
- [Validate Digital Signatures and the Integrity of PDF Documents in C# .NET](https://www.textcontrol.com/blog/2025/11/14/validate-digital-signatures-and-the-integrity-of-pdf-documents-in-csharp-dotnet/llms.txt)
- [Validate PDF/UA Documents and Verify Electronic Signatures in C# .NET](https://www.textcontrol.com/blog/2025/11/13/validate-pdf-ua-documents-and-verify-electronic-signatures-in-csharp-dotnet/llms.txt)
- [How To Choose the Right C# PDF Generation Library: Developer Checklist](https://www.textcontrol.com/blog/2025/11/12/how-to-choose-the-right-csharp-pdf-generation-library-developer-checklist/llms.txt)
- [Why Digitally Signing your PDFs is the Only Reliable Way to Prevent Tampering](https://www.textcontrol.com/blog/2025/10/30/why-digitally-signing-your-pdfs-is-the-only-reliable-way-to-prevent-tampering/llms.txt)
- [Automating PDF/UA Accessibility with AI: Describing DOCX Documents Using TX Text Control and LLMs](https://www.textcontrol.com/blog/2025/10/16/automating-pdf-ua-accessibility-with-ai-describing-docx-documents-using-tx-text-control-and-llms/llms.txt)
- [Converting Office Open XML (DOCX) to PDF in Java](https://www.textcontrol.com/blog/2025/10/14/converting-office-open-xml-docx-to-pdf-in-java/llms.txt)
- [Document SDK Comparison: Complete Document Processing vs. PDF SDK](https://www.textcontrol.com/blog/2025/10/10/document-sdk-comparison-complete-document-processing-vs-pdf-sdk/llms.txt)
- [Extending DS Server with Custom Digital Signature APIs](https://www.textcontrol.com/blog/2025/10/09/extending-ds-server-with-custom-digital-signature-apis/llms.txt)
- [Why PDF/UA and PDF/A-3a Matter: Accessibility, Archiving, and Legal Compliance](https://www.textcontrol.com/blog/2025/10/07/why-pdf-ua-and-pdf-a-3a-matter-accessibility-archiving-and-legal-compliance/llms.txt)
- [Convert Markdown to PDF in a Console Application on Linux and Windows](https://www.textcontrol.com/blog/2025/09/23/convert-markdown-to-pdf-in-a-console-application-on-linux-and-windows/llms.txt)
- [Mining PDFs with Regex in C#: Practical Patterns, Tips, and Ideas](https://www.textcontrol.com/blog/2025/08/12/mining-pdfs-with-regex-in-csharp-practical-patterns-tips-and-ideas/llms.txt)
- [Streamline Data Collection with Embedded Forms in C# .NET](https://www.textcontrol.com/blog/2025/08/02/streamline-data-collection-with-embedded-forms-in-csharp-dotnet/llms.txt)
- [Adding QR Codes to PDF Documents in C# .NET](https://www.textcontrol.com/blog/2025/07/15/adding-qr-codes-to-pdf-documents-in-csharp-dotnet/llms.txt)
- [Adding SVG Graphics to PDF Documents in C# .NET](https://www.textcontrol.com/blog/2025/07/08/adding-svg-graphics-to-pdf-documents-in-csharp-dotnet/llms.txt)
- [Enhancing PDF Searchability in Large Repositories by Adding and Reading Keywords Using C# .NET](https://www.textcontrol.com/blog/2025/06/24/enhancing-pdf-searchability-in-large-repositories-by-adding-and-reading-keywords-using-csharp-dotnet/llms.txt)
- [How to Verify PDF Encryption Programmatically in C# .NET](https://www.textcontrol.com/blog/2025/06/20/how-to-verify-pdf-encryption-programmatically-in-csharp-dotnet/llms.txt)
- [PDF Security for C# Developers: Encryption and Permissions in .NET](https://www.textcontrol.com/blog/2025/06/16/pdf-security-for-csharp-developers-encryption-and-permissions-in-dotnet/llms.txt)
