Export Document Tables to CSV in .NET C#

This article shows how to use ServerTextControl to load documents, iterate through table rows and cells, and export document tables as CSV files. The sample handles multiple tables, header rows, custom separators, and nested table limitations.

Tables in business documents often contain valuable structured data: order lists, inventory snapshots, timesheets, reports, and summaries. But that data is not always stored in a database or spreadsheet. Sometimes it exists only inside HTML, DOCX, RTF, or PDF documents, and the document itself is the source of truth.

This sample shows the reverse direction of a typical CSV-to-document workflow. Instead of creating a document table from CSV data, it loads a document with ServerTextControl, iterates through its tables, reads the table cells, and exports the result as CSV files.

Table Extraction with ServerTextControl

The complete sample is a console application that uses TX Text Control .NET Server to load HTML sample documents and export all top-level document tables as CSV files.

When Table-to-CSV Export Is Useful

Table-to-CSV conversion is useful whenever document content needs to move into systems that expect flat tabular data. This happens often in business applications where generated documents contain tables that later need to be analyzed, imported, or processed by another system.

Typical examples include:

  • Exporting order tables from generated customer documents for import into ERP or reporting systems.
  • Extracting inventory or stock tables from operational reports.
  • Converting timesheet tables into CSV files for accounting or billing workflows.
  • Pulling regional sales summaries from monthly reports for further analysis in Excel, Power BI, or custom dashboards.
  • Creating lightweight data exchange files from document templates without requiring users to manually copy and paste table content.

CSV is intentionally simple. It works best for flat tables where each row contains the same kind of data. For this reason, the sample exports document tables only and ignores surrounding text such as headings, paragraphs, and captions.

Loading Documents with ServerTextControl

The sample uses ServerTextControl as a non-visual document processing component. The document is loaded from an HTML file, but the same pattern can be adapted to other supported document formats.

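using TXTextControl;

// Create the non-visual instance; Create() must be called before the control is used.
using var serverTextControl = new ServerTextControl();
serverTextControl.Create();

// Load the HTML document from disk.
serverTextControl.Load(documentPath, StreamType.HTMLFormat);

// Materialize the tables of the main text as a typed list.
var tables = serverTextControl.Tables.Cast<Table>().ToList();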

After loading the document, the Tables collection gives access to all tables in the main text. Each Table contains row and cell collections that can be used to read the table structure.

Exporting Several Tables

A CSV file represents one flat table. It has no native concept of multiple independent tables. Because of that, the sample exports each top-level table into its own CSV file.

If a document contains one table, the file name is based on the document name:

inventory-with-notes.csv

If a document contains multiple tables, each table is written separately:

customer-orders-table-1.csv
customer-orders-table-2.csv
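
The naming rule can be sketched as follows. BuildCsvFileName is a hypothetical helper, not taken from the sample project; it assumes 1-based table numbers and the "-table-N" suffix seen in the file names above.

```csharp
using System;
using System.IO;

// Hypothetical helper (an assumption, not part of the sample):
// a single table keeps the plain document name, several tables
// get a 1-based "-table-N" suffix.
static string BuildCsvFileName(string documentPath, int tableNumber, int tableCount)
{
    var baseName = Path.GetFileNameWithoutExtension(documentPath);

    return tableCount == 1
        ? $"{baseName}.csv"
        : $"{baseName}-table-{tableNumber}.csv";
}

Console.WriteLine(BuildCsvFileName("inventory-with-notes.html", 1, 1));
// prints: inventory-with-notes.csv
Console.WriteLine(BuildCsvFileName("customer-orders.html", 2, 2));
// prints: customer-orders-table-2.csv
```

Here the table count refers to top-level tables only, so a document with one top-level table and one nested table still gets the plain file name.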

The exporter filters out nested tables and keeps only top-level tables:

var tables = serverTextControl.Tables.Cast<Table>().ToList();
var topLevelTables = tables.Where(table => table.OuterTable is null).ToList();

OuterTable is null for top-level tables. This makes it a straightforward way to avoid exporting nested tables as independent CSV files.

Respecting Header Rows

TX Text Control table rows can be marked as header rows using TableRow.IsHeader. These rows are repeated at the top of each page when a table spans multiple pages.

For CSV export, those rows should appear first. If no row is marked as a header row, the sample simply exports the rows in their document order, which means the physical first row becomes the first CSV row.

private static IEnumerable<TableRow> GetRowsInCsvOrder(Table table)
{
    var rows = table.Rows
        .Cast<TableRow>()
        .Where(row => IsExportableRow(table, row))
        .ToList();

    var headerRows = rows.Where(row => row.IsHeader).ToList();

    if (headerRows.Count == 0)
    {
        return rows;
    }

    return headerRows.Concat(rows.Where(row => !row.IsHeader));
}

This preserves one or more explicit table header rows without duplicating them.
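
To make the ordering rule visible without a TX Text Control dependency, the following sketch applies the same logic to plain tuples. The tuple type and the OrderForCsv name are stand-ins invented for illustration, and the header flags are set artificially so that the reordering shows up in the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for TableRow: only the row number and the header flag matter here.
static IEnumerable<(int Row, bool IsHeader)> OrderForCsv(
    IReadOnlyList<(int Row, bool IsHeader)> rows)
{
    var headerRows = rows.Where(row => row.IsHeader).ToList();

    // No explicit header rows: keep the document order.
    if (headerRows.Count == 0)
    {
        return rows;
    }

    // Header rows first, then every remaining row in document order.
    return headerRows.Concat(rows.Where(row => !row.IsHeader));
}

var sampleRows = new List<(int Row, bool IsHeader)>
{
    (1, false), (2, true), (3, false), (4, true),
};

var ordered = OrderForCsv(sampleRows).Select(item => item.Row);
Console.WriteLine(string.Join(", ", ordered));
// prints: 2, 4, 1, 3
```

Each row appears exactly once because the header rows are excluded from the second sequence before the two are concatenated.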

Extracting Cell Text

The actual cell extraction loops through the table's cells, filters them by row number, orders them by column number, and reads the TableCell.Text value.

private static IEnumerable<TableCell> GetCells(Table table, TableRow row)
{
    return table.Cells
        .Cast<TableCell>()
        .Where(cell => cell.Row == row.Row)
        .OrderBy(cell => cell.Column);
}

The sample normalizes line endings and trims cell values before writing them to CSV:

private static string NormalizeCellText(string value)
{
    return value
        .Replace("\r\n", "\n", StringComparison.Ordinal)
        .Replace('\r', '\n')
        .Trim();
}
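
As a quick check of what this normalization does, the sketch below runs the same replacement chain on a made-up cell value with mixed line endings and surrounding whitespace. The input string is invented for illustration.

```csharp
using System;

// Same logic as the sample's NormalizeCellText, repeated here
// so that the snippet is self-contained.
static string NormalizeCellText(string value)
{
    return value
        .Replace("\r\n", "\n", StringComparison.Ordinal)
        .Replace('\r', '\n')
        .Trim();
}

var raw = "  Line 1\r\nLine 2\rLine 3  ";
var normalized = NormalizeCellText(raw);

// Show the remaining line feeds as a visible "\n" marker.
Console.WriteLine(normalized.Replace("\n", "\\n", StringComparison.Ordinal));
// prints: Line 1\nLine 2\nLine 3
```

Line feeds inside a cell survive normalization, which is why the CSV writer still has to quote such values.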

Handling Nested Tables

Nested tables do not map well to CSV. A CSV cell can contain text, but it cannot contain another grid with its own rows and columns in a standard way.

The sample handles nested tables deliberately:

  • Nested tables are skipped as separate CSV exports.
  • If a top-level table cell contains a nested table, that cell is exported as an empty value.
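  • A warning is printed to the console.

The detection compares character positions: a top-level cell is treated as containing a nested table if any cell of a nested table starts within that cell's character range.

private static bool ContainsNestedTable(TableCell cell, IReadOnlyList<Table> nestedTables)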
{
    if (cell.Length <= 0 || nestedTables.Count == 0)
    {
        return false;
    }

    var cellStart = cell.Start;
    var cellEnd = cell.Start + cell.Length;

    return nestedTables
        .SelectMany(table => table.Cells.Cast<TableCell>())
        .Any(nestedCell => nestedCell.Start >= cellStart && nestedCell.Start <= cellEnd);
}

This avoids producing misleading CSV data. In production scenarios, another option would be to write nested tables to separate files and store metadata that links them back to the parent cell.

CSV Formatting and Separators

CSV output must escape values that contain quotes, line breaks, or the selected separator. The sample keeps this logic in a dedicated CsvFormatter class.

private string EscapeValue(string value)
{
    if (!value.Contains('"') &&
        !value.Contains(_separator) &&
        !value.Contains('\r') &&
        !value.Contains('\n'))
    {
        return value;
    }

    return $"\"{value.Replace("\"", "\"\"", StringComparison.Ordinal)}\"";
}
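
To show how escaped values combine into one CSV line, here is a minimal sketch. Escape and FormatRow are illustrative stand-ins and not the sample's CsvFormatter; Escape follows the same rule as EscapeValue above but takes the separator as a parameter.

```csharp
using System;
using System.Linq;

// Illustrative stand-in for EscapeValue with an explicit separator.
static string Escape(string value, char separator)
{
    var needsQuotes = value.Contains('"')
        || value.Contains(separator)
        || value.Contains('\r')
        || value.Contains('\n');

    if (!needsQuotes)
    {
        return value;
    }

    // Double embedded quotes, then wrap the whole value in quotes.
    var doubled = value.Replace("\"", "\"\"", StringComparison.Ordinal);
    return "\"" + doubled + "\"";
}

// Hypothetical helper: joins the escaped values of one row into a CSV line.
static string FormatRow(string[] values, char separator)
{
    var escaped = values.Select(value => Escape(value, separator)).ToArray();
    return string.Join(separator.ToString(), escaped);
}

var line = FormatRow(new[] { "Widget, large", "5 \"units\"", "12.50" }, ',');
Console.WriteLine(line);
// prints: "Widget, large","5 ""units""",12.50
```

Only values that need it are quoted, which keeps the output compact while staying readable by common CSV parsers.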

The separator can be selected from the command line. The default is a comma, but semicolon, tab, pipe, space, or any single character can be used.

dotnet run
dotnet run -- --separator semicolon
dotnet run -- --separator "|"

This is helpful for regional spreadsheet workflows where semicolon-delimited CSV files are preferred.
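
One way such an option could be parsed is sketched below. ParseSeparator and ReadSeparator are hypothetical helpers, and the accepted names other than "semicolon" (comma, tab, pipe, space) are assumptions derived from the list above; the sample's actual argument handling may differ.

```csharp
using System;

// Hypothetical mapping of the --separator value to a character.
// Named values are assumptions; any single character is used as-is.
static char ParseSeparator(string value) => value.ToLowerInvariant() switch
{
    "comma" => ',',
    "semicolon" => ';',
    "tab" => '\t',
    "pipe" => '|',
    "space" => ' ',
    _ when value.Length == 1 => value[0],
    _ => throw new ArgumentException($"Unsupported separator: {value}"),
};

// Hypothetical argument scan: falls back to a comma when the
// option is missing or has no value.
static char ReadSeparator(string[] arguments)
{
    var index = Array.IndexOf(arguments, "--separator");

    return index >= 0 && index + 1 < arguments.Length
        ? ParseSeparator(arguments[index + 1])
        : ',';
}

var separator = ReadSeparator(new[] { "--separator", "semicolon" });
Console.WriteLine($"Using '{separator}' as CSV separator.");
// prints: Using ';' as CSV separator.
```

Rejecting multi-character values early avoids ambiguous output, since the escaping logic checks for a single separator character.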

Running the Sample

Build the project:

dotnet build .\tx-table-to-csv.slnx

Run the console application from the project folder:

cd .\tx-table-to-csv
dotnet run

The sample HTML files are copied to the output directory during build. At runtime, the app reads them from the executable folder:

var sampleDirectory = Path.Combine(AppContext.BaseDirectory, "Samples");
var exportDirectory = Path.Combine(AppContext.BaseDirectory, "Exports");

CSV files are written to the Exports folder next to the executable.
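
Writing the formatted lines could look like the following sketch. WriteCsv is a hypothetical helper and the file name and CSV lines are made up; the only part taken from the sample is the Exports folder next to the executable.

```csharp
using System;
using System.IO;

// Hypothetical helper: writes already formatted CSV lines to a folder
// and returns the full path of the created file.
static string WriteCsv(string directory, string fileName, string[] lines)
{
    // CreateDirectory does nothing if the folder already exists.
    Directory.CreateDirectory(directory);

    var fullPath = Path.Combine(directory, fileName);

    // This overload writes UTF-8 without a byte order mark.
    File.WriteAllLines(fullPath, lines);
    return fullPath;
}

var targetDirectory = Path.Combine(AppContext.BaseDirectory, "Exports");
var writtenPath = WriteCsv(targetDirectory, "example.csv", new[] { "Item,Quantity", "Widget,5" });
Console.WriteLine($"Wrote {writtenPath}");
```

If the files are meant to be opened directly in a spreadsheet application, passing an explicit encoding with a byte order mark may be worth considering.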

Example output:

Using ',' as CSV separator.
Exported table 1 from customer-orders.html to customer-orders-table-1.csv.
Exported table 2 from customer-orders.html to customer-orders-table-2.csv.
Exported table 1 from nested-service-plan.html to nested-service-plan.csv.
  Warning: Skipped 1 nested table(s) as separate CSV export(s).
  Warning: Omitted nested-table content from 1 cell(s).

Conclusion

TX Text Control provides access to a document's table structure through the server-side object model. By loading the document with ServerTextControl, iterating through Table, TableRow, and TableCell objects, and applying CSV escaping rules, document tables can be exported into a simple data exchange format.

The important design decision is to keep the CSV output focused on table data only. Surrounding document text, captions, and nested tables are document structure, not flat CSV data. Handling those cases explicitly makes the export predictable and easier to consume in downstream systems.

You can download the complete sample project from our GitHub repository below.

Download and Fork This Sample on GitHub

We proudly host our sample code on github.com/TextControl.

Please fork and contribute.

Requirements for this sample

  • TX Text Control .NET Server 34.0
  • Visual Studio 2026
