
ReportingCloud: Extract Plain Text from Adobe PDF, MS Word or HTML Documents

The ReportingCloud Web API endpoints that return documents now support an additional return format: plain text.


The document/convert endpoint takes a document, such as an Adobe PDF, MS Word DOCX or HTML document, and converts it into another format.

https://api.reporting.cloud/v1/document/convert

In addition, all endpoints that return documents can now return plain text (TXT) as well. This allows you to extract plain text from Adobe PDF documents or MS Word Office Open XML (DOCX) documents.

The document/convert method has a returnFormat request parameter that accepts the following formats: PDF, PDFA, RTF, DOC, DOCX, HTML, TXT and TX.
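As a rough illustration of how the returnFormat parameter fits into a request, the following sketch checks a requested format against the list above and builds the request URL. The ConvertRequest class and its members are hypothetical helper names and not part of the ReportingCloud SDK. Passing returnFormat as a query string parameter is an assumption here, so check the full documentation for the exact request shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the ReportingCloud SDK.
public static class ConvertRequest
{
    // Endpoint URL as given in this article.
    private const string Endpoint = "https://api.reporting.cloud/v1/document/convert";

    // Return formats listed in this article for the returnFormat parameter.
    private static readonly HashSet<string> ReturnFormats =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "PDF", "PDFA", "RTF", "DOC", "DOCX", "HTML", "TXT", "TX"
        };

    // Returns true if the format is one of the listed return formats (case-insensitive).
    public static bool IsSupported(string format)
    {
        return format != null && ReturnFormats.Contains(format.Trim());
    }

    // Assumption: returnFormat is sent as a query string parameter.
    public static string BuildUrl(string returnFormat)
    {
        if (!IsSupported(returnFormat))
        {
            throw new ArgumentException(
                "Unsupported return format: " + returnFormat, nameof(returnFormat));
        }

        string normalized = returnFormat.Trim().ToUpperInvariant();
        return Endpoint + "?returnFormat=" + Uri.EscapeDataString(normalized);
    }
}
```

Calling ConvertRequest.BuildUrl("txt") would produce the endpoint URL with returnFormat=TXT appended, while an unlisted format such as "ODT" throws an ArgumentException before any request is sent.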

To extract plain text from an Adobe PDF document, send a born-digital PDF document (one created electronically rather than scanned) and request TXT (plain text) as the return format.

The following code uses the ReportingCloud .NET Framework SDK (C#) to extract plain text from an Adobe PDF document:

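using System;
using System.IO;
using System.Text;

// Create the ReportingCloud client with your API key.
TXTextControl.ReportingCloud.ReportingCloud rc = 
    new TXTextControl.ReportingCloud.ReportingCloud("yourAPIKey");

// Send the PDF and request plain text (TXT) as the return format.
byte[] bResults = rc.ConvertDocument(
    File.ReadAllBytes("document.pdf"), 
    TXTextControl.ReportingCloud.ReturnFormat.TXT);

// Decode the returned bytes; ASCII decoding replaces non-ASCII characters with '?'.
Console.WriteLine(Encoding.ASCII.GetString(bResults));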

Try this yourself: create a free ReportingCloud trial account or read the full documentation.


