When opening a document in TX Text Control, you must provide the file name and path (or a variable that holds the document), optionally specific LoadSettings, and the StreamType that describes the document format. In other words, you need to know which document format to use to open the document.

Providing the StreamType

If you want to open a physical MS Word document that has been saved using the Office Open XML (DOCX) format, the following code is required:

textControl1.Load("document.docx", StreamType.WordprocessingML);

Providing the format up front makes loading considerably faster, because Text Control already knows which format filter to use.

Loading Strategies

If you want to automate this process, there are three different strategies:

  1. File extension
    You can use the file extension (e.g. *.docx) to infer the required StreamType.

    Problem: Some files have the wrong extension. For example, there are applications that save documents in the RTF format using a *.docx extension.

  2. Trial and error
    Another strategy is to try all possible StreamTypes, wrapping each call to the Load method in a try/catch statement that handles the FilterException it throws, until the document loads successfully.

    Problem: It is a relatively slow process as all required filters must be explicitly loaded.

  3. Check format in advance
    In this strategy, the first 4 bytes are checked for a specific document header to "guess" the format. This is the fastest way to determine the format and load the document successfully into Text Control.
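
Strategy 1 can be sketched as a simple lookup table. This is only an illustration under stated assumptions: the GuessedFormat enum is a hypothetical stand-in for TXTextControl.StreamType so that the sketch compiles without the library, and the list of extensions (including .tx for the internal format) is an assumption, not a complete mapping.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for TXTextControl.StreamType (keeps the sketch library-free).
public enum GuessedFormat {
    Unknown, WordprocessingML, MSWord, AdobePDF,
    RichTextFormat, HTMLFormat, InternalUnicodeFormat, SpreadsheetML
}

public static class ExtensionGuess {
    // Assumed mapping; extend it to the extensions your application accepts.
    private static readonly Dictionary<string, GuessedFormat> ByExtension =
        new Dictionary<string, GuessedFormat>(StringComparer.OrdinalIgnoreCase) {
            [".docx"] = GuessedFormat.WordprocessingML,
            [".doc"]  = GuessedFormat.MSWord,
            [".pdf"]  = GuessedFormat.AdobePDF,
            [".rtf"]  = GuessedFormat.RichTextFormat,
            [".html"] = GuessedFormat.HTMLFormat,
            [".htm"]  = GuessedFormat.HTMLFormat,
            [".tx"]   = GuessedFormat.InternalUnicodeFormat,
            [".xlsx"] = GuessedFormat.SpreadsheetML
        };

    // Returns the format suggested by the file extension, or Unknown.
    public static GuessedFormat FromFileName(string fileName) {
        string ext = Path.GetExtension(fileName) ?? string.Empty;
        return ByExtension.TryGetValue(ext, out var format)
            ? format
            : GuessedFormat.Unknown;
    }
}
```

Because the extension can be wrong, a result from a lookup like this is best treated as a first guess that still needs a fallback.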

Extension Method

The following extension method combines strategies 2 and 3 to provide the best results:

using System.Collections.Generic;
using System.Linq;
using System.Text;
using TXTextControl;

public static class TextControlExtensions {

    // Returns a numeric code for the format that loaded, or 0 if none did.
    public static int Load(this ServerTextControl serverTextControl,
        LoadSettings ls,
        byte[] data,
        int iterator = 0) {

        try {
            // check format based on first 4 bytes
            if (iterator == 0) {
                var test = data.Take(4);

                foreach (var hintFormat in HintFormats) {
                    if (test.SequenceEqual(hintFormat.Value)) {
                        iterator = hintFormat.Key;
                        break;
                    }
                }
            }

            // try the format selected by the iterator
            switch (iterator) {
                case 0:
                    serverTextControl.Load(data, BinaryStreamType.WordprocessingML, ls);
                    return 1024;
                case 1:
                    serverTextControl.Load(data, BinaryStreamType.MSWord, ls);
                    return 64;
                case 2:
                    serverTextControl.Load(data, BinaryStreamType.AdobePDF, ls);
                    return 512;
                case 3:
                    serverTextControl.Load(Encoding.UTF8.GetString(data),
                        StringStreamType.RichTextFormat, ls);
                    return 8;
                case 4:
                    serverTextControl.Load(Encoding.UTF8.GetString(data),
                        StringStreamType.HTMLFormat, ls);
                    return 4;
                case 5:
                    serverTextControl.Load(data,
                        BinaryStreamType.InternalUnicodeFormat, ls);
                    return 32;
                case 6:
                    serverTextControl.Load(data,
                        BinaryStreamType.SpreadsheetML, ls);
                    return 4096;
            }
        }
        catch (MergeBlockConversionException) { } // swallowed; the method then returns 0
        catch {
            // loading failed: move on to the next format, if there is one
            if (iterator != 6) {
                iterator++;
                return Load(serverTextControl, ls, data, iterator);
            }
        }

        return 0;
    }

    // first 4 bytes of each format, keyed by the case index used above
    private static readonly Dictionary<int, byte[]> HintFormats =
        new Dictionary<int, byte[]> {
            [0] = new byte[] { 80, 75, 3, 4 },       // WordprocessingML
            [1] = new byte[] { 208, 207, 17, 224 },  // MSWord
            [2] = new byte[] { 37, 80, 68, 70 },     // AdobePDF
            [3] = new byte[] { 123, 92, 114, 116 },  // RichTextFormat
            [5] = new byte[] { 8, 7, 1, 0 }          // InternalUnicodeFormat
        };
}

To call this extension method, pass a LoadSettings object and your document as a byte array:

textControl1.Load(ls, baDocument);

If no iterator is passed as a parameter, the first 4 bytes of the given byte array are compared against the HintFormats dictionary. If a pattern matches, the iterator is set to the key of that pattern. The document is then loaded in the switch statement.
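
To look at the header check in isolation, here is a minimal sketch that does not depend on TX Text Control. The byte patterns are copied from the HintFormats dictionary; the HeaderSniffer class, the Guess method and the use of format names as string keys are hypothetical and exist only for this illustration. Unlike the extension method, it also guards against input shorter than 4 bytes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderSniffer {
    // Signatures copied from the HintFormats dictionary in the article.
    private static readonly Dictionary<string, byte[]> Signatures =
        new Dictionary<string, byte[]> {
            ["WordprocessingML"]      = new byte[] { 80, 75, 3, 4 },      // "PK" + 0x03 0x04 (ZIP container)
            ["MSWord"]                = new byte[] { 208, 207, 17, 224 }, // D0 CF 11 E0 (OLE compound file)
            ["AdobePDF"]              = new byte[] { 37, 80, 68, 70 },    // "%PDF"
            ["RichTextFormat"]        = new byte[] { 123, 92, 114, 116 }, // "{\rt"
            ["InternalUnicodeFormat"] = new byte[] { 8, 7, 1, 0 }
        };

    // Returns the name of the first matching format, or null if nothing matches.
    public static string Guess(byte[] data) {
        if (data == null || data.Length < 4) return null;

        byte[] head = data.Take(4).ToArray();
        foreach (var entry in Signatures) {
            if (head.SequenceEqual(entry.Value)) return entry.Key;
        }
        return null;
    }
}
```

A match is a hint rather than proof: the ZIP signature, for example, is shared by every ZIP-based format, which includes both DOCX and XLSX files. That is why the header check is combined with the trial-and-error fallback.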

If a specific iterator is provided, the method starts by trying to load the document with that format. If the document cannot be loaded and the Load method throws an exception, the iterator is increased by 1 and the method calls itself recursively with the new value. Formats with a lower index than the starting value are not retried; if every remaining format fails, the method returns 0.
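
The same fallback can also be written as a loop instead of a recursion. The following is a hedged sketch of that idea, not the article's implementation: FallbackLoader and TryInOrder are hypothetical names, and each loader is passed in as a delegate so that the sketch runs without the library. It catches every exception, like the bare catch in the extension method, but it does not treat MergeBlockConversionException separately.

```csharp
using System;
using System.Collections.Generic;

public static class FallbackLoader {
    // Tries each loader starting at startIndex.
    // Returns the index of the first loader that succeeds, or -1 if none does.
    public static int TryInOrder(IReadOnlyList<Action> loaders, int startIndex = 0) {
        for (int i = startIndex; i < loaders.Count; i++) {
            try {
                // In real code, e.g.: () => tx.Load(data, BinaryStreamType.MSWord, ls)
                loaders[i]();
                return i;
            }
            catch (Exception) {
                // This format did not fit; continue with the next candidate.
            }
        }
        return -1;
    }
}
```

With TX Text Control, each entry in the list would wrap one of the Load calls from the switch statement, and the start index would come from the header check.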

The extension method above is a robust way to load any of the supported formats simply by passing the document as a byte array.