When opening a document in TX Text Control, you must provide the file name and path or a variable that holds the document, specific LoadSettings, and the StreamType that describes the document format. In other words: you need to know which document format to use to open the document.
Providing the StreamType
If you want to open a physical MS Word document that has been saved using the Office Open XML (DOCX) format, the following code is required:
textControl1.Load("document.docx", StreamType.WordprocessingML);
Providing the format up front makes loading considerably faster, because Text Control already knows which format filter to use.
Loading Strategies
If you want to automate this process, there are 3 different strategies:
-
File extension
You can use the file extension (e.g. *.docx) to infer the required StreamType. Problem: Some files have the wrong extension. For example, there are applications that save documents in the RTF format using a *.docx extension.
-
Trial and error
Another strategy is to try all possible StreamTypes and to wrap the Load method in a try/catch statement. The method throws a FilterException for each format that does not fit, until the document can be loaded successfully. Problem: It is a relatively slow process as all required filters must be explicitly loaded.
-
Check format in advance
In this strategy, the first 4 bytes are checked for a specific document header to "guess" the format. This is the fastest way to determine the format and to load the document successfully into Text Control.
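Strategy 1 can be sketched as a simple lookup table. The following is a minimal sketch, not part of the library: `DocumentFormat` is a hypothetical stand-in enumeration whose numeric values mirror the ones returned by the extension method in this article, and `ExtensionGuess`, `FromFileName` and the extension table itself (including `.tx` for the internal format) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the library's format enumeration. The numeric
// values mirror the ones returned by the extension method in this article.
public enum DocumentFormat {
    Unknown = 0,
    HTMLFormat = 4,
    RichTextFormat = 8,
    InternalUnicodeFormat = 32,
    MSWord = 64,
    AdobePDF = 512,
    WordprocessingML = 1024,
    SpreadsheetML = 4096
}

public static class ExtensionGuess {
    // Assumed extension table; adjust it to the files you expect.
    private static readonly Dictionary<string, DocumentFormat> ByExtension =
        new Dictionary<string, DocumentFormat>(StringComparer.OrdinalIgnoreCase) {
            [".docx"] = DocumentFormat.WordprocessingML,
            [".doc"]  = DocumentFormat.MSWord,
            [".pdf"]  = DocumentFormat.AdobePDF,
            [".rtf"]  = DocumentFormat.RichTextFormat,
            [".html"] = DocumentFormat.HTMLFormat,
            [".htm"]  = DocumentFormat.HTMLFormat,
            [".tx"]   = DocumentFormat.InternalUnicodeFormat,
            [".xlsx"] = DocumentFormat.SpreadsheetML
        };

    // Returns Unknown when the extension is missing or not in the table.
    public static DocumentFormat FromFileName(string fileName) {
        string extension = Path.GetExtension(fileName) ?? string.Empty;
        return ByExtension.TryGetValue(extension, out var format)
            ? format
            : DocumentFormat.Unknown;
    }
}
```

Because the lookup only inspects the file name, an RTF file saved as *.docx would still be reported as WordprocessingML, which is exactly the weakness described for this strategy.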
Extension Method
The following extension method combines strategies 2 and 3 to provide the best results:
using System.Collections.Generic;
using System.Linq;
using System.Text;
using TXTextControl;

public static class TextControlExtensions {

    // Loads the document and returns a number identifying the format that
    // was loaded successfully, or 0 if no format could be loaded.
    public static int Load(this ServerTextControl serverTextControl,
                           LoadSettings ls,
                           byte[] data,
                           int iterator = 0) {
        try {
            // check format based on first 4 bytes
            if (iterator == 0) {
                var test = data.Take(4);
                foreach (var hintFormat in HintFormats) {
                    if (test.SequenceEqual(hintFormat.Value)) {
                        iterator = hintFormat.Key;
                        break;
                    }
                }
            }

            // try to load the document using the current format index
            switch (iterator) {
                case 0:
                    serverTextControl.Load(data, BinaryStreamType.WordprocessingML, ls);
                    return 1024;
                case 1:
                    serverTextControl.Load(data, BinaryStreamType.MSWord, ls);
                    return 64;
                case 2:
                    serverTextControl.Load(data, BinaryStreamType.AdobePDF, ls);
                    return 512;
                case 3:
                    serverTextControl.Load(Encoding.UTF8.GetString(data),
                        StringStreamType.RichTextFormat, ls);
                    return 8;
                case 4:
                    serverTextControl.Load(Encoding.UTF8.GetString(data),
                        StringStreamType.HTMLFormat, ls);
                    return 4;
                case 5:
                    serverTextControl.Load(data,
                        BinaryStreamType.InternalUnicodeFormat, ls);
                    return 32;
                case 6:
                    serverTextControl.Load(data,
                        BinaryStreamType.SpreadsheetML, ls);
                    return 4096;
            }
        }
        // not treated as a wrong format: no further formats are tried, 0 is returned
        catch (MergeBlockConversionException) { }
        catch {
            // loading failed: try the next format until the last index (6) is reached
            if (iterator != 6) {
                iterator++;
                return Load(serverTextControl, ls, data, iterator);
            }
        }
        return 0;
    }

    // first 4 bytes of each format, keyed by the index used in the switch above
    private static readonly Dictionary<int, byte[]> HintFormats =
        new Dictionary<int, byte[]> {
            [0] = new byte[] { 80, 75, 3, 4 },      // WordprocessingML
            [1] = new byte[] { 208, 207, 17, 224 }, // MSWord
            [2] = new byte[] { 37, 80, 68, 70 },    // AdobePDF
            [3] = new byte[] { 123, 92, 114, 116 }, // RichTextFormat
            [5] = new byte[] { 8, 7, 1, 0 }         // InternalUnicodeFormat
        };
}
This extension method can be called simply by passing a LoadSettings object and your document as a byte[] array:
textControl1.Load(ls, baDocument);
If no iterator is passed as a parameter, the first 4 bytes of the given byte[] array are compared against the HintFormats dictionary. If a pattern matches, the iterator is set to the key of that pattern. The document is then loaded in the switch/case statement.
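To see the header check in isolation, the following sketch reuses the byte patterns from the HintFormats dictionary and runs without the library. `HeaderSniffer` and `Detect` are hypothetical names, and returning `null` for "no match" is a choice made for this sketch; the extension method itself simply stays at index 0 in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class HeaderSniffer {
    // Same signatures and keys as the HintFormats dictionary in this article.
    private static readonly Dictionary<int, byte[]> HintFormats =
        new Dictionary<int, byte[]> {
            [0] = new byte[] { 80, 75, 3, 4 },      // "PK" + 3, 4: ZIP container
            [1] = new byte[] { 208, 207, 17, 224 }, // D0 CF 11 E0: OLE compound file
            [2] = new byte[] { 37, 80, 68, 70 },    // "%PDF"
            [3] = new byte[] { 123, 92, 114, 116 }, // "{\rt"
            [5] = new byte[] { 8, 7, 1, 0 }         // InternalUnicodeFormat
        };

    // Returns the key of the matching signature, or null when the data is
    // shorter than 4 bytes or no signature matches.
    public static int? Detect(byte[] data) {
        if (data == null || data.Length < 4) return null;

        var header = data.Take(4);
        foreach (var hint in HintFormats) {
            if (header.SequenceEqual(hint.Value)) return hint.Key;
        }
        return null;
    }
}
```

With this sketch, `HeaderSniffer.Detect(Encoding.ASCII.GetBytes("%PDF-1.7"))` yields 2 and a buffer starting with `{\rtf1` yields 3. Note that the ZIP signature is shared by other ZIP-based formats such as SpreadsheetML, so a match on key 0 is only a hint; the try/catch fallback is what handles the cases where the hint is wrong.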
If a specific iterator is provided, the method starts trying to load the document with that value. If the document cannot be loaded and the Load method throws an exception, the iterator is increased by 1 and the method calls itself recursively with the new value. The method returns a number that identifies the format that was loaded (for example 1024 for WordprocessingML), or 0 if none of the formats could be loaded.
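The same "try, and on failure move to the next format" flow can also be written as a loop instead of recursion. This is a simplified sketch under assumptions, not the article's method: `FallbackLoader` and `TryInOrder` are hypothetical names, each entry in `loaders` stands for one Load call with a fixed format, and every exception is treated as "wrong format".

```csharp
using System;
using System.Collections.Generic;

public static class FallbackLoader {
    // Calls loaders[startIndex], then each following loader, until one
    // completes without throwing. Returns the index of the loader that
    // succeeded, or -1 if all of them threw.
    public static int TryInOrder(IReadOnlyList<Action> loaders, int startIndex = 0) {
        for (int i = startIndex; i < loaders.Count; i++) {
            try {
                loaders[i]();
                return i;
            }
            catch (Exception) {
                // This format did not fit; continue with the next one.
            }
        }
        return -1;
    }
}
```

In practice each delegate would wrap a call such as the ones in the switch/case statement, and the start index would come from the header check, so that the most likely format is tried first.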
The extension method above is a robust way to load any of the supported formats simply by passing the document as a byte[] array.