TX Spell .NET: Exploring the Science Behind Suggestions


Recently, I had an interesting discussion with our TX Spell .NET team lead. I only wanted to know what he was currently working on, which is a question you should not ask an enthusiastic developer unless you have plenty of time. But I took the time to learn more about creating suggestions for misspelled words.
The big question is:
How do you create appropriate suggestions for a misspelled word in a language you may not be familiar with?
We spend a lot of time researching such questions in order to provide you with the best components on the market. I won't disclose the secrets of why TX Spell .NET is so fast and accurate, but I thought I would share some basics to give you an idea of the complexity of this subject.
Two main steps are required to generate a list of appropriate suggestions:
- Transformation and Permutation
  In the first step, all possible transformations and permutations of the misspelled word must be created up to a specific depth level. This is the most time-consuming part of the process: characters must be removed, added, replaced or shifted. The performance of this algorithm is the key element in this step.
- Evaluation and Rating
  After all possible candidates have been created, they must be weighted. This should increase the probability that the first suggestion is the word that the user originally wanted to type.
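To make the first step more concrete, here is a minimal Python sketch of candidate generation. It is not the TX Spell .NET implementation, which is not disclosed here; it is a common textbook approach. The function names `edits1` and `candidates`, the plain a-z alphabet and the `depth` parameter are assumptions made for illustration only.

```python
import string

# Assumption: a plain lowercase a-z alphabet; a real speller would use
# the character set of the active language.
ALPHABET = string.ascii_lowercase


def edits1(word):
    """Return every string that is exactly one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # Remove one character.
    deletes = [left + right[1:] for left, right in splits if right]
    # Shift: swap two neighbouring characters.
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    # Replace one character with another.
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    # Add one character at any position.
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def candidates(word, dictionary, depth=2):
    """Expand edits up to `depth` levels and keep only known words."""
    frontier = {word}
    seen = {word}
    for _ in range(depth):
        # Only expand strings that were not generated on an earlier level.
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
    return seen & set(dictionary)
```

For example, `candidates("enouf", {"enough", "snout"}, depth=2)` contains `"enough"`, because it is two edits away (replace F with G, then add H). The number of generated strings grows very quickly with each depth level, which is why the performance of this step matters so much.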
But how do you rank such a candidate?
There are many factors that must be included in such algorithms. The most obvious factor is phonetic replacement. Consider the following word:
ENOUF -> ENOUGH
F should be replaced with its phonetic counterpart GH.
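A phonetic rule like this one can be sketched in a few lines of Python. The table `PHONETIC_PAIRS` and the functions `phonetic_variants` and `rank` are hypothetical names, and the handful of pairs listed is an illustrative assumption, not the rule set used by TX Spell .NET.

```python
# Assumption: a tiny, hand-picked table of spellings that can sound alike
# in English. A real rule set would be far larger and language-specific.
PHONETIC_PAIRS = [("f", "gh"), ("f", "ph"), ("k", "ck")]


def phonetic_variants(word):
    """Return all words obtained by one phonetic substitution in `word`."""
    variants = set()
    for a, b in PHONETIC_PAIRS:
        # Apply each pair in both directions, at every position it occurs.
        for src, dst in ((a, b), (b, a)):
            start = word.find(src)
            while start != -1:
                variants.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    return variants


def rank(misspelled, candidate_words):
    """Sort candidates so that phonetic matches come first."""
    phonetic = phonetic_variants(misspelled)
    # False sorts before True, so phonetic matches lead the list;
    # ties are broken alphabetically.
    return sorted(candidate_words, key=lambda c: (c not in phonetic, c))
```

With this sketch, `rank("enouf", ["snout", "enough"])` puts `"enough"` ahead of `"snout"`, because replacing F with GH turns the misspelled word directly into that candidate.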
But this is just the simplest way to rate a suggestion. More complex considerations are required to build a high-potential replacement word. Another approach is to measure the distances between the keys on the keyboard currently in use. On a US English keyboard, the probability of pressing the "S" key instead of the "A" is much higher than hitting the "L", which is on the other side of the keyboard. But at the same time, the algorithm must decide whether the pressed "L" was intended and the "A" was simply missed. As you can see, this is a very complex problem that took us a lot of time and effort. We also faced the problem of how to weight the different changes against each other in a suggestion.
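The keyboard distance idea can be illustrated with a small Python sketch. The layout table, the row offsets and the scaling factor in `substitution_cost` are rough assumptions for a US English keyboard, chosen only for illustration; they are not values taken from TX Spell .NET.

```python
import math

# Assumption: letter keys of a US English QWERTY keyboard, with an
# approximate horizontal stagger per row (in key widths).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.25, 0.75]

# Map every letter to an (x, y) position on the keyboard.
KEY_POS = {
    ch: (col + ROW_OFFSETS[row], float(row))
    for row, keys in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(keys)
}


def key_distance(a, b):
    """Euclidean distance between two letter keys, in key widths."""
    (x1, y1), (x2, y2) = KEY_POS[a.lower()], KEY_POS[b.lower()]
    return math.hypot(x1 - x2, y1 - y2)


def substitution_cost(typed, intended, max_cost=1.0):
    """Cost of replacing `typed` with `intended`; neighbouring keys are cheap."""
    if typed.lower() == intended.lower():
        return 0.0
    # Hypothetical scaling: keys three or more widths apart get the full cost.
    return min(max_cost, key_distance(typed, intended) / 3.0)
```

In this sketch, `key_distance("s", "a")` is one key width while `key_distance("l", "a")` is eight, so replacing a typed "S" with "A" is rated as far more plausible than replacing a typed "L" with "A". It does not address the harder question raised above, namely whether the "L" was intended and the "A" was simply left out.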
If you want to build something exceptional, then do something exceptional.
Following this lead, our TX Spell .NET team analyzed internal chat logs for misspelled words and typos. Chat histories are very useful because we don't necessarily correct typos before sending messages, and we type fast when chatting. The analysis revealed a varied picture, with several different factors contributing to typos.
This is just a very simple overview of the approaches to creating appropriate suggestions. All of these results are or will be implemented in TX Spell .NET. You can focus on your core business - we do the word processing part.