
TX Spell .NET: Exploring the Science Behind Suggestions


Recently, I had an interesting discussion with our TX Spell .NET team lead. I only wanted to know what he was currently working on, which is a question you had better not ask an enthusiastic developer unless you have plenty of time. But I took the time to learn more about creating suggestions for misspelled words.

The big question is:

How do you create appropriate suggestions for a misspelled word in a language you may not be familiar with?

We spend a lot of time researching such questions in order to provide you with the best components on the market. I won't disclose the secrets of why TX Spell .NET is so fast and accurate, but I want to share some basics to give you an idea of the complexity of this subject.

Two main steps are required to generate a list of appropriate suggestions:

  1. Transformation and Permutation
    In the first step, all possible transformations and permutations of the misspelled word must be created up to a specific depth level. This is the most time-consuming part: characters must be removed, added, replaced or shifted. The performance of this algorithm is the key element in this process.

  2. Evaluation and Rating
    After all possible candidates have been created, they must be weighted in some way. This should increase the probability that the first suggestion is the word the user originally wanted to type.
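The first step can be sketched in a few lines of Python. This is only a minimal illustration of the general idea, not the TX Spell .NET algorithm: the function names `edits1` and `candidates`, the plain a-z alphabet and the set-based dictionary are all assumptions made for this example.

```python
import string

# Assumption: plain lowercase English letters only.
ALPHABET = string.ascii_lowercase


def edits1(word):
    """Return every string that is exactly one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    removed = [left + right[1:] for left, right in splits if right]
    shifted = [left + right[1] + right[0] + right[2:]
               for left, right in splits if len(right) > 1]
    replaced = [left + ch + right[1:]
                for left, right in splits if right for ch in ALPHABET]
    added = [left + ch + right for left, right in splits for ch in ALPHABET]
    return set(removed + shifted + replaced + added)


def candidates(word, dictionary, depth=2):
    """Map each dictionary word reachable within `depth` edits to its edit count."""
    frontier = {word}
    seen = {word}
    found = {}
    for level in range(1, depth + 1):
        next_frontier = set()
        for current in frontier:
            for edited in edits1(current):
                if edited not in seen:
                    seen.add(edited)
                    next_frontier.add(edited)
                    if edited in dictionary:
                        found[edited] = level
        frontier = next_frontier
    return found
```

Calling `candidates("helo", {"hello", "help", "enough"}, depth=1)` finds "hello" and "help", each one edit away. The number of generated strings grows very quickly with every additional depth level, which is why the performance of this step matters so much.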

But how do you rank such a candidate?

There are many factors that must be included in such algorithms. The most obvious factor is phonetic replacement. Consider the following word:

ENOUF -> ENOUGH

F should be replaced with its phonetic counterpart, GH.
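A phonetic replacement like this one could be sketched as follows. The table `PHONETIC_PAIRS` and both function names are hypothetical and only cover a handful of English spellings; a real rule set would be far larger and language-specific.

```python
# Hypothetical table of spellings that can sound alike in English.
PHONETIC_PAIRS = [("f", "gh"), ("f", "ph"), ("k", "c"), ("s", "c")]


def phonetic_variants(word):
    """Return the words obtained by swapping one sound-alike spelling."""
    variants = set()
    for a, b in PHONETIC_PAIRS:
        # Try the replacement in both directions, at every position.
        for src, dst in ((a, b), (b, a)):
            start = word.find(src)
            while start != -1:
                variants.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    variants.discard(word)
    return variants


def phonetic_suggestions(word, dictionary):
    """Keep only the variants that are real words in `dictionary`."""
    return sorted(phonetic_variants(word) & set(dictionary))
```

With a dictionary that contains "enough", `phonetic_suggestions("enouf", dictionary)` returns that word, because replacing "f" with "gh" produces a dictionary entry while replacing it with "ph" does not.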

But this is just the simplest way to rate a suggestion. More complex considerations are required to arrive at a highly probable replacement word. Another approach is to measure the distances between the keys on the keyboard currently in use. On a US English keyboard, the probability of pressing the "S" key instead of the "A" key is much higher than that of hitting the "L" key, which is on the other side of the keyboard. But at the same time, the algorithm must decide whether the pressed "L" was intended and the "A" was simply missed. As you can see, this is a very complex problem that took us a lot of time and effort. On top of that, we faced the problem of how to weight the different changes in a suggestion.
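To make the keyboard idea concrete, here is a rough sketch that measures how far apart two letter keys are on a US English QWERTY layout. The row offsets are approximations, and `slip_weight` is a made-up scoring formula for illustration only; it is not how TX Spell .NET weights candidates.

```python
import math

# Letter rows of a US English QWERTY keyboard.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
# Assumption: approximate horizontal stagger of each row, in key widths.
ROW_OFFSETS = [0.0, 0.25, 0.75]

# Map every letter to an (x, y) position measured in key widths.
KEY_POS = {
    ch: (col + ROW_OFFSETS[row_index], float(row_index))
    for row_index, row in enumerate(ROWS)
    for col, ch in enumerate(row)
}


def key_distance(a, b):
    """Straight-line distance between two letter keys, in key widths."""
    (x1, y1), (x2, y2) = KEY_POS[a.lower()], KEY_POS[b.lower()]
    return math.hypot(x2 - x1, y2 - y1)


def slip_weight(typed, intended):
    """Hypothetical weight in (0, 1]: higher when the two keys are closer."""
    return 1.0 / (1.0 + key_distance(typed, intended))
```

In this sketch "A" and "S" are direct neighbors while "A" and "L" sit eight key widths apart, so `slip_weight("s", "a")` is considerably larger than `slip_weight("l", "a")`. Deciding whether the "L" was intended and the "A" was simply left out is a separate question that a distance measure alone cannot answer.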

If you want to build something exceptional, then do something exceptional.

Following this motto, our TX Spell .NET team analyzed internal chat logs for misspelled words and typos. Chat histories are very useful because we type fast when chatting and don't necessarily correct typos before sending a message. The analysis shows a varied picture in which several different factors play a role.

This is just a very simple overview of the approaches to creating appropriate suggestions. All of these results are or will be implemented in TX Spell .NET. You can focus on your core business - we do the word processing part.
