Reputation: 23
i have made a spell checker code using Ternary search tree. Can anybody tell me how to find the next possible word in TST. for example if i want to search if i search a word "Manly" in spell checker and the word is not present in TST, so the output it gives like DO YOU MEAN: "Man" "Mango" . .means possible near words
Upvotes: 1
Views: 757
Reputation: 19
I have implemented my own spell checker but instead of a simple ternary trie I am using a ternary dag as Peter Kankowski suggests here. You can look at my blog for some details and how I did it. Its in Greek, but you can get an idea.
Edit:
Ok, you are right. The basic idea is to use a pre-created list of candidates for a given edit distance ( a value of 2 for me is ok ). To reduce the size of the list one can use wildcard characters. Such a list, of course, can be constructed in different ways. I prefer for / while loops like this ( e.g. for candidates of two substitutions )
void Substitute2( vector<wchar_t*>& v, const wstring& w )
{
size_t len = w.size();
if ( len < 2 )
return;
size_t p1 = 0, p2 = 1;
while ( p1 < len ) {
p2 = p1 + 1;
while ( p2 < len ) {
wchar_t* chars = new wchar_t[ len + 1 ];
for ( size_t i = 0; i < len; ++i ) {
if ( i != p1 && i != p2 ) {
chars[ i ] = w[ i ];
}
}
chars[ p1 ] = '?';
chars[ p2 ] = '?';
chars[ len ] = '\0';
v.push_back( chars );
p2++;
}
p1++;
}
}
After having prepared the candidates list, a simple search in a ternary dag for each item in the list will give us the suggestions for that misspelled word.
void Search( FileNode* pDict, FileNode* pNode, const wchar_t* Word, wstring Sug, set<wstring>& List )
{
if ( IsNullLink( pNode, pDict ) )
return;
if ( *Word == '?' ) {
Search( pDict, GetLo( pNode, pDict ), Word, Sug, List );
Search( pDict, GetEq( pNode, pDict ), Word + 1, Sug + pNode->Char, List );
Search( pDict, GetHi( pNode, pDict ), Word, Sug, List );
} else {
if ( *Word < pNode->Char ) {
Search( pDict, GetLo( pNode, pDict ), Word, Sug, List );
} else if ( *Word > pNode->Char ) {
Search( pDict, GetHi( pNode, pDict ), Word, Sug, List );
} else {
if ( pNode->Char == '\0' )
{
List.insert( Sug );
}
if ( *Word != '\0' ) {
Search( pDict, GetEq( pNode, pDict ), Word + 1, Sug + pNode->Char, List );
}
}
}
}
Note: The dictionary is a compiled (file based) ternary dag
Upvotes: 1
Reputation: 979
The search for the word in your TST will terminate at a specific location in the tree. From this point, you can simply go up one level in the tree until you hit a level where there is more than just the child you came from.
On that level, you can simply choose the other possible paths and return those words.
Upvotes: 0