Reputation: 51
I have a very large database containing billions of words. I need to search inside these words, and the fastest way I know of is the integrated Full-Text Search (iFTS) that ships with SQL Server 2008.
The data is in Turkish, and SQL Server 2008 handles full-text searches on it with no problem.
The problem appears when I try to list the full-text index keywords as described here: http://technet.microsoft.com/en-us/library/cc280900.aspx
The word columns returned by sys.dm_fts_index_keywords are keyword and display_term, but their contents are not in the correct character set. The Turkish alphabet has both ı and i, and likewise o and ö, g and ğ, yet the returned words look as if they were reduced to ASCII: kör is returned as kor, and için is returned as icin.
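For reference, this is a minimal sketch of the kind of listing query involved. `MyDatabase` and `dbo.Documents` are placeholder names assumed for illustration, not the real objects:

```sql
-- Placeholder names: MyDatabase and dbo.Documents are assumptions for illustration.
-- Run this in the context of the database that holds the full-text index.
SELECT display_term,      -- human-readable form of the keyword
       keyword,           -- internal binary form of the keyword
       column_id,
       document_count
FROM sys.dm_fts_index_keywords(DB_ID(N'MyDatabase'), OBJECT_ID(N'dbo.Documents'))
ORDER BY document_count DESC;
```

The display_term column in this result is where the accented letters come back stripped.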
However, when I run a CONTAINS search, SQL Server matches the search words correctly and returns the right results. Searches for kör and kor return different results, which is the correct behavior.
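A rough sketch of the two searches being compared, assuming a hypothetical table `dbo.Documents` with a key column `Id` and a full-text indexed column `Body` (all three names are placeholders):

```sql
-- Placeholder names: dbo.Documents, Id and Body are hypothetical.
-- Search for the accented word.
SELECT Id
FROM dbo.Documents
WHERE CONTAINS(Body, N'kör');

-- Search for the unaccented look-alike; on my data this returns a different set of rows.
SELECT Id
FROM dbo.Documents
WHERE CONTAINS(Body, N'kor');
```

The N prefix keeps the search term as Unicode, so the accented characters are not lost before the query reaches the full-text engine.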
So I need to get the words as they are stored in SQL Server, not their ASCII representations.
I hope this explains my problem.
Upvotes: 2
Views: 387
Reputation: 51
It seems this has been fixed in SQL Server 2012. There, the keyword and display_term columns returned by sys.dm_fts_index_keywords contain the correct Turkish words.
Upvotes: 2