pierrotlefou

Reputation: 40761

Search Japanese characters (UTF-8 encoded) using SQLite FTS

It seems that SQLite FTS does not support searching Japanese characters, according to my experiments and the discussion here.

#select * from tblEvent_shortdes where short_des MATCH   'BSジャパンの見どころ' 
#return nothing
select * from tblEvent_shortdes where short_des MATCH  'パンの見' 

A custom tokenizer in FTS seems to be the way to accomplish this, but I have not found any promising open-source tokenizer for Japanese. Will the ICU tokenizer do?

Upvotes: 2

Views: 1289

Answers (1)

borrible

Reputation: 17376

You might take a look at ChaSen and MeCab. It has been several years since I used either - and it looks as though neither has been updated recently - but both proved adequate at Japanese tokenization.
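
One way to feed externally tokenized text into FTS is to store a space-separated copy of each document and search that copy. The sketch below is only an illustration of that idea, not the ChaSen or MeCab API: it swaps in naive overlapping character bigrams for a real morphological tokenizer, and it assumes the SQLite build bundled with Python has FTS4 compiled in. The table name `ev`, the column names `orig` and `grams`, and the helper functions are hypothetical.

```python
import sqlite3


def to_bigrams(text: str) -> str:
    """Split text into overlapping 2-character tokens joined by spaces.

    Stand-in for a real Japanese tokenizer; whitespace is dropped first.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return "".join(chars)
    return " ".join(chars[i] + chars[i + 1] for i in range(len(chars) - 1))


def phrase_query(text: str) -> str:
    """Quote the bigrams so FTS matches them as one consecutive phrase."""
    return '"' + to_bigrams(text) + '"'


def demo() -> list:
    """Index one document and search it by substring (assumes FTS4 exists)."""
    conn = sqlite3.connect(":memory:")
    # 'orig' keeps the text as written, 'grams' holds the pre-tokenized copy.
    conn.execute("CREATE VIRTUAL TABLE ev USING fts4(orig, grams)")
    doc = "BSジャパンの見どころ"
    conn.execute(
        "INSERT INTO ev(orig, grams) VALUES (?, ?)", (doc, to_bigrams(doc))
    )
    # Restrict the match to the pre-tokenized column.
    rows = conn.execute(
        "SELECT orig FROM ev WHERE grams MATCH ?", (phrase_query("パンの見"),)
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

Bigrams make arbitrary substrings of two or more characters searchable at the cost of a larger index. A morphological tokenizer would produce word-level tokens instead, so a query that cuts across word boundaries, such as 'パンの見', might still not match.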

Upvotes: 3
