Reputation: 3729
Using the Natto gem (MeCab), is it possible to convert a string of mixed katakana, hiragana, kanji, and alphabetic characters so that only the kanji are converted to hiragana, leaving everything else as-is?
For example, I need to convert this text:
日本語だぜ、これが。
これはカタカナである。
こいつはEnglish alphabet charsなのです。
ABC123てのは全角英数字です。
into this:
にほんごだぜ、これが。これはカタカナである。こいつはEnglishalphabetcharsなのです。ABC123てのはぜんかくえいすーじです。
Thanks!
Upvotes: 1
Views: 3452
Reputation: 3917
Author of the natto rubygem here. Thanks for using natto!
If I understand your question correctly, you would like to convert only the kanji characters into their corresponding hiragana (furigana) readings. The Ruby extension library NKF can transform katakana into hiragana, and MeCab returns yomi (readings) as katakana by default. So you could combine natto and NKF to convert the yomi for kanji only, leaving the other characters (hiragana, katakana, full- and/or half-width chars) as-is.
The key is to use natto to node-parse the input and examine the char type value of each MeCab node. A char type value of 2 corresponds to a kanji node. For those nodes, you can obtain the katakana yomi from the node's feature string and then use NKF to convert it into hiragana. All other nodes can be output using their surface text unchanged.
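A rough sketch of that approach follows (not necessarily identical to the wiki example). It assumes MeCab is installed with an IPADIC-style dictionary, where the yomi is the 8th comma-separated field (index 7) of the feature string and kanji nodes have char type 2. The helper names `katakana_to_hiragana`, `reading_or_surface` and `kanji_to_hiragana` are made up for illustration; adjust the index and char type if your dictionary differs.

```ruby
require 'nkf'

# Char type reported for kanji nodes. Assumption: the default
# IPADIC char.def ordering, as described in the answer above.
KANJI_CHAR_TYPE = 2

# Convert katakana to hiragana with NKF.
# -W: input is UTF-8, -w: output UTF-8, -h1: katakana -> hiragana.
def katakana_to_hiragana(kana)
  NKF.nkf('-W -w -h1', kana)
end

# Return the hiragana reading for a kanji node, or the surface
# text unchanged for any other node. `node` only needs to respond
# to #surface, #feature and #char_type (as natto's nodes do).
# yomi_index = 7 assumes the IPADIC feature layout.
def reading_or_surface(node, yomi_index = 7)
  return node.surface unless node.char_type == KANJI_CHAR_TYPE

  yomi = node.feature.split(',')[yomi_index]
  # Unknown words may have no reading; fall back to the surface.
  return node.surface if yomi.nil? || yomi == '*'

  katakana_to_hiragana(yomi)
end

# Node-parse the text with natto and join the converted pieces.
def kanji_to_hiragana(text)
  require 'natto' # loaded here so the helpers above work without MeCab
  out = +''
  Natto::MeCab.new.parse(text) do |node|
    out << reading_or_surface(node) unless node.is_eos?
  end
  out
end
```

You would then call something like `puts kanji_to_hiragana('日本語だぜ、これが。')`. If you want long vowels rendered the way the question shows (えいすーじ), it may be that the pronunciation field (index 8 in IPADIC) is closer to what you need than the reading field; passing a different `yomi_index` to `reading_or_surface` would be one way to try that.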
I just put up an example on the natto wiki.
Hope that helps!
Upvotes: 7