Reputation: 171389
In my Rails 3 application, users can write messages in a forum. I would like to identify the language of a given message. I'm interested in English, Russian, and Hebrew. Is there a built-in library in Ruby/Rails for this task? If not, any ideas would be appreciated.
Upvotes: 12
Views: 5826
Reputation: 3
http://rubygems.org/gems/prose Prose does it without a gem. Try it.
Upvotes: 0
Reputation: 920
Just a quick demo of WhatLanguage for anyone interested: http://www.youtube.com/watch?v=lNqZ2cqOReo&list=UUJ_3fstMOH-g4yBxtvgAWkw&index=0&feature=plcp
Upvotes: 0
Reputation: 24617
Use this: https://github.com/nashby/wtf_lang
"ruby is so awesome!".lang # => "en"
"ruby is so awesome!".full_lang # => "ENGLISH"
Upvotes: 6
Reputation: 2797
Since you're concerned with languages that use different character sets, you could look at which character codes predominate in your strings, then check whether they fall into the code ranges that represent Hebrew or Cyrillic characters.
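A minimal sketch of that idea, assuming Ruby 1.9 or later with UTF-8 strings. It counts characters per script using the regexp Unicode script properties and picks the most frequent one. The `SCRIPTS` table and the `guess_language` helper are names made up for this illustration, not part of any library. Note that it detects the script only, so it cannot tell Russian from other Cyrillic-script languages, or English from other Latin-script ones.

```ruby
# Hypothetical lookup table: language code => pattern matching one
# character of the script that language is written in.
SCRIPTS = {
  "he" => /\p{Hebrew}/,
  "ru" => /\p{Cyrillic}/,
  "en" => /\p{Latin}/
}

# Hypothetical helper: returns the code whose script has the most
# characters in +text+, or nil if no character matches any script.
def guess_language(text)
  counts = {}
  SCRIPTS.each do |lang, pattern|
    counts[lang] = text.scan(pattern).length
  end
  lang, count = counts.max_by { |_, n| n }
  count > 0 ? lang : nil
end
```

You would call it as `guess_language(message_body)` and treat a `nil` result (for example, a message made only of digits or punctuation) as "unknown".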
Upvotes: 2
Reputation: 3264
Take a look at this blog post:
http://blog.kenweiner.com/2008/04/server-side-language-detection-with.html
It may be helpful.
Upvotes: 1
Reputation: 5145
You can use the API that Google provides with Google Translate to guess the language.
See here for documentation: http://code.google.com/apis/language/translate/v1/using_rest_langdetect.html
Upvotes: 5