Misha Moroshko

Reputation: 171389

How to detect the language of a given text

In my Rails 3 application, users can write messages in a forum. I would like to identify the language of a given message. I'm interested in English, Russian, and Hebrew. Is there a built-in library in Ruby/Rails for this task? If not, any ideas would be appreciated.

Upvotes: 12

Views: 5826

Answers (8)

user3871311

Reputation: 3

http://rubygems.org/gems/prose Prose does it without a gem. Try it.

Upvotes: 0

Laurynas

Reputation: 3869

The Language Detection API provides a Ruby gem for detecting the language of a text.

Upvotes: 1

alexizydorczyk

Reputation: 920

Just a quick demo of WhatLanguage for anyone interested: http://www.youtube.com/watch?v=lNqZ2cqOReo&list=UUJ_3fstMOH-g4yBxtvgAWkw&index=0&feature=plcp

Upvotes: 0

Vasiliy Ermolovich

Reputation: 24617

Use this: https://github.com/nashby/wtf_lang

"ruby is so awesome!".lang # => "en"
"ruby is so awesome!".full_lang # => "ENGLISH"

Upvotes: 6

Caley Woods

Reputation: 4737

Perhaps you could look at the whatlanguage gem?

Upvotes: 2

digitalWestie

Reputation: 2797

Since you're concerned with languages that use different scripts, you could look at the character codes that predominate in your strings and check whether they fall into the code ranges that represent Hebrew or Cyrillic characters.
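
A minimal sketch of this idea, assuming Ruby 1.9 or later, where regular expressions support Unicode script properties such as \p{Cyrillic}. The names SCRIPTS and guess_language are hypothetical, made up for illustration. Mapping the Latin script to English is also an assumption that only holds because English is the sole Latin-script language of interest here; it would not tell English apart from, say, French.

```ruby
# Hypothetical mapping from each language of interest to the Unicode
# script it is written in. Latin => English is an assumption that only
# works when no other Latin-script language is expected.
SCRIPTS = {
  :english => /\p{Latin}/,
  :russian => /\p{Cyrillic}/,
  :hebrew  => /\p{Hebrew}/
}.freeze

# Counts the characters belonging to each script and returns the language
# whose script occurs most often, or nil if no character matches at all.
def guess_language(text)
  counts = SCRIPTS.map { |language, pattern| [language, text.scan(pattern).size] }
  language, count = counts.max_by { |_, n| n }
  count.zero? ? nil : language
end
```

For example, guess_language("Привет, мир") returns :russian, guess_language("שלום עולם") returns :hebrew, and guess_language("12345") returns nil. Counting characters rather than checking only the first one lets a message that quotes a few foreign words still be classified by its dominant script.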

Upvotes: 2

shajin

Reputation: 3264

Take a look at this blog post:
http://blog.kenweiner.com/2008/04/server-side-language-detection-with.html
It may be helpful.

Upvotes: 1

Hartator

Reputation: 5145

You can use the language detection API that Google provides as part of Google Translate to guess the language.

See the documentation here: http://code.google.com/apis/language/translate/v1/using_rest_langdetect.html

Upvotes: 5
