Vame

Reputation: 2063

Detect string/user input language

Is there a way to detect whether the user's input is written in Greek characters?

Edit:
To be clearer: I need to recognize the language the user is typing in, not the phone's locale.

For example, if my phone is set to English and my keyboard is set to Russian, Locale.getDefault() returns "en", but I need "ru" at that point.


I do not know whether Android offers this out of the box. One approach might be to inspect the string's character codes and check whether they belong to the English alphabet or to another one. Any pointers on this?

I imagine something like:
if every character belongs to K, then the string is English
(where K is the set of English characters)


Solution:

In the end I used a regular expression to determine whether the string is in English.

String pattern = "^[A-Za-z0-9. ]+$";
if (string.matches(pattern)) {
    // is English
} else {
    // is not English
}

If someone proposes a better solution, I will mark it as the answer.
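
Since the question was originally about Greek input, the same regular-expression idea can be pointed at Greek characters instead of Latin ones. The following is only a minimal sketch, assuming the `\p{InGreek}` block class of java.util.regex (the "Greek" block, U+0370 to U+03FF). The class and method names are hypothetical, and the behaviour of Android's regex engine should be verified separately.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Android API.
final class GreekRegexCheck {
    // Matches a single character from the Unicode "Greek" block.
    private static final Pattern GREEK_CHAR = Pattern.compile("\\p{InGreek}");

    // Returns true if the input contains at least one Greek character.
    static boolean containsGreek(String input) {
        return input != null && GREEK_CHAR.matcher(input).find();
    }
}
```

This only says that Greek characters are present, which is a statement about the script rather than the language. Polytonic Greek lives in a separate "Greek Extended" block and would need its own character class.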

Upvotes: 16

Views: 6851

Answers (3)

Pedantic

Reputation: 5022

As Siva suggests, you can check the user's locale.

In Android, this can be done with Locale.getDefault(). I wouldn't compare the result strictly against a two-letter code, though: in the current Android implementation it is a two-letter language code, an underscore, and a two-letter country code. For example, de_US would be German as spoken in the United States.

This is not the direction the industry is moving in, but it's the best-supported pattern as of Java 6. Java 7, once Android supports it, should support ISO 639 alpha-3 codes, which are more future-proof.
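
To illustrate comparing only the language part of a locale rather than the full language_COUNTRY string, here is a minimal sketch. The class and method names are hypothetical. Note that it reports the locale you pass in (typically the device locale), not the language of the active keyboard, so it does not address the case described in the question.

```java
import java.util.Locale;

// Hypothetical helper, not part of any Android API.
final class LocaleLanguageCheck {
    // Compares only the language component, ignoring the country.
    // Both sides go through Locale so that any legacy code mapping
    // applied by the platform is applied consistently.
    static boolean isLanguage(Locale locale, String languageCode) {
        String expected = new Locale(languageCode).getLanguage();
        return locale.getLanguage().equals(expected);
    }
}
```

A call such as isLanguage(Locale.getDefault(), "de") would then be true for both de_DE and de_US.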

Upvotes: 0

Andrey Starodubtsev

Reputation: 5332

You can use the following method instead of pattern matching:

boolean isEnglish = true;
for ( char c : s.toCharArray() ) {
  if ( Character.UnicodeBlock.of(c) != Character.UnicodeBlock.BASIC_LATIN ) {
    isEnglish = false;
    break;
  }
}
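
The loop above can be generalized to look for a particular script such as Greek or Cyrillic. This is a rough sketch, assuming Character.UnicodeScript is available (Java 7 and later; on Android I believe it requires API level 24 or higher, so treat that as an assumption to check). The class and method names are hypothetical. It iterates over code points rather than chars so that characters outside the Basic Multilingual Plane are handled.

```java
// Hypothetical helper, not part of any Android API.
final class ScriptCheck {
    // Returns true if any code point in the text belongs to the given script.
    static boolean containsScript(String text, Character.UnicodeScript script) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (Character.UnicodeScript.of(codePoint) == script) {
                return true;
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(codePoint);
        }
        return false;
    }
}
```

For example, containsScript(s, Character.UnicodeScript.CYRILLIC) would flag Russian input, but also Bulgarian or Ukrainian. A script check narrows down the alphabet, not the language.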

Upvotes: 12

Siva Charan

Reputation: 18064

Locale.getDefault().getLanguage().equals("el") // "el" is the ISO 639-1 language code for Greek; "GR" is the country code

Another way:

 contains(Charset) 

EDIT:

After some more browsing, I came across CharsetDetector and Character Set Detection.

It has a detect() method, but I am not sure how best to use it.

Upvotes: 2
