Kareem Nour Emam

Reputation: 1054

How to check if the word is Japanese or English?

I want to process English words and Japanese words differently in this method:

if (english) {
    // say english
} else {
    // say not english
}

How can I achieve this in JSP?

Upvotes: 5

Views: 6187

Answers (2)

BalusC

Reputation: 1108742

Japanese characters lie within certain Unicode ranges:

  • U+3040–U+309F: Hiragana
  • U+30A0–U+30FF: Katakana
  • U+4E00–U+9FFF: Kanji (the CJK Unified Ideographs block)

So all you basically need to do is to check if the character's codepoint lies within the known ranges.

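import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Unicode blocks treated as Japanese: the two kana syllabaries plus kanji.
Set<UnicodeBlock> japaneseUnicodeBlocks = new HashSet<>(Arrays.asList(
    UnicodeBlock.HIRAGANA,
    UnicodeBlock.KATAKANA,
    UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
));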

String mixed = "This is a Japanese newspaper headline: ラドクリフ、マラソン五輪代表に1万m出場にも含み";

for (char c : mixed.toCharArray()) {
    if (japaneseUnicodeBlocks.contains(UnicodeBlock.of(c))) {
        System.out.println(c + " is a Japanese character");
    } else {
        System.out.println(c + " is not a Japanese character");
    }
}

It's unclear exactly when you'd like to report a string as Japanese: when it contains a mix of Japanese and Latin (or other!) characters, or only when it consists entirely of Japanese characters. The above example should at least be a good starting point.
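
As a rough sketch of those two interpretations (at least one Japanese character vs. nothing but Japanese characters), something like the following could work. The class name JapaneseText and its method names are made up for illustration, and it assumes Java 8+ for String.codePoints(). It uses the same three Unicode blocks as the example above, so punctuation such as 、 and ASCII digits do not count as Japanese here.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class JapaneseText {

    // Same blocks as in the example above: hiragana, katakana, kanji.
    private static final Set<UnicodeBlock> JAPANESE_BLOCKS = new HashSet<>(Arrays.asList(
            UnicodeBlock.HIRAGANA,
            UnicodeBlock.KATAKANA,
            UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS));

    private JapaneseText() {
    }

    // True if the code point belongs to one of the blocks listed above.
    static boolean isJapanese(int codePoint) {
        return JAPANESE_BLOCKS.contains(UnicodeBlock.of(codePoint));
    }

    // True if at least one character of the text is Japanese.
    static boolean containsJapanese(String text) {
        return text.codePoints().anyMatch(JapaneseText::isJapanese);
    }

    // True if the text is non-empty and every character is Japanese.
    static boolean isAllJapanese(String text) {
        return !text.isEmpty() && text.codePoints().allMatch(JapaneseText::isJapanese);
    }
}
```

The condition in the question could then be written as if (!JapaneseText.containsJapanese(word)) or if (!JapaneseText.isAllJapanese(word)), depending on which of the two meanings you want.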

Please note that all of this is completely unrelated to JSP. JSP is just a web presentation technology that lets you generate HTML/CSS/JS code dynamically. Writing Java code inside JSP files is considered bad practice.

Upvotes: 14

JB Nizet

Reputation: 691765

AFAIK, Japanese words use chars with values of 256 and above, whereas English doesn't use them. You could test whether any of the chars in the word is >= 256.
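
A minimal sketch of that test might look like the following. The class and method names are hypothetical. Note that this only tells you the word contains something outside the first 256 code points, so it would also report Cyrillic, Greek, or Chinese text as "not English".

```java
// Hypothetical helper illustrating the ">= 256" test.
final class CharRangeCheck {

    private CharRangeCheck() {
    }

    // True if any char in the word has a value of 256 or more.
    static boolean hasCharAbove255(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (word.charAt(i) >= 256) {
                return true;
            }
        }
        return false;
    }
}
```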

Upvotes: 0
