Reputation: 1053
Consider the following code:
package org.apache.creadur.rat2.core.ds;

import java.util.Locale;

public class TestUsAsciiLocale {
    public static void main(String[] pArgs) throws Exception {
        final String capitalLetterAe = "\u00c4";
        final String smallLetterAe = "\u00e4";
        if (capitalLetterAe.toLowerCase(Locale.GERMANY).equals(smallLetterAe)) {
            System.out.println("Capital Ae, and small ae are the same (case insensitive) in the german Locale.");
        }
        if (capitalLetterAe.toLowerCase(Locale.US).equals(smallLetterAe)) {
            System.out.println("Capital Ae, and small ae are the same (case insensitive) in the US Locale.");
        }
    }
}
The output is as follows:
Capital Ae, and small ae are the same (case insensitive) in the german Locale.
Capital Ae, and small ae are the same (case insensitive) in the US Locale.
I find this surprising. I'd expect the US Ascii Locale to treat exactly [a-zA-Z] as upper/lowercaseable.
Thanks,
Jochen
Upvotes: 1
Views: 921
Reputation: 20802
Unicode has a default case mapping. Java's String.toLowerCase(Locale) uses the default where there is no "tailoring" data provided for the specified Locale's case mapping of the string. It would be more astonishing if letters had a case change in every locale where they are used but not in locales where they are not used.
BTW—English uses more Unicode letters than A-Z and a-z, anyway. You've got my antennæ up on this one.
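To make the difference between the default mapping and a locale "tailoring" concrete, here is a minimal sketch. The class name CaseTailoringDemo and the helper lower are hypothetical names chosen for illustration; the Turkish dotless i is used as an example of a tailored mapping, and it assumes a standard JDK.

```java
import java.util.Locale;

public class CaseTailoringDemo {
    // Hypothetical helper: lower-cases the input using the rules of the given locale.
    public static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // No tailoring applies to U+00C4, so the default Unicode mapping
        // to U+00E4 is used in both locales.
        System.out.println(lower("\u00c4", Locale.US).equals("\u00e4"));      // true
        System.out.println(lower("\u00c4", Locale.GERMANY).equals("\u00e4")); // true

        // Turkish tailors the mapping of 'I': it becomes dotless i (U+0131),
        // while the default mapping yields a plain 'i'.
        System.out.println(lower("I", Locale.US).equals("i"));     // true
        System.out.println(lower("I", turkish).equals("\u0131"));  // true
    }
}
```

The locale only changes the result where tailoring data exists for it; everything else falls through to the default mapping, which is why Locale.US still lower-cases the a-umlaut.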
Upvotes: 1
Reputation: 6200
The Javadoc of toLowerCase(Locale) is pretty clear about it:
Converts all of the characters in this String to lower case using the rules of the given Locale. Case mapping is based on the Unicode Standard version specified by the Character class. Since case mappings are not always 1:1 char mappings, the resulting String may be a different length than the original String.
This means the Locale parameter is only used to select the case-mapping rules; it does not restrict which characters are converted. Maybe you confused it with a Charset, which is something different.
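A small sketch of the distinction, under the assumption that what was wanted is a conversion limited to [a-zA-Z]. The class LocaleVersusCharsetDemo and the helper asciiToLowerCase are hypothetical and written for this illustration; they are not part of the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class LocaleVersusCharsetDemo {
    // Hypothetical helper: lower-cases only A-Z and leaves every other char untouched.
    public static String asciiToLowerCase(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 'A' && c <= 'Z' ? (char) (c + ('a' - 'A')) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String capitalAe = "\u00c4";

        // A Locale selects case-mapping rules; it does not limit the character set.
        System.out.println(capitalAe.toLowerCase(Locale.US).equals("\u00e4")); // true

        // A Charset governs encoding to bytes: U+00C4 cannot be encoded in
        // US-ASCII, so getBytes substitutes the replacement byte '?'.
        byte[] ascii = capitalAe.getBytes(StandardCharsets.US_ASCII);
        System.out.println(ascii.length == 1 && ascii[0] == '?'); // true

        // The ASCII-only helper leaves the a-umlaut alone and converts B and C.
        System.out.println(asciiToLowerCase("\u00c4BC").equals("\u00c4bc")); // true
    }
}
```

If ASCII-only case folding is really what is needed, a hand-written loop like this is one option; toLowerCase(Locale) will not provide that behaviour for any locale.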
Upvotes: 1
Reputation: 718906
There is a bug in your code:
if (capitalLetterAe.toLowerCase(Locale.GERMANY).equals(smallLetterAe)) {
    System.out.println("Capital Ae, and small ae are the " +
            "same (case insensitive) in the US Locale.");
}
You are using Locale.GERMANY but the message says "in the US Locale".
Upvotes: 1