Reputation: 40
I have a list of strings that I want to compare with "singleArgument". I don't want the comparison to be case-sensitive, so I wrote a method that lower-cases both sides. I also don't want special characters (diacritics) to break the comparison: if I'm looking for "ščž", singleArgument can be "scz".
case noCaseSensitive:
    final String patternSourceILike = (String) singleArgument;
    verdict = buildPattern(patternSourceILike.toLowerCase(Locale.ROOT))
            .matcher(((String) resolvedValue).toLowerCase(Locale.ROOT))
            .matches();
    break;
This is what I have for the case-insensitive comparison.
If I convert the string from UTF-8 to ASCII and then compare, the special characters turn into unknown characters.
Upvotes: 0
Views: 337
Reputation: 53597
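No idea why you'd want to do this, since removing diacritics from letters makes them completely different letters, but you can use java.text.Normalizer for this: normalize the text to its canonical decomposition (NFD), which splits each accented letter into a base letter followed by separate combining marks, then replace every non-word character (\W, which by default means anything outside [a-zA-Z0-9_]) with an empty string to strip out all the (now separate) diacritics. Be aware that \W also removes spaces, punctuation and any remaining non-ASCII letters.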
import java.text.Normalizer;

public class Test {
    public static void main(String[] args) {
        String input = "\u0161\u010D\u017E"; // ščž
        String canonical = Normalizer.normalize(input, Normalizer.Form.NFD);
        String ascii = canonical.replaceAll("\\W", "");
        String output = String.format("%s, %s", input, ascii);
        System.out.println(output); // "ščž, scz"
    }
}
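If the strings being compared can contain spaces or punctuation, a narrower variant may fit the question better. The sketch below is only one way to do it: it removes just the combining marks (`\p{M}`) after NFD decomposition and lower-cases the result, so both sides of the comparison can go through the same step. The class name `AccentInsensitive` and the helpers `fold` and `matchesIgnoringAccents` are made up for illustration; they are not part of the asker's code or of any library.

```java
import java.text.Normalizer;
import java.util.Locale;

public class AccentInsensitive {

    // Hypothetical helper: decompose to NFD, drop combining marks only,
    // then lower-case. Spaces, digits and punctuation are left untouched.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: compare two strings after folding both sides.
    static boolean matchesIgnoringAccents(String a, String b) {
        return fold(a).equals(fold(b));
    }

    public static void main(String[] args) {
        System.out.println(fold("\u0160\u010D\u017E"));                          // scz
        System.out.println(matchesIgnoringAccents("\u0161\u010D\u017E", "SCZ")); // true
        System.out.println(fold("\u0161 \u010D-\u017E"));                        // s c-z
    }
}
```

In the question's switch case, the same folding could presumably be applied to both `singleArgument` and `resolvedValue` in place of the bare `toLowerCase(Locale.ROOT)` calls before they reach `buildPattern`. Letters that have no canonical decomposition, such as "đ", "ł" or "ß", pass through this step unchanged and would need their own mapping.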
Upvotes: 1