Reputation: 2369
In PHP I would use this:

    $text = "Je prends une thé chaud, s'il vous plaît";
    $search = array('é','î','è'); // etc.
    $replace = array('e','i','e'); // etc.
    $text = str_replace($search, $replace, $text);

But the Java String method "replace" doesn't seem to accept arrays as input. Is there a way to do this (without having to resort to a for loop to go through the array)?
Please say if there's a more elegant way than the method I'm attempting.
Upvotes: 2
Views: 2797
Reputation: 625097
You're going to have to do a loop:

    String text = "Je prends une thé chaud, s'il vous plaît";
    Map<Character, String> replace = new HashMap<Character, String>();
    replace.put('é', "e");
    replace.put('î', "i");
    replace.put('è', "e");
    StringBuilder s = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
        char c = text.charAt(i);
        String rep = replace.get(c);
        if (rep == null) {
            s.append(c);
        } else {
            s.append(rep);
        }
    }
    text = s.toString();

Note: Some characters are replaced with multiple characters. In German, for example, u-umlaut is converted to "ue".
Edit: Made it much more efficient.
Upvotes: 1
Reputation: 29576
A really nice way to do it is using the replaceEach() method from the StringUtils class in Apache Commons Lang 2.4.

    String text = "Je prends une thé chaud, s'il vous plaît";
    String[] search = new String[] {"é", "î", "è"};
    String[] replace = new String[] {"e", "i", "e"};
    String newText = StringUtils.replaceEach(text, search, replace);

Results in

    Je prends une the chaud, s'il vous plait
Upvotes: 3
Reputation: 37215
I'm not a Java guy, but I'd recommend a generic solution using the Normalizer class to decompose accented characters and then remove the Unicode "COMBINING" characters.
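For what it's worth, here is a minimal sketch of that idea in Java. It assumes java.text.Normalizer (available since Java 6); the class name AccentStripper and the method name stripAccents are made up for illustration. It decomposes the text to NFD form and then deletes every character in the Unicode "Mark" category with a regular expression.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of any library.
public class AccentStripper {

    // Decomposes accented letters into base letter + combining mark (NFD),
    // then removes all combining marks (Unicode general category M).
    public static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Calling AccentStripper.stripAccents on the sentence from the question should give "Je prends une the chaud, s'il vous plait". Letters that have no canonical decomposition, such as ø or ß, pass through unchanged, so they would still need an explicit mapping.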
Upvotes: 2
Reputation: 346317
There's no method that works identically to the PHP one in the standard API, though there may be something in Apache Commons. You could do it by replacing the characters individually:
    s = s.replace('é', 'e').replace('î', 'i').replace('è', 'e');

A more sophisticated method, which does not require you to enumerate the characters to substitute (and is thus less likely to miss anything) but does require a loop (which will happen internally anyway, whatever method you use), would be to use java.text.Normalizer to separate letters and diacritics and then strip out everything with a character type of Character.NON_SPACING_MARK.
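A rough sketch of the Normalizer-plus-loop variant might look like this. The class and method names (DiacriticFilter, removeDiacritics) are invented for the example. After NFD decomposition, the combining accents are separate code points whose Character.getType is Character.NON_SPACING_MARK, so the loop simply skips those.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of any library.
public class DiacriticFilter {

    public static String removeDiacritics(String input) {
        // NFD splits an accented letter into the base letter followed by
        // one or more combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); ) {
            int cp = decomposed.codePointAt(i);
            // Keep everything except non-spacing (combining) marks.
            if (Character.getType(cp) != Character.NON_SPACING_MARK) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by char keeps characters outside the Basic Multilingual Plane intact.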
Upvotes: 2
Reputation: 56772
You'll need a loop.
An efficient solution would be something like the following:

    // text is the input string to clean up
    Map<Character, Character> map = new HashMap<Character, Character>();
    map.put('é', 'e');
    map.put('î', 'i');
    map.put('è', 'e');
    StringBuilder b = new StringBuilder();
    for (char c : text.toCharArray())
    {
        if (map.containsKey(c))
        {
            b.append(map.get(c));
        }
        else
        {
            b.append(c);
        }
    }
    String result = b.toString();

Of course in a real program you would encapsulate both the construction of the map and the replacement in their respective methods.
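One possible way to do that encapsulation is sketched below. The names CharReplacer, buildMap and replaceChars are hypothetical, and the accented characters are written as Unicode escapes only so the file compiles regardless of source encoding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, not part of any library.
public class CharReplacer {

    // The map is built once and reused for every call.
    private static final Map<Character, Character> REPLACEMENTS = buildMap();

    private static Map<Character, Character> buildMap() {
        Map<Character, Character> map = new HashMap<Character, Character>();
        map.put('\u00e9', 'e'); // e with acute accent
        map.put('\u00ee', 'i'); // i with circumflex
        map.put('\u00e8', 'e'); // e with grave accent
        return map;
    }

    // Replaces every mapped character and copies all others unchanged.
    public static String replaceChars(String text) {
        StringBuilder b = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            Character rep = REPLACEMENTS.get(c);
            b.append(rep != null ? rep : c);
        }
        return b.toString();
    }
}
```

A single get() call replaces the containsKey()/get() pair, which saves one map lookup per character.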
Upvotes: 0
Reputation: 48629
There's no standard method as far as I know, but here's a class that does what you want:
http://www.javalobby.org/java/forums/t19704.html
Upvotes: 0