Fredo

Reputation: 320

Java Split ISO-8859-1 String with "broken vertical bar"

I read an ISO-8859-1 encoded string from a third-party system. I have to split this string on the character ¦ (broken vertical bar). In ISO-8859-1 this character has the value 166. The following code doesn't work, because in Java (UTF-8) the value of ¦ comes out as 65533.

String[] parts = isoString.split("¦");

I am stuck... How can I solve this? Thanks

Upvotes: 1

Views: 761

Answers (2)

errantlinguist

Reputation: 3818

You first need to decode your ISO-8859-1 data properly into a Unicode string. You can then split it using the Unicode string literal you supplied (¦), assuming, of course, that your source file is compiled with an encoding that represents that character correctly.
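To illustrate the point, here is a minimal sketch that assumes the data is available as raw bytes (the class name `DecodeThenSplit`, its methods, and the sample bytes are made up for this example). It shows that decoding byte 166 as UTF-8 produces the replacement character U+FFFD, which is the 65533 mentioned in the question, while decoding as ISO-8859-1 produces ¦ (U+00A6), which can then be split on. Using the escape `\u00a6` instead of a literal ¦ sidesteps any dependence on the source file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DecodeThenSplit {
    // Hypothetical sample: "ab", byte 166 (the ISO-8859-1 code for the broken bar), "cd"
    static final byte[] RAW = {'a', 'b', (byte) 166, 'c', 'd'};

    // Wrong charset: a lone byte 166 is malformed UTF-8, so it decodes to U+FFFD
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Right charset: byte 166 decodes to U+00A6, so the split finds the separator
    static String[] splitIso(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1).split("\u00a6");
    }

    public static void main(String[] args) {
        System.out.println((int) decodeAsUtf8(RAW).charAt(2)); // prints 65533
        System.out.println(Arrays.toString(splitIso(RAW)));    // prints [ab, cd]
    }
}
```

If the string already contains 65533 by the time it reaches the split, the original byte has been lost, so the charset has to be corrected at the point where the bytes are first turned into a String.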

Upvotes: 0

JB Nizet

Reputation: 691705

Working code:

String s = new String(new byte[] {'a', 'b', (byte) 166, 'c', 'd'}, 
                      StandardCharsets.ISO_8859_1);
String[] split = s.split("\u00a6");
System.out.println("split = " + Arrays.toString(split));
// prints split = [ab, cd]
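
If the data arrives as a stream rather than a byte array, the same approach could be applied while reading. This is only a sketch under that assumption: the question does not say how the string is obtained, and the class `IsoStreamSplit`, the method `readAndSplit`, and the in-memory stream standing in for the third-party system are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class IsoStreamSplit {
    // Reads the first line from the stream as ISO-8859-1 and splits it on U+00A6
    static String[] readAndSplit(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1))) {
            String line = reader.readLine();
            return line == null ? new String[0] : line.split("\u00a6");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the external system: the same bytes as in the example above
        byte[] raw = {'a', 'b', (byte) 166, 'c', 'd'};
        System.out.println(Arrays.toString(readAndSplit(new ByteArrayInputStream(raw))));
        // prints [ab, cd]
    }
}
```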

Upvotes: 2
