Reputation: 41
I receive a string like the one below from an IBM mainframe (2-byte graphic characters):
" ;A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z;a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z;0;1;2;3;4;5;6;7;8;9;`;-;=;₩;~;!;@;#;$;%;^;&;*;(;);_;+;|;[;];{;};:;";';,;.;/;<;>;?;";
I want to convert these characters to their 1-byte ASCII equivalents.
How can I replace them using java.util.regex.Matcher or String.replaceAll() in Java?
Target characters:
;A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z;a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z;0;1;2;3;4;5;6;7;8;9;`;-;=;\;~;!;@;#;$;%;^;&;*;(;);_;+;|;[;];{;};:;";';,;.;/;<;>;?;";
Upvotes: 2
Views: 468
Reputation: 75222
This is not (as other responders are saying) a character-encoding issue, but regexes are still the wrong tool. If Java had an equivalent of Perl's tr/// operator, that would be the right tool, but you can hand-code it easily enough:
    public static String convert(String oldString)
    {
        // 2-byte source characters as received, starting with U+3000 IDEOGRAPHIC SPACE.
        String oldChars = "　ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ０１２３４５６７８９｀－＝₩～！＠＃＄％＾＆＊（）＿＋｜［］｛｝：＂＇，．／＜＞？";
        // ASCII replacements, in the same order as oldChars.
        String newChars = " ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789`-=\\~!@#$%^&*()_+|[]{}:\"',./<>?";
        StringBuilder sb = new StringBuilder();
        int len = oldString.length();
        for (int i = 0; i < len; i++)
        {
            char ch = oldString.charAt(i);
            // Look up the character; anything not listed is copied unchanged.
            int pos = oldChars.indexOf(ch);
            sb.append(pos < 0 ? ch : newChars.charAt(pos));
        }
        return sb.toString();
    }
I'm assuming each character in the first string corresponds to the character at the same position in the second string, and that the first character (U+3000, 'IDEOGRAPHIC SPACE') should be converted to an ASCII space (U+0020).
Be sure to save the source file as UTF-8, and include the -encoding UTF-8 option when you compile it (or tell your IDE to do so).
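If keeping non-ASCII literals in the source file is a concern, the same mapping can probably be done with code-point arithmetic instead of a lookup string. The sketch below assumes the 2-byte characters arrive as Unicode fullwidth forms (U+FF01 through U+FF5E), which sit at a fixed offset of 0xFEE0 from ASCII 0x21 through 0x7E. The class name FullwidthToAscii and the method toAscii are hypothetical, and treating both U+20A9 and U+FFE6 as the won sign that maps to a backslash is an assumption based on the target list in the question.

```java
public class FullwidthToAscii {
    // Hypothetical helper: maps fullwidth forms to ASCII by code-point arithmetic.
    public static String toAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '\u3000') {
                // IDEOGRAPHIC SPACE becomes an ASCII space.
                sb.append(' ');
            } else if (ch >= '\uFF01' && ch <= '\uFF5E') {
                // Fullwidth forms are offset by 0xFEE0 from their ASCII counterparts.
                sb.append((char) (ch - 0xFEE0));
            } else if (ch == '\u20A9' || ch == '\uFFE6') {
                // Assumption: the won sign (either form) maps to a backslash.
                sb.append('\\');
            } else {
                // Anything else is copied unchanged.
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Fullwidth "ABC", ideographic space, fullwidth "123", won sign.
        String sample = "\uFF21\uFF22\uFF23\u3000\uFF11\uFF12\uFF13\u20A9";
        System.out.println(toAscii(sample)); // prints ABC 123\
    }
}
```

Because the sample uses Unicode escapes only, this version compiles the same way regardless of the source file encoding.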
Upvotes: 2
Reputation: 10775
I don't think this one is about regex; it's about encoding. It should be possible to read the data into a String using the 2-byte encoding and then write it out with any other encoding. Look here for supported encodings.
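A minimal sketch of that decode-then-re-encode idea follows. The actual mainframe charset is not stated in the question, so UTF-16BE is used here purely as a stand-in; the class name ReencodeSketch and the method reencode are hypothetical. Note that re-encoding changes the bytes, not the character: a fullwidth letter is still a fullwidth letter afterwards, so a character-level mapping may still be needed on top of this.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReencodeSketch {
    // Hypothetical helper: decode bytes with one charset, re-encode with another.
    public static byte[] reencode(byte[] input, Charset from, Charset to) {
        String decoded = new String(input, from);
        return decoded.getBytes(to);
    }

    public static void main(String[] args) {
        // Stand-in input: the UTF-16BE bytes for U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A).
        byte[] received = { (byte) 0xFF, (byte) 0x21 };
        byte[] utf8 = reencode(received, StandardCharsets.UTF_16BE, StandardCharsets.UTF_8);

        String roundTrip = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals("\uFF21")); // prints true: still the fullwidth letter
        System.out.println(roundTrip.equals("A"));      // prints false: not the ASCII letter
    }
}
```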
Upvotes: 0