Reputation: 1
I have this variable String var = class.getSomething that contains this URL: http://www.google.com§°§#[]|£%/^<>
The output that comes out is this: http://www.google.comÂ§Â°Â§#[]|Â£%/^<>
How can I delete that Ã? Thanks!
Upvotes: 0
Views: 2065
Reputation: 31300
The string in var is output using UTF-8, which results in the byte sequence:
c2 a7 c2 b0 c2 a7 23 5b 5d 7c c2 a3 25 2f 5e 3c 3e
Read as ISO-8859-1, these same bytes decode to the characters as you see them:
 § ° §#[]| £%/^<>
Â§Â°Â§#[]|Â£%/^<>
C2 is the ISO-8859-1 encoding of Â.
I'm not sure how the à was produced; its encoding is C3.
We need the full code to learn how this happened, and a description of how the character encoding for text files on your system is configured.
Modifying the variable var is useless.
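The byte-level explanation above can be reproduced with a short sketch. It assumes the output really is UTF-8 bytes being displayed as ISO-8859-1; the class and method names (MojibakeDemo, toHex, misdecode) are made up for illustration, and the special characters are written as Unicode escapes so the source file's own encoding does not interfere.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Format a byte array as lowercase hex pairs separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The tail of the URL from the question: §°§#[]|£%/^<>
        String tail = "\u00a7\u00b0\u00a7#[]|\u00a3%/^<>";
        // Prints the same byte sequence as listed in the answer.
        System.out.println(toHex(tail.getBytes(StandardCharsets.UTF_8)));
        // Each non-ASCII character now appears with an extra leading Â.
        System.out.println(misdecode(tail));
    }
}
```

How the second line looks on screen still depends on the console's own encoding, which is why the full setup matters.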
Upvotes: 0
Reputation: 2439
You could do this; it replaces every occurrence of that character with an empty string, which serves your purpose.
str = str.replace("Â", "");
With that you replace each Â with nothing, giving the result you want.
Upvotes: 1
Reputation: 335
Do you really want to delete only that one character, or all invalid characters? In the latter case you can check each character with CharUtils.isAsciiPrintable(char ch) from Apache Commons Lang.
However, according to RFC 3986 even fewer characters are allowed in URLs (alphanumerics and "-_.+=!*'()~,:;/?$@&%", see Characters allowed in a URL).
In any case, you have to create a new String object, because Strings are immutable in Java. You can do that with replace, as in the answer by Elias MP, or by appending the valid characters one by one to a StringBuilder and converting it to a String.
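The StringBuilder approach could look roughly like this. It is a minimal sketch using only the JDK: the range check (code points 32 to 126) stands in for a printable-ASCII test, and the names AsciiFilter and keepAsciiPrintable are hypothetical. To follow RFC 3986 more closely, the condition would have to be tightened to the allowed character set.

```java
public class AsciiFilter {
    // Build a new String that keeps only printable ASCII characters
    // (code points 32..126); every other character is dropped.
    static String keepAsciiPrintable(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch >= 32 && ch < 127) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The garbled URL from the question, written with Unicode escapes.
        String garbled =
            "http://www.google.com\u00c2\u00a7\u00c2\u00b0\u00c2\u00a7#[]|\u00c2\u00a3%/^<>";
        System.out.println(keepAsciiPrintable(garbled));
    }
}
```

Note that this removes not only the Â but also §, ° and £, since none of them are ASCII.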
Upvotes: 0
Reputation: 1219
Specify the charset as UTF-8 to get rid of the unwanted extra characters:
String var = class.getSomething;
var = new String(var.getBytes(),"UTF-8");
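Calling getBytes() without an argument uses the platform default charset, so the result of the snippet above varies from system to system. A sketch that names both charsets explicitly is shown below. It assumes the string was garbled by decoding UTF-8 bytes as ISO-8859-1; if that assumption does not hold, this conversion will damage the text instead of repairing it. The names MojibakeRepair and repair are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Undo a UTF-8-read-as-ISO-8859-1 mix-up: recover the original bytes
    // with ISO-8859-1, then decode them properly as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Garbled tail from the question, written with Unicode escapes.
        String garbled = "\u00c2\u00a7\u00c2\u00b0\u00c2\u00a7#[]|\u00c2\u00a3%/^<>";
        System.out.println(repair(garbled));
    }
}
```

Unlike deleting the Â characters, this only works when the String object itself holds the garbled text, not when the garbling happens later during output.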
Upvotes: 0