instanceOfObject

Reputation: 2984

Remove hexadecimal characters from URL

Please bear with this trivial question.

I am getting some URLs like "SOME_DOMAIN?q\x3dnintendo+mathe\x26um\x3d1\x26ie\x3dUTF-8\x26tbm\x3dshop\x26cid\x3d8123694338777545283\x26sa\x3dX\x26ei\x3dL8cjUJmHO8L30gGa1ICgCw\x26ved\x3d0CI4BEIIIMAk" which contain some escaped characters.

What is the best way to decode these hexadecimal escape sequences? I have the snippet below, which solves my problem for now but doesn't look like a reliable solution.

    url = url.replace("\\x2F","/");
    url = url.replace("\\x26","&");
    url = url.replace("\\x3d","=");
    url = url.replace("\\x2F","/");
    url = url.replace("\\x2F","/");

I haven't faced this issue yet, but spaces might also appear within the URL. Would URLDecoder.decode solve my problem?

Kindly advise.

Thanks

Upvotes: 2

Views: 3599

Answers (1)

Nishant

Reputation: 55866

This works:

   URLDecoder.decode(yourURLString.replace("\\x", "%"), "UTF-8")

See this in action:

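import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class Demo { // class name is arbitrary
public static void main(String[] args) throws UnsupportedEncodingException {
    String s = "SOME_DOMAIN?q\\x3dnintendo+mathe\\x26um\\x3d1\\x26ie\\x3dUTF-8\\x26tbm\\x3dshop\\x26cid\\x3d8123694338777545283\\x26sa\\x3dX\\x26ei\\x3dL8cjUJmHO8L30gGa1ICgCw\\x26ved\\x3d0CI4BEIIIMAk";
    // Turn each "\x" into "%" so the escapes become percent-encoding, then decode.
    System.out.println(URLDecoder.decode(s.replace("\\x", "%"), "UTF-8"));
}
}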

This prints:

SOME_DOMAIN?q=nintendo mathe&um=1&ie=UTF-8&tbm=shop&cid=8123694338777545283&sa=X&ei=L8cjUJmHO8L30gGa1ICgCw&ved=0CI4BEIIIMAk
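
If the input might contain a stray \x that is not followed by two hex digits, the plain replace would produce a malformed % escape and URLDecoder.decode would throw an IllegalArgumentException. A minimal sketch of a slightly more defensive variant is below. The class name HexEscapeDecoder and the method decodeHexEscapes are made up for illustration, and it assumes every escape has the form \xHH.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class HexEscapeDecoder {

    // Hypothetical helper: rewrites "\xHH" as "%HH", then URL-decodes the result.
    static String decodeHexEscapes(String url) {
        // Only rewrite "\x" when exactly two hex digits follow it,
        // so a stray "\x" is left alone instead of becoming a bad "%" escape.
        String percentEncoded = url.replaceAll("\\\\x([0-9A-Fa-f]{2})", "%$1");
        try {
            return URLDecoder.decode(percentEncoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "SOME_DOMAIN?q\\x3dnintendo+mathe\\x26um\\x3d1";
        System.out.println(decodeHexEscapes(raw));
    }
}
```

As with the one-liner, any + in the input comes out as a space, and a literal % already present in the input is still interpreted as an escape.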

Basically, you need to replace \x with % and decode it using:

 URLDecoder.decode(url, "UTF-8");

See the documentation:

http://docs.oracle.com/javase/1.5.0/docs/api/java/net/URLDecoder.html#decode%28java.lang.String,%20java.lang.String%29
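
One caveat: URLDecoder.decode also turns + into a space and interprets any % already in the string. If the result should still be usable as a URL, a different technique may be safer: decode only the \xHH sequences with a regex and leave everything else untouched. The sketch below is not the approach above but an alternative. The names HexEscapeUnescaper and unescapeHex are hypothetical, and it assumes each \xHH stands for a single character (fine for ASCII such as =, & and /, but not for multi-byte UTF-8 sequences).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HexEscapeUnescaper {

    // Matches a literal backslash, 'x', and exactly two hex digits.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\x([0-9A-Fa-f]{2})");

    // Hypothetical helper: replaces each "\xHH" with the character whose code is HH.
    static String unescapeHex(String input) {
        Matcher m = HEX_ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '$' or '\' being
            // treated as special in the replacement string.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapeHex("SOME_DOMAIN?q\\x3dnintendo+mathe\\x26um\\x3d1"));
    }
}
```

Here the + in the query string is preserved, so the output can be passed on as a URL without re-encoding.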

Upvotes: 5
