Srikar Appalaraju

Reputation: 73588

Python unescape URL

I have a URL in this form: http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show. How can I turn it into a normal URL? I have tried urllib.unquote without much success.

I can always use regular expressions or some simple string replace stuff. But I believe that there is a better way to handle this...

Upvotes: 4

Views: 8705

Answers (3)

Denis Barmenkov

Reputation: 2309

It is rather childish to look for a library function when you can transform the URL yourself. Since there are no visible rules other than "/" being replaced by "\/", you can simply replace it back:

def unescape_this(url):
    return url.replace(r"\\/", "/")

Upvotes: 1

aaronasterling

Reputation: 70984

Have you tried using json.loads from the json module?

>>> import json
>>> json.loads('"http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show"')
'http://en.wikipedia.org/wiki/The_Truman_Show'

The input that I'm showing isn't exactly what you have. I've wrapped it in double quotes to make it a valid JSON string.

When you first get it from the json, how are you decoding it? That's probably where the problem is.
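As a rough sketch of both routes: if the escaped URL came out of a JSON document, decoding the whole document should already give the plain URL, and if only the bare string body is at hand, it can be wrapped in quotes first. The helper name decode_json_fragment and the "url" key are made up for illustration, and the sample data assumes a single backslash before each slash.

```python
import json


def decode_json_fragment(fragment):
    """Decode the body of a JSON string literal (hypothetical helper).

    Assumes `fragment` is the text found between the quotes of a JSON
    string, so it contains no unescaped double quotes.
    """
    return json.loads('"' + fragment + '"')


# Assumed sample data: one backslash before each slash.
raw = r"http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"
url = decode_json_fragment(raw)
print(url)

# Decoding the enclosing document handles the escapes in one step.
# The "url" key is an assumption about the document's shape.
document = r'{"url": "http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"}'
url_from_document = json.loads(document)["url"]
print(url_from_document)
```

If the second form works on the real data, the escaped string was probably extracted from the JSON text by hand rather than through a JSON parser.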

Upvotes: 5

Angus

Reputation: 1350

urllib.unquote is for replacing %xx escape codes in URLs with the characters they represent. It won't be useful for this.

Your "simple string replace stuff" is probably the best solution.
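To illustrate the difference, here is a small sketch. It uses urllib.parse.unquote, which is where Python 3 keeps the function that Python 2 exposes as urllib.unquote, and it assumes the data has a single backslash before each slash.

```python
from urllib.parse import unquote  # Python 3 home of urllib.unquote

# Percent-encoded input: unquote decodes the %xx sequences.
decoded = unquote("The%20Truman%20Show")
print(decoded)

# Backslash-escaped input (assumed: one backslash before each slash).
# There are no %xx sequences, so unquote returns it unchanged.
escaped = r"http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"
unchanged = unquote(escaped)
print(unchanged == escaped)

# A plain string replace removes the backslashes.
fixed = escaped.replace("\\/", "/")
print(fixed)
```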

Upvotes: 11
