python

Reputation: 4521

How to handle links containing spaces in Python

I am trying to extract links from a webpage and then open them in my web browser. My Python program extracts the links successfully, but some of them contain spaces and cannot be opened using the request module.

For example, `example.com/A, B C` will not open using the request module, but if I convert it to `example.com/A,%20B%20C` it will. Is there a simple way in Python to replace the spaces with %20?

`http://example.com/A, B C` ---> `http://example.com/A,%20B%20C`

I want to convert all links that contain spaces into the above format.

Upvotes: 5

Views: 3754

Answers (3)

SANDEEP MACHIRAJU

Reputation: 997

A working Python 3 version of @rofls's answer:

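import urllib.parse as urlparse

def url_fix(s):
    # Split the URL so the path and query string can be quoted separately.
    scheme, netloc, path, qs, anchor = urlparse.urlsplit(s)
    # Keep '/' and '%' so separators and existing escapes are left alone.
    path = urlparse.quote(path, '/%')
    # Spaces in the query string become '+'; ':', '&' and '=' are kept.
    qs = urlparse.quote_plus(qs, ':&=')
    return urlparse.urlunsplit((scheme, netloc, path, qs, anchor))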
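
To check how this behaves, here is a small self-contained sketch (Python 3). The function is repeated so the snippet runs on its own; the second URL with a query string is an assumed example, not one from the question.

```python
from urllib.parse import quote, quote_plus, urlsplit, urlunsplit

def url_fix(url):
    """Percent-encode unsafe characters in the path and query of a URL."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    path = quote(path, safe='/%')           # spaces become %20
    query = quote_plus(query, safe=':&=')   # spaces become +
    return urlunsplit((scheme, netloc, path, query, fragment))

fixed = url_fix('http://example.com/A, B C')
print(fixed)        # http://example.com/A%2C%20B%20C

# Assumed example URL with a query string.
with_query = url_fix('http://example.com/search?q=a b&x=1')
print(with_query)   # http://example.com/search?q=a+b&x=1

# '%' is in the safe set, so fixing an already-fixed URL changes nothing.
print(url_fix(fixed) == fixed)   # True
```

Because `%` is treated as safe, existing escapes are not double-encoded. The trade-off is that a literal percent sign in the input is left unescaped.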

Upvotes: 1

rofls

Reputation: 5115

urlencode actually takes a dictionary, for example:

>>> urllib.urlencode({'test':'param'})
'test=param'

You actually need something like this:

import urllib
import urlparse

def url_fix(s, charset='utf-8'):
    if isinstance(s, unicode):
        s = s.encode(charset, 'ignore')
    scheme, netloc, path, qs, anchor = urlparse.urlsplit(s)
    path = urllib.quote(path, '/%')
    qs = urllib.quote_plus(qs, ':&=')
    return urlparse.urlunsplit((scheme, netloc, path, qs, anchor))

Then:

>>> url_fix('http://example.com/A, B C')
'http://example.com/A%2C%20B%20C'

Taken from "How can I normalize a URL in python".
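
Note that the result above also encodes the comma as `%2C`, whereas the question asks for `A,%20B%20C`. If the comma should stay literal, one possible variation is to add `,` to the safe characters. This is only a sketch in Python 3; the name `url_fix_keep_commas` is made up for illustration, and it only touches the path.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def url_fix_keep_commas(url):
    """Quote the path of a URL but leave '/', '%' and ',' untouched."""
    parts = urlsplit(url)
    # Adding ',' to the safe set keeps commas as-is; spaces still become %20.
    path = quote(parts.path, safe='/%,')
    return urlunsplit(parts._replace(path=path))

result = url_fix_keep_commas('http://example.com/A, B C')
print(result)   # http://example.com/A,%20B%20C
```

Both forms should refer to the same resource on most servers, since `%2C` is simply the percent-encoded comma.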

Upvotes: 5

ergonaut

Reputation: 7057

Use URL encoding:

import urllib
urllib.urlencode(yourstring)
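
One caveat, sketched here for Python 3 (where these functions live in `urllib.parse`): `urlencode` expects a mapping or a sequence of pairs, so passing a plain URL string raises a `TypeError`. For a single string, `quote` is probably the closer fit. The safe set `':/'` below is an assumption that suits simple URLs like the one in the question.

```python
from urllib.parse import quote, urlencode

# urlencode builds a query string from key/value pairs.
params = urlencode({'q': 'A, B C'})
print(params)    # q=A%2C+B+C

# A plain, non-empty string is rejected.
try:
    urlencode('A, B C')
    raised = False
except TypeError:
    raised = True
print(raised)    # True

# quote works on a single string; ':' and '/' are kept so the
# scheme and path separators survive.
encoded = quote('http://example.com/A, B C', safe=':/')
print(encoded)   # http://example.com/A%2C%20B%20C
```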

Upvotes: 1
