user9517536248

Reputation: 355

Edit the string from particular character to a particular character

I'm looking for a way to edit a string. My string looks like http://www.example.com/example:8080. I want to find the third occurrence of "/" and turn the string into http://www.example.com:8080, i.e. remove whatever lies between the third occurrence of "/" and the second occurrence of ":". I tried writing a regular expression and got as far as the first part, which looks like ((.*?/){3}(.*)), but how do I handle the second part and get the final string?

Thanks

EDIT :

The number of times "/" occurs is not a concern. The string could even be http://www.example.com/example/index.php:8080. What I want is for everything from the third occurrence of "/" up to the second occurrence of ":" to be removed, so that the final string is http://www.example.com:8080.

Upvotes: 2

Views: 154

Answers (4)

Stephan

Reputation: 17971

Since you haven't accepted an answer, you might be stuck. Here is an example that does what the other answers explain.

from urllib2 import urlparse

url = 'http://www.example.com/example:8080'
parsedURL = urlparse.urlparse(url)
port = url.split(':')[2] 
fixedURL = parsedURL.scheme + '://' + parsedURL.netloc + ':' + port

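urlparse.urlparse accepts the URL and parses it.
split(':')[2] picks out the port, which is the third piece because the first colon follows the scheme.
The last line rebuilds the URL from the scheme, host and port, cutting out everything after the third / and before the second :.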

This will only work if the port is at the end and the URL contains exactly two colons.

Upvotes: 0

Hai Vu

Reputation: 40688

I have two solutions: the urlparse module (preferred) and a regular expression.

import urlparse
import re

# METHOD 1: use urlparse
# Parse the incorrect URL
incorrect_url = 'http://www.example.com/example:8080'
scheme, netloc, path, params, query, fragment =  urlparse.urlparse(incorrect_url)

# Fix up
path, port = path.split(':')
netloc = netloc + ':' + port
path = ''

# Putting them all together
correct_url = urlparse.urlunparse((scheme, netloc, path, params, query, fragment))
print correct_url


# METHOD 2: use regular expression
scheme, dummy1, dummy2, netloc, path, port=re.split(r'[/:]', incorrect_url)
correct_url = '{}://{}:{}'.format(scheme, netloc, port)
print correct_url

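In general, when dealing with URLs, I prefer the right tool: urlparse. The regular expression solution has the advantage of being shorter, but it can get you into trouble in corner cases. For example, a path with more than one segment, such as /example/index.php:8080, splits into more pieces than the six names being unpacked.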
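For the multi-segment case mentioned in the question's edit, a single substitution may be enough. This is only a sketch, not part of the original answer: the name drop_path_before_port is made up for illustration, and it assumes the port is a run of digits at the very end of the string.

```python
import re

# Group 1: scheme, "//" and host, i.e. everything before the third "/".
# Group 2: the last ":" followed by digits at the end of the string.
# Assumption: the port is numeric and is the last thing in the URL.
PATTERN = re.compile(r'^([^/]*//[^/]*)/.*(:\d+)$')


def drop_path_before_port(url):
    """Remove the text between the third "/" and the trailing ":port".

    Hypothetical helper; strings that do not match are returned unchanged.
    """
    return PATTERN.sub(r'\1\2', url)


print(drop_path_before_port('http://www.example.com/example/index.php:8080'))
```

Because sub() leaves non-matching input alone, a URL that is already in the desired form passes through untouched.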

Upvotes: 0

Bibhas Debnath

Reputation: 14929

Not an exact answer to the question, but it might solve the problem. If the URL always looks like that, you could use the urlparse module, which can be imported from urllib2.

In [9]: from urllib2 import urlparse

In [10]: parsed_url = urlparse.urlparse('http://www.example.com/example:8080')

In [11]: parsed_url
Out[11]: ParseResult(scheme='http', netloc='www.example.com', path='/example:8080', params='', query='', fragment='')

In [12]: parsed_url.path
Out[12]: '/example:8080'

In [13]: parsed_url.path.split(':')
Out[13]: ['/example', '8080']

You can do the rest from here, I think.
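One possible way to finish the job is sketched below. It assumes Python 3, where the same parser lives in urllib.parse, and the function name move_port_to_netloc is invented for this example. It splits the path on its last colon, appends the port to the host, and clears the path.

```python
from urllib.parse import urlsplit


def move_port_to_netloc(url):
    """Move a trailing ":port" from the path onto the host.

    Hypothetical helper, assuming the port follows the last ":" in the path.
    """
    parts = urlsplit(url)
    # rpartition splits on the last ":", so extra path segments are fine.
    _path, sep, port = parts.path.rpartition(':')
    if not sep:
        # No colon in the path: nothing to move, return the URL as given.
        return url
    fixed = parts._replace(netloc=parts.netloc + ':' + port, path='')
    return fixed.geturl()


print(move_port_to_netloc('http://www.example.com/example:8080'))
```

The result of urlsplit is a named tuple, so _replace returns a modified copy and geturl() reassembles it into a string.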

Upvotes: 1

FishFace

Reputation: 421

A simple but ugly way would be:

>>> x = 'http://www.example.com/example:8080'
>>> x.find('/',x.find('/',x.find('/')+1)+1)
22
>>> x.rfind(':')
30
>>> x[:22] + x[30:]
'http://www.example.com:8080'

Note that rfind() searches backwards from the end of the string. Beware that this might go wrong if your URL doesn't look the way you expect it to. The x[:22] and x[30:] parts are examples of slicing, a useful feature of Python. For more information, you could read the tutorial for strings in Python.
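The hard-coded 22 and 30 can be computed instead. The following is a rough sketch of the same find/rfind idea wrapped in functions; find_nth and strip_path_before_port are names made up here, and the guard for unexpected input is an added assumption about what should happen in that case.

```python
def find_nth(text, sub, n):
    """Return the index of the n-th occurrence of sub in text, or -1."""
    index = -1
    for _ in range(n):
        index = text.find(sub, index + 1)
        if index == -1:
            return -1
    return index


def strip_path_before_port(url):
    """Cut everything from the third "/" up to the last ":".

    Hypothetical helper; returns the URL unchanged when there is no
    third "/" or no ":" after it.
    """
    start = find_nth(url, '/', 3)
    end = url.rfind(':')
    if start == -1 or end < start:
        return url
    return url[:start] + url[end:]


print(strip_path_before_port('http://www.example.com/example:8080'))
```

Returning the input unchanged for URLs that do not fit the pattern is one choice; raising an error would be just as reasonable if malformed input should not pass silently.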

Upvotes: 2
