Mladen

Reputation: 25816

Python - Extract important string information

I have the following string

http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342

What is the best way to extract the id value, in this case 32434242423423234?

Regards, Mladjo

Upvotes: 1

Views: 1331

Answers (4)

Hernan

Reputation: 6063

While a regex is the usual way to go, for simple cases I have written a string parser. In a way, it is the (incomplete) reverse of a string formatting operation with PEP 3101. This is very convenient because it means that you do not have to learn another way of specifying the strings.

For example:

>>> 'The answer is {:d}'.format(42)
'The answer is 42'

The parser does the opposite:

>>> Parser('The answer is {:d}')('The answer is 42') 
42

For your case, if you want an int as output:

>>> url = 'http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342'
>>> fmt = 'http://example.com/variable/controller/id{:d}?param1=321&param2=4324342'
>>> Parser(fmt)(url)
32434242423423234

If you want a string:

>>> fmt = 'http://example.com/variable/controller/id{:s}?param1=321&param2=4324342'
>>> Parser(fmt)(url)
'32434242423423234'

If you want to capture more things in a dict:

>>> fmt = 'http://example.com/variable/controller/id{id:s}?param1={param1:s}&param2={param2:s}'
>>> Parser(fmt)(url)
{'id': '32434242423423234', 'param1': '321', 'param2': '4324342'}

or in a tuple:

>>> fmt = 'http://example.com/variable/controller/id{:s}?param1={:s}&param2={:s}'
>>> Parser(fmt)(url)
('32434242423423234', '321', '4324342')

Give it a try, it is hosted here
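The parser itself is not reproduced in this answer. As a rough illustration of the idea only, here is a minimal sketch that turns a format string into a regular expression using the standard library. The name `parse_with_format` and the small converter table are hypothetical, not the API of the `Parser` class above, and only the `d` and `s` format specs are handled.

```python
import re
from string import Formatter

# Hypothetical converter table: format spec -> (regex fragment, conversion).
# Only 'd', 's' and the empty spec are supported in this sketch.
_CONVERTERS = {
    "d": (r"[-+]?\d+", int),
    "s": (r".+?", str),
    "": (r".+?", str),
}


def parse_with_format(fmt, text):
    """Reverse of str.format for simple cases; returns None if text does not match."""
    pattern = []
    fields = []  # (field name, conversion function) in order of appearance
    for literal, name, spec, _conversion in Formatter().parse(fmt):
        pattern.append(re.escape(literal))
        if name is None:  # trailing literal text with no replacement field
            continue
        regex, convert = _CONVERTERS[spec]  # KeyError for unsupported specs
        pattern.append("(" + regex + ")")
        fields.append((name, convert))

    match = re.fullmatch("".join(pattern), text)
    if match is None:
        return None

    values = [convert(raw) for (_, convert), raw in zip(fields, match.groups())]
    names = [name for name, _ in fields]
    if names and all(names):  # every field is named: return a dict
        return dict(zip(names, values))
    return values[0] if len(values) == 1 else tuple(values)


url = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"

as_int = parse_with_format(
    "http://example.com/variable/controller/id{:d}?param1=321&param2=4324342", url
)
as_dict = parse_with_format(
    "http://example.com/variable/controller/id{id:s}?param1={param1:s}&param2={param2:s}", url
)
print(as_int)
print(as_dict)
```

Because the format string is matched against the whole input, any change in the literal parts (a different host or an extra parameter) makes the sketch return `None` rather than a partial result.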

Upvotes: 0

Utku Zihnioglu

Reputation: 4873

>>> import urlparse
>>> res=urlparse.urlparse("http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342")
>>> res.path
'/variable/controller/id32434242423423234'
>>> import posixpath
>>> posixpath.split(res.path)
('/variable/controller', 'id32434242423423234')
>>> directory,filename=posixpath.split(res.path)
>>> filename[2:]
'32434242423423234'

Using urlparse and posixpath might be too much for this case, but I think it is the clean way to do it.
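The session above uses the Python 2 module name. In Python 3 the same functions live in `urllib.parse`; the following is a sketch of the equivalent steps, assuming the id is always the last path segment and always carries the `id` prefix. The helper name `extract_id_from_url` is made up for this example.

```python
import posixpath
from urllib.parse import urlparse, parse_qs


def extract_id_from_url(url):
    """Return the digits after 'id' in the last path segment, or None."""
    parts = urlparse(url)
    last_segment = posixpath.basename(parts.path)  # e.g. 'id32434242423423234'
    if not last_segment.startswith("id"):
        return None
    return last_segment[2:]


url = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"
item_id = extract_id_from_url(url)

# The query string can be decoded with the same module if it is needed too.
params = parse_qs(urlparse(url).query)
print(item_id, params)
```

Note that `parse_qs` returns a list of values for each parameter, since a key may appear more than once in a query string.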

Upvotes: 3

Mark Longair

Reputation: 467321

You could just use a regular expression, e.g.:

import re

s = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"

# Capture the digits that follow "controller/id", up to the "?" of the query string
m = re.search(r'controller/id(\d+)\?', s)
if m:
    print("Found the id:", m.group(1))

If you need the value as a number rather than a string, you can use int(m.group(1)). There are plenty of other ways of doing this that might be more appropriate, depending on the larger goal of your code, but without more context it's hard to say.
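As one possible variation, the search and the `int` conversion can be wrapped in a small function. This sketch assumes the id segment may also appear without a query string after it, so the pattern accepts the end of the string as well as `?`, `/` or `#`. The names `ID_PATTERN` and `extract_id` are illustrative only.

```python
import re

# "/id" followed by digits, then a delimiter or the end of the string.
ID_PATTERN = re.compile(r"/id(\d+)(?:[/?#]|$)")


def extract_id(url):
    """Return the id as an int, or None when the URL has no /id<digits> segment."""
    m = ID_PATTERN.search(url)
    return int(m.group(1)) if m else None


s = "http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342"
print(extract_id(s))
```

Returning `None` for a non-matching URL keeps the "no match" case explicit, in the same spirit as the `if m:` check above.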

Upvotes: 8

kurumi

Reputation: 25599

>>> s
'http://example.com/variable/controller/id32434242423423234?param1=321&param2=4324342'
>>> s.split("id")
['http://example.com/variable/controller/', '32434242423423234?param1=321&param2=4324342']
>>> s.split("id")[-1].split("?")[0]
'32434242423423234'
>>>
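Splitting on the bare string "id" works for this URL, but it picks the wrong piece if "id" also occurs later, for example in a parameter name. A slightly more defensive sketch with the same string-only approach drops the query first and then splits on the last "/id". The function name `id_from_path` and the `valid` parameter in the sample URL are made up for illustration.

```python
def id_from_path(url):
    """Return the text after the last '/id' in the path part, or None."""
    path = url.partition("?")[0]               # drop the query string
    _, marker, tail = path.rpartition("/id")   # split on the last '/id'
    return tail if marker else None


# Sample URL with an extra "id" inside a parameter name ("valid").
tricky = "http://example.com/variable/controller/id32434242423423234?param1=321&valid=1"

naive = tricky.split("id")[-1].split("?")[0]
robust = id_from_path(tricky)
print(repr(naive), repr(robust))
```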

Upvotes: 2
