Michael

Reputation: 349

Find a string using BeautifulSoup in Python

I need to extract the segment that follows "html/" from strings like this one:

generic/html/path/generic/generic/generic

I just need "path", which always comes right after "html/". Is there a way to search for "html/" and get the string that follows it, up to the next "/"?

Upvotes: 0

Views: 161

Answers (3)

RocketDonkey

Reputation: 37249

Another one to add to the mix:

In [1]: s = 'generic/html/path/generic/generic/generic'

In [2]: s.split('html/')[1].split('/')[0]
Out[2]: 'path'
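
Note that the indexing above raises IndexError when the string does not contain "html/" at all. If that can happen in your data, here is a minimal sketch using str.partition that returns None instead. The helper name segment_after is hypothetical, chosen only for illustration:

```python
def segment_after(s, marker="html/"):
    """Return the path segment right after `marker`, or None if it is absent.

    `segment_after` is a hypothetical helper name, not part of any library.
    """
    # partition() splits on the first occurrence and never raises;
    # `found` is an empty string when the marker does not occur.
    _, found, rest = s.partition(marker)
    if not found:
        return None
    # Keep everything up to the next slash (or all of `rest` if there is none).
    return rest.partition("/")[0]


print(segment_after("generic/html/path/generic/generic/generic"))  # path
print(segment_after("generic/other/path"))  # None
```

Whether returning None or raising is the better behavior depends on how the caller wants to treat malformed input.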

Upvotes: 6

tbraun89

Reputation: 2234

You can use regex:

>>> import re
>>> regex = re.compile(".+html/(.+?)/")
>>> r = regex.search("generic/html/path/generic/generic/generic")
>>> r.groups()
(u'path',)

Python DOC: http://docs.python.org/3.3/library/re.html
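
One caveat: the pattern above needs at least one character before "html/" and a "/" after the segment, so it finds nothing when the segment is the last part of the string. A sketch of a more tolerant variant, assuming the segment ends at the next "/" or at the end of the string (the names SEGMENT_RE and find_segment are hypothetical):

```python
import re

# Match "html/" followed by one or more non-slash characters.
SEGMENT_RE = re.compile(r"html/([^/]+)")


def find_segment(s):
    """Return the segment after "html/", or None when there is no match."""
    m = SEGMENT_RE.search(s)
    return m.group(1) if m else None


print(find_segment("generic/html/path/generic"))  # path
print(find_segment("html/path"))  # path
print(find_segment("generic/other"))  # None
```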

Upvotes: 1

loopbackbee

Reputation: 23312

This is just basic string manipulation:

s = "generic/html/path/generic/generic/generic"
i1 = s.index("html/") + len("html/")  # start of the segment after "html/"
i2 = s.index("/", i1)  # position of the next "/"
print(s[i1:i2])  # path

Upvotes: 1
