python extract id value from href source

Question

I've managed to extract the href URIs from the page source using BeautifulSoup. Now I want to extract the UID value from each of them; there are multiple links like the example below:

e.g

Help would be greatly appreciated!

zhangyangyu · Accepted Answer

>>> import re
>>> from bs4 import BeautifulSoup
>>> html
'

'
>>> soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
>>> ass = soup.find_all('a')
>>> r = re.compile(r'uid=(\d+)')  # raw string so \d is not treated as a string escape
>>> uids = []
>>> for a in ass:
...     uids.append(r.search(a['href']).group(1))
... 
>>> uids
['5444974', '5444972', '54444972']
>>>
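
Note that the loop above raises AttributeError if any link has no uid in its href, because r.search() returns None for that link. If the hrefs you already extracted are ordinary URLs with a query string, one alternative is to parse the query instead of using a regex. This is only a sketch: the helper name extract_uids and the sample hrefs are made up for illustration (the example markup from the question is not available), and it assumes the value sits in a query parameter literally named uid.

```python
from urllib.parse import urlparse, parse_qs


def extract_uids(hrefs, param="uid"):
    """Return the first value of `param` from each href that carries it."""
    uids = []
    for href in hrefs:
        # parse_qs maps each query parameter name to a list of its values
        query = parse_qs(urlparse(href).query)
        if param in query:  # skip links that have no such parameter
            uids.append(query[param][0])
    return uids


# Hypothetical hrefs, standing in for the ones pulled out with BeautifulSoup
sample_hrefs = [
    "http://example.com/profile.php?uid=101&tab=info",
    "/profile.php?tab=info&uid=202",
    "/about.html",
]

print(extract_uids(sample_hrefs))  # ['101', '202']
```

The values come back as strings, the same as with the regex version; wrap them in int() if you need numbers.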
