ts178

Reputation: 329

Extracting a tag value in Beautiful Soup

I am parsing an HTML document using Beautiful Soup in Python.

I came across a tag like this:

<div class="_3auQ3N">\u20b9<!-- -->1,990</div>

\u20b9 represents the currency symbol (₹) and 1,990 is the price.

How can I extract these two values into separate strings (or values)?

Upvotes: 0

Views: 110

Answers (2)

radzak

Reputation: 3118

>>> soup = BeautifulSoup('<div class="_3auQ3N">\u20b9<!-- -->1,990</div>', 'lxml')
>>> list(soup.div.strings)
['₹', '1,990']
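
This works because `.strings` yields only the text nodes and skips the `<!-- -->` comment. A minimal sketch of unpacking the two pieces into separate variables, assuming the div always holds exactly two text nodes (symbol first, then price). It uses the built-in `html.parser` so `lxml` is not required, and the variable names are illustrative:

```python
from bs4 import BeautifulSoup

html = '<div class="_3auQ3N">\u20b9<!-- -->1,990</div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", class_="_3auQ3N")

# stripped_strings yields each text node with surrounding whitespace removed;
# the comment between the two text nodes is not included.
# Assumption: exactly two text nodes, otherwise this unpacking raises ValueError.
symbol, price = div.stripped_strings

print(symbol)
print(price)
```

If the number of text nodes can vary, collecting them with `list(div.stripped_strings)` and checking the length first is the safer route.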

Upvotes: 4

Gsk

Reputation: 2945

Once you have extracted your string, you may use a regex:

import re

string = "\u20b9<!-- -->1,990"
# Capture the text on either side of the empty comment marker
a = re.findall(r"^(.*)<!-- -->(.*)", string)
print(a[0][0], a[0][1])  # ₹ 1,990
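
If the raw string is already in hand, the same split can also be done without a regex. A minimal sketch, assuming the separator is always exactly `<!-- -->` and that commas in the price are thousands separators; the variable names are illustrative:

```python
raw = "\u20b9<!-- -->1,990"

# str.partition splits on the first occurrence of the separator and
# returns (text before, separator, text after).
symbol, _, price_text = raw.partition("<!-- -->")

# Assumption: commas are thousands separators, so they can be dropped
# before converting the price to an integer.
price = int(price_text.replace(",", ""))

print(symbol, price_text, price)  # ₹ 1,990 1990
```

This gives both the original price string and a numeric value, which helps if the price needs to be compared or summed later.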

Upvotes: 0
