Subset a bs4.element.ResultSet by string match

Question

I am trying to reference a certain unnamed <p> paragraph in HTML. I have identified the correct <div>, which is stored in res.

What I would like to do now is find the <p> that contains the word "Country" and get its value (here "Germany") as a string. Matching by position unfortunately doesn't work, since the position changes across pages.

res = res.find(name="div", attrs={"class": "item-result-text"}).find_all(name="p")
print(res)

Result:

[<p>Country: Germany</p>, <p>Coly:  Print</p>]

print(type(res))

Result:

<class 'bs4.element.ResultSet'>

bigbounty · Accepted Answer

I would try to do this in a different way, though I'm not sure it is exactly what you are looking for: build a dictionary from the paragraphs and look the value up by key.

In [4]: from bs4 import BeautifulSoup
In [5]: a = "<p>Country: Germany</p>, <p>Coly:  Print</p>"
In [6]: soup = BeautifulSoup(a, "lxml")
In [14]: # map each paragraph's label (text before the first ":") to its value
    ...: values = {k.strip(): v.strip() for k, v in (p.text.split(":", 1) for p in soup.find_all("p"))}

In [15]: values["Country"]
Out[15]: 'Germany'
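
If you would rather filter the ResultSet directly instead of building a dictionary first, something along these lines might work. This is only a sketch: the sample HTML is made up to resemble the printed output in the question, value_for is a hypothetical helper name, and it assumes every paragraph of interest has the form "Label: value". It uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the printed output in the question.
html = """
<div class="item-result-text">
  <p>Country: Germany</p>
  <p>Coly:  Print</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
res = soup.find(name="div", attrs={"class": "item-result-text"}).find_all(name="p")


def value_for(paragraphs, label):
    """Return the text after 'label:' from the first matching paragraph, or None."""
    for p in paragraphs:
        text = p.get_text(strip=True)
        # partition splits on the first ":" only, so values may contain colons
        key, sep, value = text.partition(":")
        if sep and key.strip() == label:
            return value.strip()
    return None


country = value_for(res, "Country")
print(country)
```

Because the lookup goes by label rather than by index, it should not matter where the "Country" paragraph sits on a given page. If no paragraph matches, the helper returns None instead of raising.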
