Reputation: 369
I am trying to reference a certain unnamed <p> paragraph in HTML. I have identified the correct <div>, which is stored in res.
What I would like to do now is: find the <p> whose <span> contains the word "Country" and retrieve the value (here "Germany") as a string. Position matching doesn't work, unfortunately, since the position changes from page to page.
res = res.find(name="div", attrs={"class": "item-result-text"}).find_all(name="p")
print(res)
Result:
[<p><span>Country: </span>Germany</p>, <p><span>Coly: </span> Print</p>]
print(type(res))
Result: <class 'bs4.element.ResultSet'>
Upvotes: 0
Views: 467
Reputation: 17368
I would do this a different way, though I'm not sure it is exactly what you are looking for: build a dict from the text of each <p> by splitting it on the colon, then look up "Country".
In [4]: from bs4 import BeautifulSoup
In [5]: a = "<p><span>Country: </span>Germany</p>, <p><span>Coly: </span> Print</p>"
In [6]: soup = BeautifulSoup(a,"lxml")
In [14]: values = {i.text.strip().split(":")[0].strip():i.text.strip().split(":")[1].strip() for i in soup.find_all("p")}
In [15]: values["Country"]
Out[15]: 'Germany'
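If you would rather match on the <span> itself instead of splitting on ":", something like the following might work. This is only a sketch: value_for_label is a hypothetical helper name, the html string is reconstructed from the printed result in the question, and it assumes the value is always the plain text that follows the <span> inside the same <p>. It uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Sample markup reconstructed from the output shown in the question (assumption).
html = (
    '<div class="item-result-text">'
    "<p><span>Country: </span>Germany</p>"
    "<p><span>Coly: </span> Print</p>"
    "</div>"
)


def value_for_label(paragraphs, label):
    """Return the text after the <span> whose text contains `label`, or None."""
    for p in paragraphs:
        span = p.find("span")
        if span is not None and label in span.get_text():
            # Collect only the text nodes that follow the span inside this <p>.
            text = "".join(s for s in span.next_siblings if isinstance(s, str))
            return text.strip()
    return None


soup = BeautifulSoup(html, "html.parser")
res = soup.find(name="div", attrs={"class": "item-result-text"}).find_all(name="p")

country = value_for_label(res, "Country")
print(country)
```

Because the lookup goes by label rather than by index, it should not matter where the "Country" paragraph sits on a given page, and a missing label gives None instead of raising.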
Upvotes: 1