Reputation: 151
The title is a mouthful but describes what I want. I am parsing an XML document with BeautifulSoup. The format of my XML is as follows:
<properties>
<place>
<house_id>12345</house_id>
<appliances>Fridge, Oven</appliances>
<price>350000</price>
</place>
<place>
<house_id>6789</house_id>
<appliances>Heater, Microwave, Fridge</appliances>
<price>870000</price>
</place>
</properties>
Given a specific value for the house_id tag, I want the text inside the appliances tag that belongs to the same place. For instance, given 12345, I want to return Fridge, Oven. I have not found an easy way to do this yet with BeautifulSoup.
Upvotes: 0
Views: 173
Reputation: 4779
This will work whether the <appliances> tag comes before or after the <house_id> tag.
Use findParent() (spelled find_parent() in current BeautifulSoup) to find the parent of <house_id>, and then find the <appliances> tag in that parent.
Here is the code:
from bs4 import BeautifulSoup
s = """
<properties>
<place>
<house_id>12345</house_id>
<appliances>Fridge, Oven</appliances>
<price>350000</price>
</place>
<place>
<house_id>6789</house_id>
<appliances>Heater, Microwave, Fridge</appliances>
<price>870000</price>
</place>
<place>
<appliances>Oven, Cleaner, Microwave</appliances>
<price>700000</price>
<house_id>1296</house_id>
</place>
</properties>"""
soup = BeautifulSoup(s, 'xml')
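def get_appliance(t, soup):
    # Find the <house_id> tag whose text is exactly t
    # (string= is the current name of the older text= argument)
    h = soup.find('house_id', string=t)
    # find_parent() is the current name of findParent(); search the
    # enclosing <place> so the position of <appliances> does not matter
    appliance = h.find_parent().find('appliances')
    return appliance.text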
print(get_appliance('12345', soup))
print(get_appliance('1296', soup))
Output:
Fridge, Oven
Oven, Cleaner, Microwave
Upvotes: 0
Reputation: 20038
You can use the general sibling combinator (~), which matches an <appliances> tag that follows <house_id> under the same parent:
soup.select_one("house_id:-soup-contains('12345') ~ appliances").text
Or you can find the <house_id> tag containing specific text, and then call find_next() to locate the next <appliances> tag after it:
print(soup.find("house_id", string="12345").find_next("appliances").text)
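One caveat: both approaches look only forward from <house_id>, so they depend on <appliances> coming after it inside the same <place>. If the order can vary, going through the parent is safer. The following is a minimal sketch of that, assuming a reduced two-place sample and the built-in html.parser; the helper name appliances_for is made up for illustration, and returning None for an unknown id is one possible choice, not the only one.

```python
from bs4 import BeautifulSoup

# Reduced sample; in the second <place> the <appliances> tag comes first.
xml = """
<properties>
<place>
<house_id>12345</house_id>
<appliances>Fridge, Oven</appliances>
</place>
<place>
<appliances>Oven, Cleaner, Microwave</appliances>
<house_id>1296</house_id>
</place>
</properties>"""

# html.parser is used here only to avoid the lxml dependency (assumption).
soup = BeautifulSoup(xml, "html.parser")


def appliances_for(house_id, soup):
    """Hypothetical helper: look up appliances via the enclosing <place>."""
    tag = soup.find("house_id", string=house_id)
    if tag is None:
        return None  # no such house_id
    place = tag.find_parent("place")
    found = place.find("appliances") if place is not None else None
    return found.get_text(strip=True) if found is not None else None


# Forward-only search: nothing follows <house_id>1296</house_id>.
forward_only = soup.find("house_id", string="1296").find_next("appliances")

print(forward_only)                  # None
print(appliances_for("1296", soup))  # Oven, Cleaner, Microwave
```

Note that in a longer document find_next() would not return None here but the <appliances> of the following <place>, which is a harder mistake to spot.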
Upvotes: 1
Reputation: 22187
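Based on your input XML, the following XPath expression will select the element you need. Note that BeautifulSoup itself does not support XPath (see "Can we use XPath with BeautifulSoup?"), so you would need to evaluate it with an XPath-capable library such as lxml.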
XPath
/properties/place[house_id="12345"]/appliances
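As a rough sketch of how this could be run without BeautifulSoup, the standard library's xml.etree.ElementTree supports a limited XPath subset that includes this kind of child-text predicate. ElementTree does not accept absolute paths on an element, so the expression is written relative to the <properties> root. The function name appliances_for is hypothetical, and the house id is assumed not to contain quote characters.

```python
import xml.etree.ElementTree as ET

XML = """<properties>
<place>
<house_id>12345</house_id>
<appliances>Fridge, Oven</appliances>
<price>350000</price>
</place>
<place>
<house_id>6789</house_id>
<appliances>Heater, Microwave, Fridge</appliances>
<price>870000</price>
</place>
</properties>"""


def appliances_for(house_id, xml_text):
    """Hypothetical helper: return the appliances text, or None if absent."""
    root = ET.fromstring(xml_text)  # root is the <properties> element
    # Relative form of /properties/place[house_id="..."]/appliances.
    # Assumes house_id contains no quote characters.
    node = root.find(f"./place[house_id='{house_id}']/appliances")
    return node.text if node is not None else None


print(appliances_for("12345", XML))  # Fridge, Oven
print(appliances_for("6789", XML))   # Heater, Microwave, Fridge
```

Because the predicate is on <place>, this works regardless of whether <appliances> comes before or after <house_id>.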
Upvotes: 0