user3611

Reputation: 151

Getting value of a tag inside the same parent tag as a tag with a particular value

The title is a mouthful, but it describes what I want. I am parsing an XML document with BeautifulSoup. The XML is structured as follows:

<properties>
    <place>
        <house_id>12345</house_id>
        <appliances>Fridge, Oven</appliances>
        <price>350000</price>
    </place>
    <place>
        <house_id>6789</house_id>
        <appliances>Heater, Microwave, Fridge</appliances>
        <price>870000</price>
    </place>
</properties>

Given a specific value for the house_id tag, I want the text inside the appliances tag that belongs to the same place. For instance, given 12345, I want to return "Fridge, Oven". I have not yet found an easy way to do this with BeautifulSoup.

Upvotes: 0

Views: 173

Answers (3)

Ram

Reputation: 4779

This works whether the <appliances> tag comes before or after the <house_id> tag.

Use findParent() to get the parent of the matching <house_id>, then find the <appliances> tag inside that parent.

Here is the code:

from bs4 import BeautifulSoup

s = """
<properties>
    <place>
        <house_id>12345</house_id>
        <appliances>Fridge, Oven</appliances>
        <price>350000</price>
    </place>
    <place>
        <house_id>6789</house_id>
        <appliances>Heater, Microwave, Fridge</appliances>
        <price>870000</price>
    </place>
    <place>
        <appliances>Oven, Cleaner, Microwave</appliances>
        <price>700000</price>
        <house_id>1296</house_id>
    </place>
</properties>"""

soup = BeautifulSoup(s, 'xml')


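def get_appliance(t, soup):
    # Find the <house_id> tag whose text is exactly t
    h = soup.find('house_id', text=t)
    # Search the enclosing <place> for <appliances>, wherever it sits
    appliance = h.findParent().find('appliances')
    return appliance.text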


print(get_appliance('12345', soup))
print(get_appliance('1296', soup))

Output:

Fridge, Oven
Oven, Cleaner, Microwave

Upvotes: 0

MendelG

Reputation: 20038

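You can use the general sibling combinator (~), which selects an <appliances> tag that comes after the matching <house_id> under the same parent. Note that :-soup-contains() matches substrings, so '12345' would also match a house_id such as 123456: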

soup.select_one("house_id:-soup-contains('12345') ~ appliances").text

Or you can find the <house_id> tag with the specific text, and then call find_next() to locate the first <appliances> tag that comes after it in the document:

print(soup.find("house_id", text="12345").find_next("appliances").text)
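Both snippets search forward from the <house_id> tag, so they rely on <appliances> coming after it inside the same <place>. As a rough sketch of an order-independent alternative with an exact match on the id, one could scan each <place> instead. The helper name appliances_for and the use of the built-in "html.parser" (chosen here only so the sketch does not need lxml) are assumptions, not part of the original answer:

```python
from bs4 import BeautifulSoup

xml = """
<properties>
    <place>
        <house_id>12345</house_id>
        <appliances>Fridge, Oven</appliances>
    </place>
    <place>
        <appliances>Oven, Cleaner, Microwave</appliances>
        <house_id>1296</house_id>
    </place>
</properties>"""

# Assumption: html.parser is enough for this simple, lowercase markup
soup = BeautifulSoup(xml, "html.parser")


def appliances_for(house_id, soup):
    """Hypothetical helper: return the appliances text for house_id, or None."""
    for place in soup.find_all("place"):
        id_tag = place.find("house_id")
        # Compare the whole text so '123' does not match '12345'
        if id_tag is not None and id_tag.get_text(strip=True) == house_id:
            appliances = place.find("appliances")
            return appliances.get_text(strip=True) if appliances is not None else None
    return None


print(appliances_for("12345", soup))
print(appliances_for("1296", soup))
```

Returning None for an unknown id avoids the AttributeError that chaining .text onto a failed find() would raise.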

Upvotes: 1

Yitzhak Khabinsky

Reputation: 22187

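Based on your input XML, the following XPath expression selects what you need.

Note that BeautifulSoup does not support XPath itself (see "Can we use XPath with BeautifulSoup?"), so the expression has to be evaluated with an XPath-capable library such as lxml.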

XPath

/properties/place[house_id="12345"]/appliances
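As one possible way to try this without extra dependencies, the standard library's xml.etree.ElementTree supports a limited XPath subset that includes this kind of child-text predicate. It does not accept absolute paths on an element, so the sketch below uses the equivalent path relative to the <properties> root. The helper name appliances_for is hypothetical:

```python
import xml.etree.ElementTree as ET

XML = """<properties>
    <place>
        <house_id>12345</house_id>
        <appliances>Fridge, Oven</appliances>
        <price>350000</price>
    </place>
    <place>
        <house_id>6789</house_id>
        <appliances>Heater, Microwave, Fridge</appliances>
        <price>870000</price>
    </place>
</properties>"""

root = ET.fromstring(XML)  # root is the <properties> element


def appliances_for(house_id, root):
    """Hypothetical helper: return the appliances text for house_id, or None."""
    # Relative form of /properties/place[house_id="..."]/appliances
    node = root.find(f"./place[house_id='{house_id}']/appliances")
    return node.text if node is not None else None


print(appliances_for("12345", root))
print(appliances_for("6789", root))
```

The predicate compares the full text of <house_id>, so a partial id does not match. The id is interpolated into the path as-is, which is fine for digits but would need care for values containing quotes.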

Upvotes: 0
