jnshsrs

Reputation: 337

Look for XML tag value and select the parent which contains the requested tag value

I have a lot of XML files that I get via the Google Geo API. I'm interested in the value of the long_name tag in the address_component whose type tag contains the value route.

I can select this value with the following code:

from bs4 import BeautifulSoup as bs

xml_data = '''
<result>
    <formatted_address>Pariser Platz, 10117 Berlin, Deutschland</formatted_address>
    <address_component>
        <long_name>Pariser Platz</long_name>
        <type>route</type>
    </address_component>
    <address_component>
        <long_name>Mitte</long_name>
        <type>sublocality_level_1</type>
    </address_component>
</result>
'''

bsObj = bs(xml_data, 'html.parser')

bsObj.find_all('long_name')[0].string

Unfortunately, the index of the desired XML tag (0 in this example) sometimes changes, so I do not get the route tag every time. I'm looking for a strategy that first finds the type tag with the value route and then selects its previous sibling.

Upvotes: 1

Views: 33

Answers (1)

gtlambert

Reputation: 11961

To select the previous long_name sibling of the first type tag whose text equals route, use:

long_name_tag = bsObj.find('type', string='route').find_previous_sibling('long_name')

Alternatively, to get the text of that long_name tag as a string, use:

long_name_tag_text = bsObj.find('type', string='route').find_previous_sibling('long_name').text
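The sibling lookup above assumes that long_name always comes before type and that a route entry exists. If no matching type tag is found, find returns None and the chained call raises an AttributeError. As a hedged alternative, here is a minimal sketch that goes through the parent address_component instead. It swaps BeautifulSoup for the standard library's xml.etree.ElementTree, and the helper name route_long_name is hypothetical, chosen only for this illustration:

```python
import xml.etree.ElementTree as ET

xml_data = '''
<result>
    <formatted_address>Pariser Platz, 10117 Berlin, Deutschland</formatted_address>
    <address_component>
        <long_name>Pariser Platz</long_name>
        <type>route</type>
    </address_component>
    <address_component>
        <long_name>Mitte</long_name>
        <type>sublocality_level_1</type>
    </address_component>
</result>
'''


def route_long_name(xml_text):
    """Return the long_name of the first address_component that has a
    <type> child with the text 'route', or None if there is none.

    Hypothetical helper for illustration; not part of any library.
    """
    root = ET.fromstring(xml_text.strip())
    # The predicate [type='route'] selects address_component elements
    # that have a child <type> whose text is exactly 'route'.
    component = root.find(".//address_component[type='route']")
    if component is None:
        return None
    # findtext returns the text of the first matching child,
    # regardless of where it sits relative to <type>.
    return component.findtext('long_name')


print(route_long_name(xml_data))
```

Because the match is made on the parent element, this does not depend on the position of the entry in the document or on the order of long_name and type inside it.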

Upvotes: 1
