Extracting attributes from elements in xml

Question

I have a script that extracts the text and attributes from a number of XPaths. Each entry's data is appended to a list as it is extracted (all attributes, followed by the text, before moving on to the next XPath), and then that list is inserted into a data frame. My problem is that not every entry has the same attributes per XPath. For example, all entries have the cat element and at least one corresponding attribute (color), but some cat elements have additional attributes that not all cat elements have. This causes a problem when the row is inserted into the data frame, because its length won't match the number of columns. The order of the attributes stays uniform unless one is missing. I need a way to insert a blank string when an attribute is effectively skipped for not being in an element.

for next_url in next_url_list:
    response = urllib.request.urlopen(next_url)
    bytes_ = response.read()
    root = xml.etree.ElementTree.fromstring(bytes_)

    for count in range(0, len(root.findall("./xpath:entry", namespaces=namespaces))):

        for xpath in xpaths:
            try:
                attribs = list(root.findall(xpath, namespaces=namespaces)[count].attrib.keys())

                for attrib in attribs:
                    award.append(root.findall(xpath, namespaces=namespaces)[count].attrib[attrib])

                award.append(root.findall(xpath, namespaces=namespaces)[count].text)

            except IndexError:
                pass

wwii · Accepted Answer

> I need a way to insert a blank string when an attribute is effectively skipped for not being in an element.

For each element, make a dictionary of the expected attributes with an empty string for each value:
```
{'a1': '', 'a2': '', ...}
```
When you extract an attribute from an element, update the corresponding dictionary value.
Use the dictionary to construct the row; any missing attributes will keep their empty-string values.
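
A minimal sketch of that approach, under some assumptions: the sample XML, the `breed` attribute, and the names `EXPECTED_ATTRS` and `build_row` are hypothetical and only illustrate the idea, and namespaces are left out for brevity. The list of expected attributes per element has to be known up front.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second <cat> lacks the "breed" attribute.
SAMPLE_XML = """<entries>
  <entry><cat color="black" breed="siamese">Tom</cat></entry>
  <entry><cat color="white">Snowball</cat></entry>
</entries>"""

# Assumption: every attribute an element may carry, listed in column order.
EXPECTED_ATTRS = {"cat": ["color", "breed"]}


def build_row(entry):
    """Return one flat row: each expected attribute, then the element text."""
    row = []
    for tag, attr_names in EXPECTED_ATTRS.items():
        element = entry.find(tag)
        # Start with an empty string for every expected attribute.
        values = dict.fromkeys(attr_names, "")
        text = ""
        if element is not None:
            # Overwrite the defaults only for attributes that are present.
            for name in attr_names:
                if name in element.attrib:
                    values[name] = element.attrib[name]
            text = element.text or ""
        # Emit values in the fixed column order, followed by the text.
        row.extend(values[name] for name in attr_names)
        row.append(text)
    return row


root = ET.fromstring(SAMPLE_XML)
rows = [build_row(entry) for entry in root.findall("./entry")]
print(rows)
# [['black', 'siamese', 'Tom'], ['white', '', 'Snowball']]
```

Because every row now has the same length, the rows should line up with a fixed set of data frame columns regardless of which attributes an individual element carries.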
