Reputation: 3396
I have a huge XML file that crashes Notepad++ whenever I try to search it, so I'm trying to parse it in Python 3.8 using ElementTree. I have no experience parsing XML files; I looked at the docs, but I'm still not sure where to start for my particular need.
Specifically, I want to search for a number within the index, as highlighted on line 25 of the screenshot. For example, I want to search for "3" and get back the "rate" from the previous line and the "depth" from the current line. I also want the same information for any other ranges that include "3", like the one on line 21.
There is only one root.tag, which is at the top of the screenshot. The "children" are all called Source and don't have any attributes.
I haven't made it any farther than import xml.etree.ElementTree as ET and root = ET.parse(filein).getroot() in the code.
Upvotes: 0
Views: 125
Reputation: 2469
Try this.
from simplified_scrapy import SimplifiedDoc

doc = SimplifiedDoc()
doc.loadFile('test.xml', lineByline=True)
for src in doc.getIterable('Source'):
    # Plain substring test: this also matches 13, 30, 0.3, and so on
    if '3' in src.outerHtml:
        print(src.IncrementalMfd.rate)
        print(src.Geometry.depth)
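If you would rather stay with the standard library, as the question asks, here is a minimal sketch using ElementTree's iterparse, which reads the file incrementally instead of loading it all at once. The real layout is only visible in the screenshot, so several things here are assumptions: that rate and depth are attributes of IncrementalMfd and Geometry elements, that the index is an attribute holding space-separated numbers, and the sample data itself. The function name find_sources is made up for illustration. It matches "3" only as a whole number, so it will not recognise a range such as "2-4" as containing 3; that would need extra parsing once the real range format is known.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute layout is an assumption.
SAMPLE = b"""<Root>
  <Source>
    <IncrementalMfd rate="0.01"/>
    <Geometry index="1 2 3" depth="5.0"/>
  </Source>
  <Source>
    <IncrementalMfd rate="0.02"/>
    <Geometry index="13 30" depth="7.5"/>
  </Source>
  <Source>
    <IncrementalMfd rate="0.03"/>
    <Geometry index="2-4" depth="9.0"/>
  </Source>
</Root>"""


def find_sources(xml_source, number):
    """Yield (rate, depth) for each <Source> that mentions `number` as a whole token."""
    # Reject matches that are part of a longer number such as 13, 30 or 0.3.
    token = re.compile(r"(?<![\d.])%s(?![\d.])" % re.escape(str(number)))
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "Source":
            continue
        # Collect every text node and attribute value under this <Source>.
        values = []
        for node in elem.iter():
            if node.text:
                values.append(node.text)
            values.extend(node.attrib.values())
        if any(token.search(v) for v in values):
            mfd = elem.find(".//IncrementalMfd")
            geom = elem.find(".//Geometry")
            rate = mfd.get("rate") if mfd is not None else None
            depth = geom.get("depth") if geom is not None else None
            yield rate, depth
        # Drop this element's children and attributes once it has been handled.
        elem.clear()


# A file path works in place of the BytesIO object for a real file.
results = list(find_sources(io.BytesIO(SAMPLE), 3))
print(results)
```

If the file declares an XML namespace, ElementTree reports tags as {uri}Source rather than Source, so the tag comparison and the find paths would need adjusting.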
Upvotes: 0