How to read all xml files one by one and process them one by one

Question

I'm parsing an XML file in a Jupyter notebook, and I use this code to open the file:

from lxml import etree as ET
tree = ET.parse(r'C:\Users\mysky\Documents\Decoded\F804187.xml')
root = tree.getroot()

After that I do some processing with XPath and pandas, for example:

CODE = [ ]
for errors in root.findall('.//Book/Message/Param/Buffer/Data/Field[11]'):
    error_code = errors.find('RawValue').text
    if error_code is not None:
        CODE.append(error_code)

I have about 10 small code blocks like that for extracting my data and at the end I save my dataframe in a CSV file.

I have a lot of XML files, and I want to read all the files in my Decoded directory one by one, process each of them in turn, and append each result to my CSV file.

Thanks!

Qback · Accepted Answer

To list all XML files in your directory you can use, for example, the glob module.

It can look like this:

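import glob
from lxml import etree as ET

# Raw string so the backslashes in the Windows path are not read as escapes
files = glob.glob(r'C:\Users\mysky\Documents\Decoded\*.xml')

for file in files:
    tree = ET.parse(file)
    root = tree.getroot()
    CODE = []  # results for this file only; save or append them before the next file
    for errors in root.findall('.//Book/Message/Param/Buffer/Data/Field[11]'):
        error_code = errors.find('RawValue').text
        if error_code is not None:
            CODE.append(error_code)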
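
The loop above collects the codes for each file but does not yet write them anywhere. Here is a minimal sketch of the "append each result to my CSV file" step, under a few assumptions: the function names `extract_codes` and `append_directory_to_csv` and the column names `file` and `error_code` are made up for illustration, and the standard library's `xml.etree.ElementTree` and `csv` modules stand in for lxml and pandas so the sketch runs without extra packages. The `parse`, `findall` and `find` calls used here are also available in lxml, so swapping the import back should be possible.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)

# XPath taken from the question
FIELD_PATH = './/Book/Message/Param/Buffer/Data/Field[11]'


def extract_codes(xml_path):
    """Return the RawValue texts found in one XML file (hypothetical helper)."""
    root = ET.parse(xml_path).getroot()
    codes = []
    for field in root.findall(FIELD_PATH):
        raw = field.find('RawValue')
        # Skip fields with no RawValue child or with an empty RawValue
        if raw is not None and raw.text is not None:
            codes.append(raw.text)
    return codes


def append_directory_to_csv(directory, csv_path):
    """Process every .xml file in `directory` and append its rows to `csv_path`."""
    # Write the header only when the CSV file does not exist yet
    write_header = not os.path.exists(csv_path)
    with open(csv_path, 'a', newline='') as out:
        writer = csv.writer(out)
        if write_header:
            writer.writerow(['file', 'error_code'])
        pattern = os.path.join(directory, '*.xml')
        for xml_path in sorted(glob.glob(pattern)):
            for code in extract_codes(xml_path):
                writer.writerow([os.path.basename(xml_path), code])
```

You would call it as `append_directory_to_csv(r'C:\Users\mysky\Documents\Decoded', 'result.csv')`, where `result.csv` is a placeholder output name. Opening the CSV in append mode means running it again adds rows rather than overwriting the file, so the other extraction blocks can be added as further columns in the same way.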
