How to convert XML to a DataFrame with pandas

Question

I am new to pandas and have just started learning to code, so any help would be appreciated. I have a simple XML file that I want to convert into a DataFrame with pandas.

I tried the following code, but it does not give me what I need:

    import pandas as pd
    import xml.etree.ElementTree as et

    xtree = et.parse("file.xml")
    xroot = xtree.getroot()
    df_cols = ["product"]
    rows = []
    for node in xroot:
        s_product = node.attrib.get("product")
        rows.append({"name": s_product})
    out_df = pd.DataFrame(rows, columns=df_cols)

balderman · Accepted Answer

The code below flattens the XML hierarchy (region, products, product) into one record per product.

import xml.etree.ElementTree as ET

import pandas as pd

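# The enclosing <root> element name is a placeholder; any single wrapper element works.
xml = '''
<root>
    <region id="122">
        <products count="45453242">
            <product id="1000001">0</product>
            <product id="1000002">5</product>
            <product id="1000003">3</product>
        </products>
    </region>
    <region id="133">
        <products count="45453277">
            <product id="1000004">7</product>
            <product id="1000005">3</product>
            <product id="1000006">1</product>
        </products>
    </region>
</root>
'''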

data = []
root = ET.fromstring(xml)
regions = root.findall('.//region')
for region in regions:
    region_id = region.attrib['id']
    products_count = region.find('./products').attrib['count']
    for product in region.findall('.//product'):
        entry = {'region_id': region_id, 'products_count': products_count,
                 'product_id': product.attrib['id'], 'number': product.text}
        data.append(entry)
df = pd.DataFrame(data)
print(df)

Output:

  region_id products_count product_id number
0       122       45453242    1000001      0
1       122       45453242    1000002      5
2       122       45453242    1000003      3
3       133       45453277    1000004      7
4       133       45453277    1000005      3
5       133       45453277    1000006      1
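
Every column in that output holds strings, because ElementTree returns attribute values and element text as `str`. If numeric columns are needed, the same flattening can be wrapped in a function that casts the result. The following is only a sketch: the function name `regions_to_frame`, the `<root>` wrapper element, and the shortened sample data are assumptions made for illustration, and the cast assumes every attribute and text value is a valid integer.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Shortened sample with the same region/products/product layout as above.
# The <root> wrapper name is an assumption; any single enclosing element works.
SAMPLE_XML = """
<root>
    <region id="122">
        <products count="45453242">
            <product id="1000001">0</product>
            <product id="1000002">5</product>
        </products>
    </region>
    <region id="133">
        <products count="45453277">
            <product id="1000004">7</product>
        </products>
    </region>
</root>
"""


def regions_to_frame(xml_text: str) -> pd.DataFrame:
    """Flatten region/products/product into one integer-typed row per product."""
    root = ET.fromstring(xml_text)
    rows = []
    for region in root.iter("region"):
        products = region.find("products")
        if products is None:
            continue  # skip regions that have no <products> child
        for product in products.findall("product"):
            rows.append({
                "region_id": region.get("id"),
                "products_count": products.get("count"),
                "product_id": product.get("id"),
                "number": product.text,
            })
    columns = ["region_id", "products_count", "product_id", "number"]
    # Attribute values and element text arrive as strings, so cast them here.
    # This raises if any value is missing or is not an integer literal.
    return pd.DataFrame(rows, columns=columns).astype("int64")


df = regions_to_frame(SAMPLE_XML)
print(df.dtypes)
```

To read from a file, as in the question, pass the root from `ET.parse("file.xml").getroot()` into the same loop instead of calling `ET.fromstring`.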
