Liu Yu

Reputation: 391

For loop parse XML with Python

XML file

<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>

import xml.etree.ElementTree as et
import pandas as pd
import numpy as np

This imports all the libraries I need.

tree = et.parse("documents/pythonstore.xml")

The XML file is stored in the documents folder.

root = tree.getroot()
for a in range(3):       # three <product> elements
    for b in range(4):   # four child elements per product
        new = root[a][b].text
        print(new)

This prints the text of every child element in the XML.

df=pd.DataFrame(columns=['name','description','cost','shipping'])

I created a DataFrame to store all the child elements from the XML.

My question: how can I store the values printed by the for loop in this DataFrame?

Could somebody please help me? Thank you so much!

Upvotes: 1

Views: 165

Answers (1)

Rakesh

Reputation: 82805

This might help.

# -*- coding: utf-8 -*-
s = """<?xml version="1.0"?>
 <productListing title="Python Products">
  <product id="1">
   <name>Python Hoodie</name>
   <description>This is a Hoodie</description>
   <cost>$49.99</cost>
   <shipping>$2.00</shipping>
  </product>
  <product id="2">
   <name>Python shirt</name>
   <description>This is a shirt</description>
   <cost>$79.99</cost>
   <shipping>$4.00</shipping>
  </product> 
  <product id="3">
   <name>Python cap</name>
   <description>This is a cap</description>
   <cost>$99.99</cost>
   <shipping>$3.00</shipping>
  </product> 
</productListing>"""

import xml.etree.ElementTree as et
import pandas as pd

root = et.fromstring(s)     # fromstring returns the root element directly
res = []
for a in range(3):          # three <product> elements
    r = []
    for b in range(4):      # name, description, cost, shipping
        r.append(root[a][b].text)
    res.append(r)

print(res)
df = pd.DataFrame(res, columns=['name', 'description', 'cost', 'shipping'])
print(df)

Output:

[['Python Hoodie', 'This is a Hoodie', '$49.99', '$2.00'], ['Python shirt', 'This is a shirt', '$79.99', '$4.00'], ['Python cap', 'This is a cap', '$99.99', '$3.00']]

            name       description    cost shipping
0  Python Hoodie  This is a Hoodie  $49.99    $2.00
1   Python shirt   This is a shirt  $79.99    $4.00
2     Python cap     This is a cap  $99.99    $3.00
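
The loops above hard-code three products with four children each. If the file might grow, a possible variation is to look the elements up by tag name instead of by index. This is only a sketch: the names XML_TEXT, COLUMNS and product_rows are made up for illustration, and the sample is shortened to two products.

```python
import xml.etree.ElementTree as et

# Shortened copy of the XML from the question (two products only).
XML_TEXT = """<?xml version="1.0"?>
<productListing title="Python Products">
  <product id="1">
    <name>Python Hoodie</name>
    <description>This is a Hoodie</description>
    <cost>$49.99</cost>
    <shipping>$2.00</shipping>
  </product>
  <product id="2">
    <name>Python shirt</name>
    <description>This is a shirt</description>
    <cost>$79.99</cost>
    <shipping>$4.00</shipping>
  </product>
</productListing>"""

# Hypothetical helper names; the column list matches the DataFrame columns.
COLUMNS = ['name', 'description', 'cost', 'shipping']


def product_rows(xml_text):
    """Return one list of text values per <product>, in COLUMNS order."""
    root = et.fromstring(xml_text)
    rows = []
    # findall('product') yields every direct <product> child of the root,
    # so the number of products does not need to be known in advance.
    for product in root.findall('product'):
        # findtext(tag) returns the text of the first matching child,
        # or None if the product has no such child.
        rows.append([product.findtext(col) for col in COLUMNS])
    return rows


rows = product_rows(XML_TEXT)
print(rows)
```

The resulting rows list has the same shape as res above, so pd.DataFrame(rows, columns=COLUMNS) should build the same kind of table.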

Upvotes: 1
