Manish Gupta

Reputation: 4666

Python xml to csv

Please read the entire question before marking it as a duplicate.
I have a nested XML file that I want to convert to a CSV file, and I have to write a Python script to do it.

The XML file is:

<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order>
        <LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
        <OrderType>StandardOrder</OrderType>
        <PurchaseDate>2015-05-31T03:58:30Z</PurchaseDate>
        <AmazonOrderId>171-6355256-9594715</AmazonOrderId>
        <LastUpdateDate>2015-06-01T04:18:58Z</LastUpdateDate>
        <ShipServiceLevel>IN Std Domestic</ShipServiceLevel>
        <NumberOfItemsShipped>0</NumberOfItemsShipped>
        <OrderStatus>Canceled</OrderStatus>
        <SalesChannel>Amazon.in</SalesChannel>
        <NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
        <IsPremiumOrder>false</IsPremiumOrder>
        <EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
        <MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
        <FulfillmentChannel>MFN</FulfillmentChannel>
        <IsPrime>false</IsPrime>
        <ShipmentServiceLevelCategory>Standard</ShipmentServiceLevelCategory>
      </Order>
      <Order>
        <LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
        <OrderType>StandardOrder</OrderType>
        <PurchaseDate>2015-05-31T04:50:07Z</PurchaseDate>
        <BuyerEmail>[email protected]</BuyerEmail>
        <AmazonOrderId>403-5551715-2566754</AmazonOrderId>
        <LastUpdateDate>2015-06-01T07:52:49Z</LastUpdateDate>
        <ShipServiceLevel>IN Exp Dom 2</ShipServiceLevel>
        <NumberOfItemsShipped>2</NumberOfItemsShipped>
        <OrderStatus>Shipped</OrderStatus>
        <SalesChannel>Amazon.in</SalesChannel>
        <ShippedByAmazonTFM>false</ShippedByAmazonTFM>
        <LatestDeliveryDate>2015-06-06T18:29:59Z</LatestDeliveryDate>
        <NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
        <BuyerName>Ajit Nair</BuyerName>
        <EarliestDeliveryDate>2015-06-02T18:30:00Z</EarliestDeliveryDate>
        <OrderTotal>
          <CurrencyCode>INR</CurrencyCode>
          <Amount>938.00</Amount>
        </OrderTotal>
        <IsPremiumOrder>false</IsPremiumOrder>
        <EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
        <MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
        <FulfillmentChannel>MFN</FulfillmentChannel>
        <TFMShipmentStatus>Delivered</TFMShipmentStatus>
        <PaymentMethod>Other</PaymentMethod>
        <ShippingAddress>
          <StateOrRegion>MAHARASHTRA</StateOrRegion>
          <City>THANE</City>
          <Phone>9769994355</Phone>
          <CountryCode>IN</CountryCode>
          <PostalCode>400709</PostalCode>
          <Name>Ajit Nair</Name>
          <AddressLine1>C-25 / con-7 / Chandralok CHS</AddressLine1>
          <AddressLine2>Sector-10 ,Koper khairne</AddressLine2>
        </ShippingAddress>
        <IsPrime>false</IsPrime>
        <ShipmentServiceLevelCategory>Expedited</ShipmentServiceLevelCategory>
      </Order>

I tried to get the values in the form of a list, but my code doesn't print anything.

My Code:

from xml.etree import ElementTree

with open('orders.xml', 'rb') as f:
    tree = ElementTree.parse(f)

for node in tree.findall('.//Order'):
    oid = node.attrib.get('SellerOrderId')
    if oid:
        print(oid)

What is wrong with my code?

EDIT: Temporary link to the complete file: Orders.xml

Upvotes: 1

Views: 507

Answers (1)

har07

Reputation: 89285

Your XML has a default namespace, declared here:

<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">

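Note that descendant elements implicitly inherit the default namespace of their ancestor unless they declare otherwise. A bare name such as Order therefore matches nothing, which is why your loop prints nothing. You need to combine the namespace with the local name to form a fully qualified element name, for example by mapping a prefix to the namespace URI: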

ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
for node in tree.findall('.//d:Order', ns):
    oid = node.attrib.get('SellerOrderId')
    if oid:
        print(oid)
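To make the qualified-name point concrete, here is a minimal sketch that compares the three lookups side by side. The inline XML is a trimmed stand-in for orders.xml (an assumption so the snippet runs on its own), and it uses Python 3 print() calls. ElementTree stores each tag internally as {namespace-uri}localname, and the prefix map is only shorthand for that form.

```python
from xml.etree import ElementTree

# Trimmed stand-in for orders.xml; only the structure matters here.
XML = """<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order><AmazonOrderId>171-6355256-9594715</AmazonOrderId></Order>
      <Order><AmazonOrderId>403-5551715-2566754</AmazonOrderId></Order>
    </Orders>
  </ListOrdersResult>
</ListOrdersResponse>"""

NS_URI = 'https://mws.amazonservices.com/Orders/2013-09-01'
ns = {'d': NS_URI}

root = ElementTree.fromstring(XML)

# Bare name: matches nothing, because every element is in the default namespace.
no_ns = root.findall('.//Order')

# Prefix map: 'd:Order' is expanded to '{NS_URI}Order' before matching.
with_prefix = root.findall('.//d:Order', ns)

# The same lookup with the expanded name written out by hand.
with_braces = root.findall('.//{%s}Order' % NS_URI)

print(len(no_ns), len(with_prefix), len(with_braces))  # 0 2 2
print(with_prefix[0].tag)  # {https://mws.amazonservices.com/Orders/2013-09-01}Order
```

The prefix 'd' is arbitrary; only the URI it maps to has to match the one declared in the document.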

According to the full XML file you linked to, SellerOrderId is a child element of Order, not an attribute. In that case you can simply use .//d:Order/d:SellerOrderId to select those elements and print their text, like so:

ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
for node in tree.findall('.//d:Order/d:SellerOrderId', ns):
    print(node.text)

Output:

171-1322776-9700344
171-4214129-7148305
402-8263846-7042737
402-7017923-9474716
402-9691237-2887553
171-4614227-7597903
403-6729903-2119563
402-2184564-2676353
171-4520392-2088330
402-7986969-8827533
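For the CSV conversion itself, the same namespace-aware lookup can feed csv.DictWriter. What follows is only a sketch under some assumptions: Python 3, one CSV row per Order, and nested elements such as OrderTotal flattened into dotted column names like OrderTotal.Amount. The helpers local_name, flatten and orders_to_csv are hypothetical names introduced for illustration, and the inline sample stands in for orders.xml.

```python
import csv
import io
from xml.etree import ElementTree

NS = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}


def local_name(tag):
    """Strip the '{namespace-uri}' part that ElementTree puts in front of tags."""
    return tag.rsplit('}', 1)[-1]


def flatten(elem, prefix=''):
    """Return a flat {column: text} dict for all leaf descendants of elem."""
    row = {}
    for child in elem:
        name = prefix + local_name(child.tag)
        if len(child):  # the child has children of its own, so recurse
            row.update(flatten(child, name + '.'))
        else:
            row[name] = (child.text or '').strip()
    return row


def orders_to_csv(xml_source, csv_file):
    """Write one CSV row per Order; return the number of rows written."""
    tree = ElementTree.parse(xml_source)
    rows = [flatten(order) for order in tree.findall('.//d:Order', NS)]
    # Orders do not all carry the same fields, so the header is the union
    # of every column seen, in order of first appearance.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Trimmed stand-in for orders.xml so the sketch runs on its own.
SAMPLE = b"""<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order>
        <AmazonOrderId>171-6355256-9594715</AmazonOrderId>
        <OrderStatus>Canceled</OrderStatus>
      </Order>
      <Order>
        <AmazonOrderId>403-5551715-2566754</AmazonOrderId>
        <OrderStatus>Shipped</OrderStatus>
        <OrderTotal>
          <CurrencyCode>INR</CurrencyCode>
          <Amount>938.00</Amount>
        </OrderTotal>
      </Order>
    </Orders>
  </ListOrdersResult>
</ListOrdersResponse>"""

out = io.StringIO()
count = orders_to_csv(io.BytesIO(SAMPLE), out)
print(out.getvalue())
```

With a real file, the call would look like orders_to_csv('orders.xml', open('orders.csv', 'w', newline='')), ideally inside a with block. Fields missing from an order are left empty, and if an element name repeats within one order, the later value overwrites the earlier one in this sketch.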

Upvotes: 4
