Manish Gupta

Reputation: 4666

Conditional Search in XML Python

I have an XML file, Orders.xml (excerpt follows):

<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order>
        <LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
        <OrderType>StandardOrder</OrderType>
        <PurchaseDate>2015-05-31T03:58:30Z</PurchaseDate>
        <AmazonOrderId>171-6355256-9594715</AmazonOrderId>
        <LastUpdateDate>2015-06-01T04:18:58Z</LastUpdateDate>
        <ShipServiceLevel>IN Std Domestic</ShipServiceLevel>
        <NumberOfItemsShipped>0</NumberOfItemsShipped>
        <OrderStatus>Canceled</OrderStatus>
        <SalesChannel>Amazon.in</SalesChannel>
        <NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
        <IsPremiumOrder>false</IsPremiumOrder>
        <EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
        <MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
        <FulfillmentChannel>MFN</FulfillmentChannel>
        <IsPrime>false</IsPrime>
        <ShipmentServiceLevelCategory>Standard</ShipmentServiceLevelCategory>
      </Order>
      <Order>
        <LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
        <OrderType>StandardOrder</OrderType>
        <PurchaseDate>2015-05-31T04:50:07Z</PurchaseDate>
        <BuyerEmail>dr7h1rhy6457rng@marketplace.amazon.in</BuyerEmail>
        <AmazonOrderId>403-5551715-2566754</AmazonOrderId>
        <LastUpdateDate>2015-06-01T07:52:49Z</LastUpdateDate>
        <ShipServiceLevel>IN Exp Dom 2</ShipServiceLevel>
        <NumberOfItemsShipped>2</NumberOfItemsShipped>
        <OrderStatus>Shipped</OrderStatus>
        <SalesChannel>Amazon.in</SalesChannel>
        <ShippedByAmazonTFM>false</ShippedByAmazonTFM>
        <LatestDeliveryDate>2015-06-06T18:29:59Z</LatestDeliveryDate>
        <NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
        <BuyerName>Ajit Nair</BuyerName>
        <EarliestDeliveryDate>2015-06-02T18:30:00Z</EarliestDeliveryDate>
        <OrderTotal>
          <CurrencyCode>INR</CurrencyCode>
          <Amount>938.00</Amount>
        </OrderTotal>
        <IsPremiumOrder>false</IsPremiumOrder>
        <EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
        <MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
        <FulfillmentChannel>MFN</FulfillmentChannel>
        <TFMShipmentStatus>Delivered</TFMShipmentStatus>
        <PaymentMethod>Other</PaymentMethod>
        <ShippingAddress>
          <StateOrRegion>MAHARASHTRA</StateOrRegion>
          <City>THANE</City>
          <Phone>9769994355</Phone>
          <CountryCode>IN</CountryCode>
          <PostalCode>400709</PostalCode>
          <Name>Ajit Nair</Name>
          <AddressLine1>C-25 / con-7 / Chandralok CHS</AddressLine1>
          <AddressLine2>Sector-10 ,Koper khairne</AddressLine2>
        </ShippingAddress>
        <IsPrime>false</IsPrime>
        <ShipmentServiceLevelCategory>Expedited</ShipmentServiceLevelCategory>
      </Order>
    </Orders>
    <CreatedBefore>2015-06-08T06:45:22Z</CreatedBefore>
    <NextToken>smN7fNREdZyaJqJYLDm0ZIfVkJJPpovRb7YcCAmB0tlUojdU4H46trQzazHyYVyLqBXdLk4iogxpJASl2BeRezElfc2tdWR3lK0FtvOjoEqUrelVme04kSJ0wMvlylZkWQWPqGlbsnPaEpJjLWtrc27Vm9nDvRdgFtvOhjiqTWA16vKmtecRgbuZIF9n45mtnrZ4AbBdBTdge/hBzh1HtoVw85GaTVKBVfeXMWcfhX25HmwX5IAmwKfxnqm3JqvZ0Rjw/YZARKQMcjl5+H0CsJGesRwkZOQCBLVDshZ93sFo8v4Do3XuodaFg8ZGJDSTcawcthgh/MGM4KOIYd79q7Aq3I/8b9+STDy5JVgPyI0jQ6ftKc7EcAIwpq2cHuPbP+HgZXNbc7qI4HDvHa5YloEDUrIQbaP8qbwRHLZm6VTmGvVwLKwj6AZ0GNanrGO6</NextToken>
  </ListOrdersResult>
  <ResponseMetadata>
    <RequestId>f2b55344-d281-4bd3-b8b3-788be07b7656</RequestId>
  </ResponseMetadata>
</ListOrdersResponse>

I am using a Python script to parse data from the XML file. I want two fields from each order: AmazonOrderId and BuyerName. Some Order elements might not have a BuyerName. When I parse the two fields separately, I get a list of 100 AmazonOrderId values but only 70 BuyerName values, so the lists no longer line up.

I want an empty string instead of nothing, i.e. if an Order element doesn't have a BuyerName, I want to include '' in the list at that position.

My Code:

from xml.etree import ElementTree

with open('orders.xml', 'rb') as f:
    tree = ElementTree.parse(f)
ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}

oid = []  # AmazonOrderId values
bn = []   # BuyerName values; orders without a BuyerName are skipped here

for node in tree.findall('.//d:Order/d:AmazonOrderId', ns):
    oid.append(node.text)
for node in tree.findall('.//d:Order/d:BuyerName', ns):
    bn.append(node.text)

print(oid)
print(bn)

Upvotes: 1

Views: 268

Answers (1)

alecxe

Reputation: 474181

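You can do it in a single loop over the Order elements, using findtext() with an empty string as the default. Because each iteration handles one Order, both lists get exactly one entry per order and stay aligned: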

for node in tree.findall('.//d:Order', namespaces=ns):
    oid.append(node.findtext("d:AmazonOrderId", default='', namespaces=ns))
    bn.append(node.findtext("d:BuyerName", default='', namespaces=ns))
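
For reference, here is a self-contained sketch of the same idea that can be run without the original file. The inline SAMPLE string is a trimmed-down stand-in for Orders.xml (an assumption for illustration, not the full feed), and the oid and bn lists are initialised explicitly:

```python
from xml.etree import ElementTree

# Hypothetical, shortened sample modelled on the excerpt in the question:
# the first Order has no BuyerName, the second one does.
SAMPLE = """<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order>
        <AmazonOrderId>171-6355256-9594715</AmazonOrderId>
        <OrderStatus>Canceled</OrderStatus>
      </Order>
      <Order>
        <AmazonOrderId>403-5551715-2566754</AmazonOrderId>
        <BuyerName>Ajit Nair</BuyerName>
      </Order>
    </Orders>
  </ListOrdersResult>
</ListOrdersResponse>"""

ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
root = ElementTree.fromstring(SAMPLE)

oid = []
bn = []

# One iteration per Order, so both lists always have the same length.
for order in root.findall('.//d:Order', namespaces=ns):
    oid.append(order.findtext('d:AmazonOrderId', default='', namespaces=ns))
    # findtext() returns the default when the child element is missing.
    bn.append(order.findtext('d:BuyerName', default='', namespaces=ns))

# Pair each order id with its (possibly empty) buyer name.
for order_id, buyer in zip(oid, bn):
    print(order_id, repr(buyer))
```

With the real file, replacing ElementTree.fromstring(SAMPLE) with the tree from ElementTree.parse() should behave the same way, since findall() is called with the same path and namespace map.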

Upvotes: 2
