Reputation: 27
This is my XML file:
<?xml version="1.0" ?>
<Items>
<Item>
<ASIN>3570102769</ASIN>
<DetailPageURL>http://www.amazon.de/Inside-IS-Tage-Islamischen-Staat/dp/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3570102769</DetailPageURL>
<ItemLinks>
<ItemLink>
<Description>Add To Wishlist</Description>
<URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3570102769%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
</ItemLink>
<ItemLink>
<Description>Tell A Friend</Description>
<URL>http://www.amazon.de/gp/pdp/taf/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
</ItemLink>
<ItemLink>
<Description>All Customer Reviews</Description>
<URL>http://www.amazon.de/review/product/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
</ItemLink>
<ItemLink>
<Description>All Offers</Description>
<URL>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
</ItemLink>
</ItemLinks>
<ItemAttributes>
<Author>Jürgen Todenhöfer</Author>
<Binding>Gebundene Ausgabe</Binding>
<EAN>9783570102763</EAN>
<EANList>
<EANListElement>9783570102763</EANListElement>
</EANList>
<ISBN>3570102769</ISBN>
<IsEligibleForTradeIn>1</IsEligibleForTradeIn>
<ItemDimensions>
<Height Units="hundredths-inches">874</Height>
<Length Units="hundredths-inches">575</Length>
<Width Units="hundredths-inches">126</Width>
</ItemDimensions>
<Label>C. Bertelsmann Verlag</Label>
<Languages>
<Language>
<Name>Deutsch</Name>
<Type>Published</Type>
</Language>
<Language>
<Name>Deutsch</Name>
<Type>Original</Type>
</Language>
<Language>
<Name>Deutsch</Name>
<Type>Unbekannt</Type>
</Language>
</Languages>
<ListPrice>
<Amount>1799</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 17,99</FormattedPrice>
</ListPrice>
<Manufacturer>C. Bertelsmann Verlag</Manufacturer>
<ManufacturerMinimumAge Units="months">192</ManufacturerMinimumAge>
<NumberOfPages>288</NumberOfPages>
<PackageDimensions>
<Height Units="hundredths-inches">118</Height>
<Length Units="hundredths-inches">567</Length>
<Weight Units="hundredths-pounds">93</Weight>
<Width Units="hundredths-inches">252</Width>
</PackageDimensions>
<PackageQuantity>1</PackageQuantity>
<ProductGroup>Book</ProductGroup>
<ProductTypeName>ABIS_BOOK</ProductTypeName>
<PublicationDate>2015-04-27</PublicationDate>
<Publisher>C. Bertelsmann Verlag</Publisher>
<Studio>C. Bertelsmann Verlag</Studio>
<Title>Inside IS - 10 Tage im 'Islamischen Staat'</Title>
<TradeInValue>
<Amount>930</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 9,30</FormattedPrice>
</TradeInValue>
</ItemAttributes>
<OfferSummary>
<LowestNewPrice>
<Amount>1799</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 17,99</FormattedPrice>
</LowestNewPrice>
<LowestUsedPrice>
<Amount>1390</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 13,90</FormattedPrice>
</LowestUsedPrice>
<LowestCollectiblePrice>
<Amount>4999</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 49,99</FormattedPrice>
</LowestCollectiblePrice>
<TotalNew>56</TotalNew>
<TotalUsed>8</TotalUsed>
<TotalCollectible>1</TotalCollectible>
<TotalRefurbished>0</TotalRefurbished>
</OfferSummary>
<Offers>
<TotalOffers>1</TotalOffers>
<TotalOfferPages>1</TotalOfferPages>
<MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</MoreOffersUrl>
<Offer>
<OfferAttributes>
<Condition>New</Condition>
</OfferAttributes>
<OfferListing>
<OfferListingId>9KHCZj9qtL6ucVBPASfXaryQjU8tWbc0n%2F3F4F7GraOKW6Csji2OxpD93%2FkoHwgIGQctlnrtx4RWIeJULAcvvsFhiopFi08JdsZ%2FeO3u6g0%3D</OfferListingId>
<Price>
<Amount>1799</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 17,99</FormattedPrice>
</Price>
<Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
<AvailabilityAttributes>
<AvailabilityType>now</AvailabilityType>
<MinimumHours>0</MinimumHours>
<MaximumHours>0</MaximumHours>
</AvailabilityAttributes>
<IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
</OfferListing>
</Offer>
</Offers>
</Item>
<Item>
<ASIN>3813506479</ASIN>
<DetailPageURL>http://www.amazon.de/Altes-Land-Roman-D%C3%B6rte-Hansen/dp/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3813506479</DetailPageURL>
<ItemLinks>
<ItemLink>
<Description>Add To Wishlist</Description>
<URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3813506479%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
</ItemLink>
<ItemLink>
<Description>Tell A Friend</Description>
<URL>http://www.amazon.de/gp/pdp/taf/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
</ItemLink>
<ItemLink>
<Description>All Customer Reviews</Description>
<URL>http://www.amazon.de/review/product/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
</ItemLink>
<ItemLink>
<Description>All Offers</Description>
<URL>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
</ItemLink>
</ItemLinks>
<ItemAttributes>
<Author>Dörte Hansen</Author>
<Binding>Gebundene Ausgabe</Binding>
<EAN>9783813506471</EAN>
<EANList>
<EANListElement>9783813506471</EANListElement>
</EANList>
<ISBN>3813506479</ISBN>
<IsEligibleForTradeIn>1</IsEligibleForTradeIn>
<ItemDimensions>
<Height Units="hundredths-inches">870</Height>
<Length Units="hundredths-inches">567</Length>
<Width Units="hundredths-inches">114</Width>
</ItemDimensions>
<Label>Albrecht Knaus Verlag</Label>
<Languages>
<Language>
<Name>Deutsch</Name>
<Type>Published</Type>
</Language>
<Language>
<Name>Deutsch</Name>
<Type>Original</Type>
</Language>
</Languages>
<ListPrice>
<Amount>1999</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 19,99</FormattedPrice>
</ListPrice>
<Manufacturer>Albrecht Knaus Verlag</Manufacturer>
<NumberOfPages>288</NumberOfPages>
<PackageDimensions>
<Height Units="hundredths-inches">118</Height>
<Length Units="hundredths-inches">858</Length>
<Weight Units="hundredths-pounds">101</Weight>
<Width Units="hundredths-inches">559</Width>
</PackageDimensions>
<ProductGroup>Book</ProductGroup>
<ProductTypeName>ABIS_BOOK</ProductTypeName>
<PublicationDate>2015-02-16</PublicationDate>
<Publisher>Albrecht Knaus Verlag</Publisher>
<Studio>Albrecht Knaus Verlag</Studio>
<Title>Altes Land: Roman</Title>
<TradeInValue>
<Amount>965</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 9,65</FormattedPrice>
</TradeInValue>
</ItemAttributes>
<OfferSummary>
<LowestNewPrice>
<Amount>1999</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 19,99</FormattedPrice>
</LowestNewPrice>
<LowestUsedPrice>
<Amount>1599</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 15,99</FormattedPrice>
</LowestUsedPrice>
<TotalNew>72</TotalNew>
<TotalUsed>8</TotalUsed>
<TotalCollectible>0</TotalCollectible>
<TotalRefurbished>0</TotalRefurbished>
</OfferSummary>
<Offers>
<TotalOffers>1</TotalOffers>
<TotalOfferPages>1</TotalOfferPages>
<MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</MoreOffersUrl>
<Offer>
<OfferAttributes>
<Condition>New</Condition>
</OfferAttributes>
<OfferListing>
<OfferListingId>aeRv5KPt26T8S0hLrgV8Bv9UPYABYOMijGRxffbNJXUZSN4XfeeOZZpCZ28EURzmgMLlcYEBSRlMXS%2F8Z0pN1JbYerndME%2B2VK3RosfdQJA%3D</OfferListingId>
<Price>
<Amount>1999</Amount>
<CurrencyCode>EUR</CurrencyCode>
<FormattedPrice>EUR 19,99</FormattedPrice>
</Price>
<Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
<AvailabilityAttributes>
<AvailabilityType>now</AvailabilityType>
<MinimumHours>0</MinimumHours>
<MaximumHours>0</MaximumHours>
</AvailabilityAttributes>
<IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
</OfferListing>
</Offer>
</Offers>
</Item>
</Items>
I want to get any ASIN element. So I tried this:
from lxml import etree
doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
asin = a.xpath('//ASIN/text()')
print asin
What I get is this:
['3570102769', '3813506479']
['3570102769', '3813506479']
But I want this:
['3570102769']
['3813506479']
I don't understand what's the problem here? I think I should iterate over any element and in every element is one item with one asin. Why does it return two times two asin?
Upvotes: 2
Views: 104
Reputation: 2053
When you're searching for a.xpath('//ASIN/text()')
you're searching the complete document tree again. Quoting from the XML Path language specification:
//para
selects all the para descendants of the document root and thus selects all para elements in the same document as the context node
So what you're doing is iterating over the matched Item
nodes and saying "Give me all ASIN nodes in this document please". The context for this (the Item
node) is ignored.
What you should do instead, is directly select the ASIN child-node directly. Keeping to your original implementation this could look like this:
doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
asin = a.xpath('ASIN/text()')
print asin
which gives the output you desire:
['3570102769']
['3813506479']
Alternatively, if you're not certain where in the Item
node your ASIN
appears, you could use .//ASIN/text()
Upvotes: 2