Ernest Devenson

Reputation: 21

How to extract full prices with scrapy?

Hi, I am trying to scrape an e-commerce page, but I can't get the prices.

The page contains these lines:

<span class="price">255,<sup>99</sup>€</span>
<span class="price">255 €</span>

But I can't extract the whole price into one string.

I tried:

response.xpath('//span[@class="price"]/text()').extract()

But it ignores the text in the <sup> tag. What am I doing wrong? Please help.

Upvotes: 1

Views: 1068

Answers (3)

usman Abbasi

Reputation: 107

Check the page's source HTML; the price may also be available in a meta tag with itemprop="price".

I was searching for an answer to this same question all day, and this one worked perfectly for me:

response.xpath('//meta[@itemprop="price"]/@content').get()

Upvotes: 0

bbanzzakji

Reputation: 92

You should use a double slash instead of a single one.

response.xpath('//span[@class="price"]//text()').extract()

This statement returns all text under the specified tag as a list. Note that the returned list may contain useless elements, such as empty strings or carriage-return characters, so you can use a regex if you want to extract only the price information.

response.xpath('//span[@class="price"]//text()').re(r'[\d.,]+')

For the two spans in the question this returns the following list; the currency symbol is dropped:

['255,','99','255']

Finally, if you want to get 255.99 from the page:

''.join(response.xpath('//span[@class="price"][1]//text()').re(r'[\d.,]+')).replace(",",".")

To handle every product on the page, select all the products first and then extract the price inside each one.

Final code:

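products = response.xpath('//*[@class="catalog-table"]//td')
for prod in products:
    # The leading "." keeps the search inside the current product cell;
    # a bare "//" would search the whole document on every iteration.
    price = ''.join(prod.xpath('.//span[@class="price"][1]//text()').re(r'[\d.,]+')).replace(",", ".")
    print(price)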
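
The join-and-replace step can also be pulled out into a small helper. The following is only a sketch using the standard library: the name parse_price is hypothetical, and it assumes a decimal comma with no thousands separators, as in the question's markup. It takes the list of text fragments that //text() would return for one price span.

```python
import re
from decimal import Decimal

# Same pattern as in the answer: runs of digits, dots and commas.
NUMBER_RE = re.compile(r"[\d.,]+")


def parse_price(fragments):
    """Join the text fragments of one price element into a Decimal.

    Hypothetical helper. Returns None when no digits are found.
    """
    digits = "".join(
        match
        for fragment in fragments
        for match in NUMBER_RE.findall(fragment)
    )
    if not digits:
        return None
    # Assumption: "," is the decimal separator (e.g. "255,99").
    return Decimal(digits.replace(",", "."))


print(parse_price(["255,", "99", "€"]))  # 255.99
print(parse_price(["255 €"]))            # 255
```

Inside the loop it could be called as parse_price(prod.xpath('.//span[@class="price"][1]//text()').extract()).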

Upvotes: 1

JBJ

Reputation: 1109

You need to add another slash before text(), so that it selects the text nodes at ALL depths, not just the direct children.

    response.xpath('//span[@class="price"]//text()').extract()

Text='255,'
Text='99'
Text='€'
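
To see why the single slash misses the 99, here is a rough illustration using Python's standard-library ElementTree as a stand-in for Scrapy's selectors (this is not Scrapy's API, and the helper names direct_text and all_text are made up for the example). It contrasts the text nodes that are direct children of the span with the text nodes at any depth.

```python
import xml.etree.ElementTree as ET

span = ET.fromstring('<span class="price">255,<sup>99</sup>€</span>')


def direct_text(element):
    """Text nodes that are immediate children, roughly like /text()."""
    parts = [element.text] + [child.tail for child in element]
    return [part for part in parts if part]


def all_text(element):
    """Text nodes at any depth, roughly like //text()."""
    return list(element.itertext())


print(direct_text(span))  # ['255,', '€']
print(all_text(span))     # ['255,', '99', '€']
```

The 99 belongs to the nested <sup> element, so only the any-depth version picks it up.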

Upvotes: 1
