Marvin

Reputation: 429

Scrapy scrape Apple site

I have tried to get the models and prices from the link below, but with no luck. Could you please tell me what is wrong and how I can scrape these two parts?

https://www.apple.com/shop/buy-ipad/ipad-pro

Here is what I have tried. The text I am after is: From $799

To get the word "From":

response.xpath('//span[@class="as-price-currentprice"]/text()').extract()

[]


To get the price itself:

response.xpath('//span[@class="nowrap"]/text()').extract()

[u'1\u2011800\u2011MY\u2011APPLE.', u'1\u2011800\u2011MY\u2011APPLE.', u'Visit an ', u'call ', u', or ']


Model

By the way, I am not able to get the model names at all:

11-inch iPad Pro

12.9-inch iPad Pro

Upvotes: 1

Views: 192

Answers (2)

Guillaume

Reputation: 1879

Look at the raw HTML that is returned by the website (right-click > View Source).


As you can see, the page is just a template that is dynamically rendered by some JavaScript code.

Your browser's developer tools show the page after the JavaScript has already run, so what you see there is the final rendered HTML. Scrapy does not execute JavaScript, so make sure you write your selectors against the raw HTML.
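One quick way to check this is to list the class names that actually occur in the raw HTML and see whether the class your XPath targets is among them. The following is only a minimal sketch using the standard library; `classes_in_raw_html` is a hypothetical helper name, and `raw_html` is a made-up stand-in for what you would get from `response.text` in the Scrapy shell.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_in_raw_html(html):
    """Return the set of class names found in the given HTML string."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


# Made-up sample; in the Scrapy shell you would pass response.text instead
raw_html = '<div class="pd-billboard-price">From $799</div><script>render()</script>'
found = classes_in_raw_html(raw_html)

print("as-price-currentprice" in found)  # False: class is absent from the raw HTML
print("pd-billboard-price" in found)     # True: class is present in the raw HTML
```

If the class you are selecting on is not in that set, the selector cannot match anything in what Scrapy downloaded, no matter what the developer tools show.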

Upvotes: 0

stasdeep

Reputation: 3146

This is how you can do that:

headers = response.css('.pd-billboard-subheader::text').getall()
prices = response.css('.pd-billboard-price::text').getall()

result = []
for header, price in zip(headers, prices):
    header_cleaned = header.replace('\xa0', ' ')
    price_cleaned = price.replace('\n', '').replace('        ', '').strip()
    result.append([header_cleaned, price_cleaned])

After this, result will be equal to something like:

[['12.9-inch iPad Pro', 'From $999'],
 ['11-inch iPad Pro', 'From $799'],
 ['10.5-inch iPad Pro', 'From $649'],
 ['iPad', 'From $329'],
 ['iPad mini 4', 'From $399']]
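As a possible simplification of the cleaning step, splitting on whitespace and re-joining handles newlines, indentation, and non-breaking spaces in one go. This is a sketch assuming Python 3, where `str.split()` with no arguments treats `\xa0` as whitespace. `clean_text` is a hypothetical helper, and the sample lists are made-up values shaped like the selector output above, not real responses.

```python
def clean_text(text):
    """Collapse every run of whitespace (including non-breaking spaces) into one space."""
    return " ".join(text.split())


# Made-up sample values; in a spider these would come from the .getall() calls
headers = ["12.9-inch\xa0iPad Pro", "11-inch\xa0iPad Pro"]
prices = ["\n        From $999\n    ", "\n        From $799\n    "]

result = [[clean_text(h), clean_text(p)] for h, p in zip(headers, prices)]
print(result)
# [['12.9-inch iPad Pro', 'From $999'], ['11-inch iPad Pro', 'From $799']]
```

This avoids depending on the exact number of spaces in the page's indentation, which may change.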

Upvotes: 2
