Why do I get an empty Scrapy response?

Question

I started the Scrapy shell:

scrapy shell -s USER_AGENT='Mozilla/5.0' https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798

Then I checked the response:

In [5]: response
Out[5]: <405 https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798>

After inspecting the page element and copying its XPath, I ran:

In [6]: response.xpath('//*[@id="ad-title"]').extract()
Out[6]: []

The outerHTML copied from the browser:

Brand New Modern Studio Flat £1056pcm | All Bills Included | In Willesden Green area

Why does the XPath return an empty list?

Guillaume · Accepted Answer

Try setting the user agent to something more realistic, such as Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0.

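Some websites run basic validation on the user agent and, if it looks suspicious, reject the request or redirect you to a special page. The 405 status in your shell output suggests that is what happened here, which would explain why the XPath matches nothing.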
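One way to notice this kind of block early is to look at the status code before running any selector. The following is only a sketch: the helper name extract_ad_title is hypothetical, and it assumes an object shaped like the Scrapy shell's response (with .status, .url and .xpath()).

```python
def extract_ad_title(response):
    """Return the matched ad-title nodes, or None if the page was not served.

    Hypothetical helper: `response` is assumed to behave like the object
    available in `scrapy shell` (attributes .status and .url, method .xpath()).
    """
    # Any status other than 200 (such as the 405 in the question) means the
    # body is probably an error or block page, not the ad itself.
    if response.status != 200:
        print(f"Unexpected status {response.status} for {response.url}")
        return None
    # Same XPath as in the question.
    return response.xpath('//*[@id="ad-title"]').extract()
```

Pasted into the shell, extract_ad_title(response) would print the status and return None for the 405 response above, instead of a silent empty list. In a spider the check matters less, because Scrapy's HttpError middleware by default drops non-2xx responses before they reach the callback.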

With a realistic user agent, the same XPath finds the title:

scrapy shell -s USER_AGENT='Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0' https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798
>>> response.xpath('//*[@id="ad-title"]').extract()
['Brand New Modern Studio Flat £1056pcm | All Bills Included | In Willesden Green area']
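If the setting works in the shell, it can also be made the project default so that spiders send the same header. This is a minimal sketch, assuming a standard Scrapy project with a settings.py file; the Firefox 63 string is simply the one from this answer and may need updating if the site starts rejecting it.

```python
# settings.py (assumed: the settings module of a standard Scrapy project)

# Default User-Agent header for every request made by the project's spiders.
# The value is the example string from this answer, not a required one.
USER_AGENT = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) "
    "Gecko/20100101 Firefox/63.0"
)
```

A -s USER_AGENT=... option on the command line still overrides this value for a single run.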
