Response is different from start url

Question

I'm practicing XPath in the Scrapy shell. The webpage I'm working on is

http://bxt.harbin.gov.cn/more.php?nameid=1&frameid=1&colorid=1

I want to scrape the data in the table. But after I type

scrapy shell http://bxt.harbin.gov.cn/more.php?nameid=1&frameid=1&colorid=1

in Windows cmd, I find that under "Available Scrapy objects," there is

[s]   response   <200 http://bxt.harbin.gov.cn/more.php?nameid=0>

The response URL is different from the URL I want to work on, and the page it points to does not have the data I aim to extract. Any idea why this is the case? Thanks!

alecxe · Accepted Answer

The desired table is located inside an iframe, so it is not part of the page you requested. Go to the URL the iframe is loaded from instead:

$ scrapy shell http://bxt.harbin.gov.cn/hrb_bzbxt/list_hf.php
In [1]: for row in response.xpath("//table[3]//tr[position() > 1]"):
   ...:     print(row.xpath(".//td[1]/text()").extract()[0])
   ...:
551626
551617
551616
551614
551612
551611
...
551521

In the demo above, the contents of the first cell of each table row are printed; position() > 1 skips the first row of the table.
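
If you would rather discover the iframe URL from the outer page than copy it by hand, something along these lines may help. It is a minimal sketch using only the standard library: the IframeSrcCollector class, the iframe_urls helper, and the SAMPLE_HTML markup are all made up for illustration, and the real page's markup may well differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcCollector(HTMLParser):
    """Collect the src attribute of every <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_urls(page_url, html):
    """Return the absolute URL of each iframe found in html."""
    collector = IframeSrcCollector()
    collector.feed(html)
    # Relative src values are resolved against the URL of the outer page.
    return [urljoin(page_url, src) for src in collector.sources]


# Hypothetical markup standing in for the outer page (an assumption,
# not the real page source).
SAMPLE_HTML = """
<html><body>
  <table><tr><td>navigation only, no data here</td></tr></table>
  <iframe src="hrb_bzbxt/list_hf.php" width="100%"></iframe>
</body></html>
"""

urls = iframe_urls("http://bxt.harbin.gov.cn/more.php?nameid=0", SAMPLE_HTML)
print(urls)
```

Inside the Scrapy shell itself the same idea is shorter: response.xpath("//iframe/@src").extract() lists the iframe sources, and fetch(response.urljoin(src)) loads one of them so that response points at the page that actually holds the table.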
