Joseph Zhou

Reputation: 315

Response is different from start url

I'm practicing XPath in the Scrapy shell. The webpage I'm working on is

http://bxt.harbin.gov.cn/more.php?nameid=1&frameid=1&colorid=1

I want to scrape the data in the table. But after I type

scrapy shell http://bxt.harbin.gov.cn/more.php?nameid=1&frameid=1&colorid=1

in Windows cmd, I find that under "Available Scrapy objects," there is

[s]   response   <200 http://bxt.harbin.gov.cn/more.php?nameid=0>

The response URL is different from the URL I want to work on, and the page at that URL does not have the data I want to extract. Any idea why this happens? Thanks!

Upvotes: 1

Views: 63

Answers (1)

alecxe

Reputation: 473903

The table you want is inside an iframe, so it is not part of the outer page's HTML. Open the URL that the iframe loads instead:

$ scrapy shell http://bxt.harbin.gov.cn/hrb_bzbxt/list_hf.php
In [1]: for row in response.xpath("//table[3]//tr[position() > 1]"):
   ...:     print(row.xpath(".//td[1]/text()").extract()[0])
   ...:
551626
551617
551616
551614
551612
551611
...
551521

In the demo above, the text of the first cell of each table row (skipping the header row) is printed.
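
If you would rather find the iframe's address programmatically than read it from the page source by hand, here is a rough sketch using only the Python standard library. The sample HTML, the `IframeCollector` class and the `iframe_urls` helper are made up for illustration; the real page's markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeCollector(HTMLParser):
    """Collect the src attribute of every <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_urls(page_url, html):
    """Return absolute URLs for all iframes found in html."""
    parser = IframeCollector()
    parser.feed(html)
    # iframe src values are often relative, so resolve them against the page URL.
    return [urljoin(page_url, src) for src in parser.sources]


# Hypothetical markup standing in for the outer page (an assumption).
SAMPLE_HTML = """
<html><body>
  <h1>List</h1>
  <iframe src="hrb_bzbxt/list_hf.php" width="100%"></iframe>
</body></html>
"""

urls = iframe_urls("http://bxt.harbin.gov.cn/more.php?nameid=0", SAMPLE_HTML)
print(urls)  # ['http://bxt.harbin.gov.cn/hrb_bzbxt/list_hf.php']
```

Inside the Scrapy shell, the same idea should be roughly response.xpath("//iframe/@src").extract() to list the iframe sources, then response.urljoin(...) and fetch(...) on the one you need.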

Upvotes: 1
