Reputation: 71
I'm having trouble scraping data with my spider script, even though similar code works when I test it in the Scrapy shell. The only difference is that in my script I am splitting the selector into two parts.
Here is the selector that works in the shell:
(//tr[position()>2]/td[position()=2])[1]
And here is the selector in the script:
def parse_forsale(self, response):
    listingdata = response.xpath(".//tr[position()>2]")  # < PART 1 OF SELECTOR
    for data in listingdata:
        A = data.xpath(".//td[position()=2][1]").get()  # < PART 2 OF SELECTOR
        B = data.xpath(".//td[position()=2][2]").get()
        C = data.xpath(".//td[position()=2][3]").get()
        D = data.xpath(".//td[position()=2][4]").get()
        E = data.xpath(".//td[position()=2][5]").get()
        F = data.xpath(".//td[position()=2][6]").get()
        G = data.xpath(".//td[position()=2][7]").get()
        H = data.xpath(".//td[position()=2][8]").get()
My educated guess is that this fails because in the shell I can wrap the expression in parentheses, opening before the "//" and closing right before "[1]", which makes the selector work properly. In the script I can't do this, because I'm splitting the selector into two components.
Any ideas on how I can get around this?
Thanks in advance for any help!
Upvotes: 0
Views: 155
Reputation: 10666
First of all, there is a shorter way to write td[position()=2]:
td[2]
Next, what did you mean by this XPath:
.//td[position()=2][1]
Select the td at the second position (position()=2) that is at the first position ([1]) at the same time?
UPDATE: If you just want to process all rows after the second one and read td[2] from each:
//tr[position() > 2]/td[2]
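To see the row-by-row version of that idea in isolation, here is a minimal sketch. It deliberately uses the standard library's xml.etree.ElementTree instead of Scrapy so it runs without dependencies; ElementTree only supports a subset of XPath (no position() comparisons), so the "rows after the second" part is done with a Python slice. The SAMPLE markup and the second_cells helper are made up for illustration. In a spider, the presumed equivalent would be looping over response.xpath("//tr[position() > 2]") and calling row.xpath("./td[2]/text()").get() on each row.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample table: two header rows followed by two data rows.
SAMPLE = """
<table>
  <tr><td>h1</td><td>h2</td></tr>
  <tr><td>h3</td><td>h4</td></tr>
  <tr><td>a1</td><td>a2</td><td>a3</td></tr>
  <tr><td>b1</td><td>b2</td><td>b3</td></tr>
</table>
"""


def second_cells(markup):
    """Return the text of the second <td> of every row after the first two."""
    root = ET.fromstring(markup)
    # For this flat table, the slice picks the same rows as //tr[position() > 2].
    rows = root.findall(".//tr")[2:]
    values = []
    for row in rows:
        # Path is relative to the current row: its second <td> child.
        cell = row.find("td[2]")
        values.append(cell.text if cell is not None else None)
    return values


print(second_cells(SAMPLE))
```

The point is that once you are inside a row, a single positional index on td is enough; stacking a second index on top of it is not needed.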
Upvotes: 2