YASPLS

Reputation: 71

Scrapy - Splitting selector parts between two variables

I'm having trouble scraping data with my spider, even though similar code works when I test it in the Scrapy shell. The only difference is that in my script I split the selector into two parts.

Here is the selector that works in the shell:

(//tr[position()>2]/td[position()=2])[1]

And here is the selector in the script:

def parse_forsale(self, response):
        listingdata = response.xpath(".//tr[position()>2]")  # < PART 1 OF SELECTOR
        for data in listingdata:
            A = data.xpath(".//td[position()=2][1]").get() # < PART 2 OF SELECTOR
            B = data.xpath(".//td[position()=2][2]").get()
            C = data.xpath(".//td[position()=2][3]").get()
            D = data.xpath(".//td[position()=2][4]").get()
            E = data.xpath(".//td[position()=2][5]").get()
            F = data.xpath(".//td[position()=2][6]").get()
            G = data.xpath(".//td[position()=2][7]").get()
            H = data.xpath(".//td[position()=2][8]").get()

My educated guess is that this fails because in the shell I can wrap the expression in parentheses, opening before the "//" and closing right before "[1]", which makes the selector work properly. In the script I can't do this because the expression is split into two parts.

Any ideas on how I can get around this?

Thanks in advance for any help!

Upvotes: 0

Views: 155

Answers (1)

gangabass

Reputation: 10666

First of all, there is a shorter way to write td[position()=2]:

td[2]

Next, what did you mean by this XPath?

.//td[position()=2][1]

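Do you want the td at the second position (position()=2) that is also the first ([1]) node of that result? Predicates are applied one after another, so after position()=2 there is at most one td left per parent row: [1] is redundant, and [2] through [8] can never match.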

UPDATE: If you just want to process all rows after the second one and read td[2] from each:

//tr[position() > 2]/td[2]

Upvotes: 2
