Scrapy returning same first row data in each row instead of separate data for each row

Question

I have written a simple scraper using Scrapy, but it keeps returning the first instance of the target data in every row instead of the correct data for each row. In this case, it returns the first job's link for all scraped jobs from the Indeed website, instead of the correct link for each job.

I've tried both using (div) and avoiding (.//div) absolute paths, as well as using [0] at the end of the line. Without [0], it returns all data from all rows in each cell.

A link to an example of the source data: https://www.indeed.co.uk/jobs?as_and=a&as_phr=&as_any=&as_not=IT+construction&as_ttl=Project+Manager&as_cmp=&jt=contract&st=&salary=%C2%A330K-%C2%A3460K&radius=25&fromage=2&limit=50&sort=date&psf=advsrch

Target data is href="/rc/clk?jk=56e4f5164620b6da&fccid=6920a3604c831610&vjs=3"

Target data from page

    Project Manager

Here's my code

def parse(self, response):
    titles = response.css('div.jobsearch-SerpJobCard')
    items = []
    for title in titles:
        item = ICcom4Item()
        home_url = ("http://www.indeed.co.uk")
        item ['role_title_link'] = titles.xpath('div[@class="title"]/a/@href').extract()[0] 

        items.append(item)
    return items

I just need the correct link from each job to appear. All help welcome!

reisdev · Accepted Answer

The problem is in the line below:

item ['role_title_link'] = titles.xpath('div[@class="title"]/a/@href').extract()[0]

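titles is the whole list of job cards, so titles.xpath(...) collects the matching links from every card, and [0] always picks the first one. Inside the loop, query the current card with title.xpath instead of titles.xpath, like below: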

item ['role_title_link'] = title.xpath('div[@class="title"]/a/@href').extract()[0]

Then your code will scrape the correct link for each job, as you want.
