Sameer Mittal

Reputation: 15

Why does this selector work in Chrome but not in Scrapy?

I am trying to scrape the stock name, its relevant news, and the time of the news, but Scrapy doesn't return any output.

class StationDetailSpider(CrawlSpider):
    name = 'tone'
    start_urls = ["http://www.moneycontrol.com/india/stockpricequote/auto-lcvs-hcvs/ashokleyland/AL"]

    def parse_news(self, response):
        for brickset in response.css:
            #TIME_SELECTOR = '//div.gD_10 ::text'
            NAME_SELECTOR = './/div[@class='b_42h1[@class='b_42'] PT5 PR']'
            #NEWS_SELECTOR = '//a.bl_13 ::text'
            yield {
                #'time': brickset.css(TIME_SELECTOR).extract_first(),
                #'news': brickset.css(NEWS_SELECTOR).extract_first(),
                'name': brickset.xpath(NAME_SELECTOR).extract_first(),
                 }

Any kind of insight would be greatly appreciated. I have tried other formats, but in vain.

Upvotes: 0

Views: 193

Answers (2)

Aurielle Perlmann

Reputation: 5509

In this particular case, there is only one h1 tag on the page, so you can use the simple XPath //h1/text().

Upvotes: 0

Casper

Reputation: 1435

Your XPath seems incorrect, and I am wondering what you did in Chrome to make it find anything at all.

Try this xpath:

//div[@class="b_42 PT5 PR"]/h1/text()

Assuming you wish to scrape:

Ashok Leyland

Upvotes: 1
