dev-x

Reputation: 899

Get text from multiple classes in scrapy

I want to crawl data from a website. I am using this code:

import scrapy 

class KamusSetSpider(scrapy.Spider):
    name = "kamusset_spider"
    start_urls = ['http://kbbi.web.id/abadi']

    def parse(self, response):
        SET_SELECTOR = '.tur highlight'
        for brickset in response.css(SET_SELECTOR):
            yield {
                'name': brickset.css(SET_SELECTOR).extract_first(),
            }

and this is what the element inspector shows:

[screenshot of the inspected element]

I want to get all the text inside the red oval, such as mengabadi, mengabadikan, etc. The 'b' tag has multiple classes: tur highlight. However, I do not get any results.

What is the problem, and how can I solve it? I have changed my code to this:

def parse(self, response):
    for kamusset in response.css("div#d1"):
        text = kamusset.css("div.sub_17 b.tur.highlight::text").extract()
        print(dict(text=text))

but it still does not work. It returns nothing.

Upvotes: 3

Views: 7232

Answers (3)

Dejavu

Reputation: 31

As far as I understand your question, you want to extract text from different places (tags) with different class names using a single CSS selector. You can group several selectors with a comma:

text = kamusset.css("div.sub_17::text, b.tur.highlight::text").extract()

This should return the text matched by either selector.

Upvotes: 2

vijay athithya

Reputation: 1529

This should work:

from scrapy.selector import Selector

text = Selector(text=response.text).xpath('//b[@class="tur highlight"]/text()').extract()

It returns a list with all occurrences.

Upvotes: 0

Eugene Lisitsky

Reputation: 12845

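The selector '.tur highlight' means: select 'highlight' elements (that is, tags named highlight) nested inside any element with class 'tur'. The whitespace acts as a descendant combinator, so nothing matches.

To select elements that have several classes at once, write the class names together without whitespace: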

SET_SELECTOR = '.tur.highlight'

Upvotes: 4
