CS Student

Reputation: 37

How to convert a CSS selector to XPath in Scrapy?

I want to convert a CSS selector to XPath in a Scrapy project.

I'm learning Scrapy from the tutorial on its website, and I'm having trouble translating the tutorial's CSS selectors into XPath.

The tutorial parses http://quotes.toscrape.com/ with these CSS selectors:

`>>> for quote in response.css("div.quote"):
...     text = quote.css("span.text::text").extract_first()
...     author = quote.css("small.author::text").extract_first()
...     tags = quote.css("div.tags a.tag::text").extract()
...     print(dict(text=text, author=author, tags=tags))`

I tried writing the same loop with XPath:

`In [83]: for quote in response.xpath('//div[@class="quote"]'):
    ...:     text = quote.xpath('//span[@class="text"]/text()').extract_first()
    ...:     author = quote.xpath('//small[@class="author"]/text()').extract_first()
    ...:     tags = quote.xpath('//div[@class="tags"]/a[@class="tag"]/text()').extract()
    ...:     print(dict(text=text, author=author, tags=tags))`

In the CSS path I get info on different quotes, while on XPath I get the same quote multiple times in the list. What am I doing wrong?

Upvotes: 1

Views: 248

Answers (1)

har07

Reputation: 89285

"In the CSS path I get info on different quotes, while on XPath I get the same quote multiple times in the list. What am I doing wrong?"

The main problem is that an XPath expression beginning with / (and therefore //) is absolute: it is evaluated from the document root, no matter which context element you run it on. So quote.xpath('//span[@class="text"]/text()') searches the whole page on every iteration, and extract_first() returns the first quote's text each time. To evaluate the expression relative to the current context element (the one referenced by the variable quote), start it with a dot. The same applies to the author and tags expressions. For example:

text = quote.xpath('.//span[@class="text"]/text()').extract_first()

Upvotes: 2
