Aditya Sharma

Reputation: 351

Not able to extract data using scrapy with class names containing spaces and hyphens

I am new to Scrapy and need to extract text from a tag with multiple class names, where the class attribute contains spaces and hyphens.

Example:

<div class="info">
    <span class="price sale">text1</span>
    <span class="title ng-binding">some text</span>
</div>

When I use the code:

response.xpath("//span[contains(@class,'price sale')]/text()").extract()

I am able to get text1 but when I use:

response.xpath("//span[contains(@class,'title ng-binding')]/text()").extract()

I get an empty list. Why is this happening, and how can I handle it?

Upvotes: 7

Views: 10220

Answers (2)

Manthan Trivedi

Reputation: 99

With response.css(), prefix each class name with "." instead of separating the names with spaces. In your case you can try:

response.css("span.title.ng-binding::text").extract()

This code should return the text you are looking for.

Upvotes: 2

Umair Ayub

Reputation: 21341

The expression you're looking for is:

//span[contains(@class, 'title') and contains(@class, 'ng-binding')]
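
To illustrate why splitting the test into two contains() calls is more forgiving, here is a small plain-Python sketch. It assumes the class attribute on the real page differs slightly from the simplified example in the question (different order or spacing); the sample strings and both helper functions are hypothetical and only imitate the matching rules, they are not Scrapy code.

```python
def substring_match(class_attr: str, needle: str) -> bool:
    """Imitates XPath contains(@class, needle): a plain substring test."""
    return needle in class_attr


def token_match(class_attr: str, *required: str) -> bool:
    """Imitates CSS .a.b matching: each required name must be a whole token."""
    tokens = class_attr.split()
    return all(name in tokens for name in required)


# Hypothetical class attribute values a real page might contain.
samples = [
    "title ng-binding",     # exactly as in the question
    "ng-binding title",     # same classes, different order
    "title  ng-binding",    # two spaces between the names
    "subtitle ng-binding",  # a different class that merely contains "title"
]

for attr in samples:
    print(
        repr(attr),
        substring_match(attr, "title ng-binding"),  # single contains() on the full string
        substring_match(attr, "title") and substring_match(attr, "ng-binding"),  # two contains()
        token_match(attr, "title", "ng-binding"),  # whole-token matching
    )
```

The last sample shows the remaining weakness of contains(): it also accepts "subtitle". If that matters, a common XPath idiom is contains(concat(' ', normalize-space(@class), ' '), ' title '), which compares whole tokens.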

I highly suggest XPath Visualizer, which can help you debug XPath expressions easily. It can be found here:

http://xpathvisualizer.codeplex.com/

Or, with CSS, try:

response.css("span.title.ng-binding")

It is also possible that the element with ng-binding is loaded via JavaScript/Ajax and is therefore not included in the initial server response.
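
One way to check this is to look at the raw HTML that Scrapy actually downloaded. The sketch below is a minimal, standard-library helper (the name classes_in_html is hypothetical) that lists every class token present in the markup. The sample string is an assumed example of what an un-rendered Angular template might look like, not the real page.

```python
import re


def classes_in_html(html: str) -> set:
    """Collect every class token found in class="..." attributes of raw HTML.

    Only double-quoted attributes are handled; this is a quick diagnostic,
    not an HTML parser.
    """
    tokens = set()
    for value in re.findall(r'class\s*=\s*"([^"]*)"', html):
        tokens.update(value.split())
    return tokens


# Assumed example of a server response before any JavaScript has run.
raw = (
    '<div class="info">'
    '<span class="price sale">text1</span>'
    '<span class="title">{{ item.title }}</span>'
    '</div>'
)

found = classes_in_html(raw)
print(sorted(found))
print("ng-binding" in found)
```

In a spider or in scrapy shell, the same helper can be called with response.text. If ng-binding is missing there but visible in the browser's inspector, the class is being added client-side and no selector will match it in the initial response.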

Upvotes: 9
