Tuhina Singh

Reputation: 1007

Scrapy, Python: Unable to extract data using XPath seen in Firebug

I am fairly new to web scraping, Scrapy and Python. I am trying to scrape data from this website page.

I want to extract the email address given in the footer of the page, [email protected], and have tried two XPaths in my Scrapy spider:

  1. Relative: id("gkFooterNav")/div/p/span/a/text()
  2. Absolute: /html/body/div[4]/div[1]/div/div/div/p/span/a/text()

I have tried both XPaths with and without the final 'text()' component. Neither works: the spider returns an empty list.

However, when I check them with XPath Checker in the browser, I get the correct value. I am unable to figure out what is going wrong here. Can anyone help, please?

Thanks, Tuhina

Upvotes: 0

Views: 500

Answers (1)

GHajba

Reputation: 3691

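If you fetch the page the way Scrapy does (without running any JavaScript) and look at the contents of the footer, you see this message from the website in place of the email address: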

This e-mail address is being protected from spambots. You need JavaScript enabled to view it.

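So you need to execute the JavaScript to get access to the email address. That is also why XPath Checker finds the link: it runs in the browser, after the script has rewritten the page, while Scrapy only sees the raw HTML. Alternatively, you could extract the obfuscated email address from the JavaScript just above this text and decode it yourself, without executing any JavaScript at all.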
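A minimal sketch of the second approach, using only the standard library. It assumes the protection script builds the address by concatenating single-quoted string fragments, encoded as HTML entities, into a variable whose name starts with `addy` (a pattern commonly seen with this kind of spambot message). The sample script, the variable name and the function name `decode_cloaked_email` are all hypothetical; check the actual page source and adjust the patterns to match.

```python
import html
import re

# Hypothetical example of a cloaking script; the real page's script may differ.
SAMPLE_SCRIPT = """
var prefix = '&#109;a' + 'i&#108;' + '&#116;o';
var path = 'hr' + 'ef' + '=';
var addy12345 = 'n&#97;m&#101;' + '&#64;';
addy12345 = addy12345 + '&#101;x&#97;mpl&#101;' + '&#46;' + 'c&#111;m';
"""

# Assumed pattern: lines of the form "var addyN = ...;" or "addyN = addyN + ...;".
ASSIGNMENT = re.compile(r"^[ \t]*(?:var\s+)?addy\w*\s*=\s*(.*?);\s*$", re.MULTILINE)
# Single-quoted string literals on the right-hand side of an assignment.
LITERAL = re.compile(r"'([^']*)'")


def decode_cloaked_email(script_text):
    """Join the quoted fragments assigned to the addy variable and
    decode their HTML entities. Returns None if nothing matches."""
    fragments = []
    for match in ASSIGNMENT.finditer(script_text):
        fragments.extend(LITERAL.findall(match.group(1)))
    if not fragments:
        return None
    return html.unescape("".join(fragments))


if __name__ == "__main__":
    print(decode_cloaked_email(SAMPLE_SCRIPT))
```

In a spider, the script text would come from the response rather than a constant, for example something like `response.xpath('//*[@id="gkFooterNav"]//script/text()').get()`, assuming the script sits inside the footer element. If the page's script does not follow this pattern, rendering the page with a JavaScript-capable tool is the more robust option.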

Upvotes: 2
