Reputation:
from scrapy_selenium import SeleniumRequest
import scrapy
from selenium import webdriver

class testspider1(scrapy.Spider):
    driver = webdriver.Firefox(executable_path=r"C:\Users\test\Desktop\geckodriver")
    name = 'test5'
    start_urls = ['http://httpbin.org/ip']

    def parse(self, response):
        print(response.body)
        url = "https://www.target.com/p/cesar-canine-cuisine-filet-mignon-flavor-wet-dog-food-3-5oz-tray/-/A-14903668"
        yield SeleniumRequest(url=url, callback=self.parse_result)

    def parse_result(self, response):
        image = response.xpath('//*[@id="mainContainer"]/div/div/div[1]/div[1]/div[2]/div[1]/div/div/div/div/div/div/div/a/div/div/div/div/div/img/@src').extract_first()
        price = response.selector.xpath('//*[@id="mainContainer"]/div/div/div[1]/div[2]/div/div[1]/span/text()').extract_first()
        print(image)
        print("\n\n")
        print(price)
settings file:
from shutil import which
BOT_NAME = 'seleniumtest'
SPIDER_MODULES = ['seleniumtest.spiders']
NEWSPIDER_MODULE = 'seleniumtest.spiders'
SELENIUM_DRIVER_NAME = 'firefox'
SELENIUM_DRIVER_EXECUTABLE_PATH = which('geckodriver')
SELENIUM_BROWSER_EXECUTABLE_PATH = which(r"C:\Users\test\Desktop\geckodriver")
ROBOTSTXT_OBEY = True
DOWNLOADER_MIDDLEWARES = {
    'scrapy_selenium.SeleniumMiddleware': 800
}
I have followed the scrapy-selenium documentation step by step, but the driver does not follow any links. I believe both requests are being handled by Scrapy. I don't want to change __init__,
because I want some requests to be handled by scrapy-selenium and others by Scrapy alone.
I checked passing-selenium-driver-to-scrapy, but that approach rewrites the whole __init__
to expose Selenium as self.driver.
I want some requests to be handled by SeleniumRequest
and others by a plain Scrapy Request.
Note: I used this site as an example because it renders its results with JavaScript. If the page is fetched by Scrapy alone, the data has not been rendered yet, so the selectors come back empty.
Upvotes: 1
Views: 936
Reputation: 11
I replaced Firefox with Chrome:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())
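In case it helps, here is a rough sketch of what the matching settings might look like for Chrome, based on the settings names in the scrapy-selenium README. It assumes chromedriver is on your PATH (I use which('chromedriver') here instead of webdriver_manager so the snippet has no extra dependencies), and the '--headless' argument is just an example. As far as I can tell, the middleware only picks up SeleniumRequest objects and leaves plain Scrapy requests to the normal downloader, so mixing both in one spider should not need any change to __init__.

```python
# settings.py (sketch, not a drop-in file; adjust names and paths to your project)
from shutil import which

# Tell scrapy-selenium which browser to drive.
SELENIUM_DRIVER_NAME = 'chrome'

# Path to the driver binary (chromedriver), not the browser itself.
# Assumption: chromedriver is on PATH; which() returns None if it is not found.
SELENIUM_DRIVER_EXECUTABLE_PATH = which('chromedriver')

# Command-line arguments passed to the browser; '--headless' is only an example.
SELENIUM_DRIVER_ARGUMENTS = ['--headless']

# Enable the middleware that handles SeleniumRequest objects.
DOWNLOADER_MIDDLEWARES = {
    'scrapy_selenium.SeleniumMiddleware': 800,
}
```

Note that SELENIUM_BROWSER_EXECUTABLE_PATH, if you set it at all, is meant for the browser binary, so pointing it at geckodriver or chromedriver is probably not what you want.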
Upvotes: 1