Tomas Ericsson

Reputation: 347

No data when scraping with rvest

I am trying to scrape a website, but I don't get any data back.

# Load packages
library(tidyverse)
library(rvest)

# Specify the URL
url <- 'https://www.travsport.se/sresultat?kommando=tevlingsdagVisa&tevdagId=570243&loppId=0&valdManad&valdLoppnr&source=S'

# Get the data
url %>%
  read_html() %>%
  html_nodes(".green div:nth-child(1)") %>%
  html_text()
#> character(0)

I have also tried the XPath '//*[contains(concat( " ", @class, " " ), concat( " ", "green", " " ))]//div[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]//a', but it gives me the same empty result.

I am expecting horse names. Shouldn't I at least get some JavaScript code back, even if the data on the page is rendered by JavaScript?

I can't see which other CSS selector I should use here.

Upvotes: 2

Views: 708

Answers (1)

Guillaume Ottavianoni

Reputation: 496

You can use the RSelenium package to scrape dynamically rendered pages:

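library(RSelenium)
library(rvest)  # provides read_html(), html_nodes(), html_text() and %>%

# Specify the URL
url <- 'https://www.travsport.se/sresultat?kommando=tevlingsdagVisa&tevdagId=570243&loppId=0&valdManad&valdLoppnr&source=S'

# Create the remote driver / browser session
rsd <- rsDriver(browser = "chrome")
remDr <- rsd$client

# Go to your URL and let the browser render the page
remDr$navigate(url)
Sys.sleep(2)  # assumed delay so the JavaScript can finish; adjust as needed
page <- read_html(remDr$getPageSource()[[1]])

# Get the horse data by parsing the rendered page with rvest, as before
page %>% html_nodes(".green div:nth-child(1)") %>% html_text()

# Close the browser and stop the Selenium server when done
remDr$close()
rsd$server$stop()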
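As for why you don't even get JavaScript code back: read_html() only parses the HTML the server sends and never runs scripts, so a selector can only match elements that already exist in that static markup. Here is a minimal sketch of the difference, using two made-up HTML strings (they are assumptions for illustration, not the real markup of the site):

```r
library(rvest)

# Hypothetical markup as the server might send it: the container is empty
# and a script would fill it in later, inside a real browser.
static_html <- '<html><body><div id="results"></div>
  <script>/* would insert the result rows here */</script></body></html>'

# Hypothetical markup after a browser has run the script.
rendered_html <- '<html><body><div id="results">
  <div class="green"><div><a>Horse A</a></div><div>1</div></div>
  </div></body></html>'

selector <- ".green div:nth-child(1)"

# No element with class "green" exists yet, so nothing matches.
static_result <- read_html(static_html) %>%
  html_nodes(selector) %>%
  html_text()

# The same selector matches once the rendered markup is parsed.
rendered_result <- read_html(rendered_html) %>%
  html_nodes(selector) %>%
  html_text()

print(static_result)
print(rendered_result)
```

The script element is still part of the parsed document, but the selector only targets elements with class "green", so it is never returned. That is why you need a real browser (here via RSelenium) to produce the rendered source first.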

Hope that helps.

Gottavianoni

Upvotes: 2
