Saad Rehman

Reputation: 403

Rvest is unable to find the node specified by css selector, how do I fix it?

I am scraping data from this website and, for some reason, I can't get the name of the seller, even though I use the exact selector returned by SelectorGadget. I have, however, managed to get all the other data with rvest.

I managed to scrape the seller's name with RSelenium but that takes too much time. Anyway, here's the link of the page I'm scraping:

https://www.kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946

Here's the code I've used:

library(rvest)

SellerName <-
  read_html("https://kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946") %>%
  html_nodes(".link-4200870613") %>%
  html_text()

Upvotes: 0

Views: 134

Answers (1)

QHarr

Reputation: 84465

You can easily extract the seller name from the response with a regex, because it is contained in a script tag. Presumably the page fills in the seller link from there when a browser runs the JavaScript, which rvest does not do.

library(rvest)
library(magrittr)
library(stringr)

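# Read the full page text, which includes the contents of script tags
p <- read_html('https://www.kijiji.ca/v-fitness-personal-trainer/bedford/swimming-lessons/1421292946') %>% html_text()
# Take the first match; column 2 holds the captured group (the name itself)
seller_name <- str_match(p, '"sellerName":"(.*?)"')[, 2]
print(seller_name)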

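Regex: the pattern "sellerName":"(.*?)" matches the literal key "sellerName":" and then lazily captures everything up to the next double quote. That captured group is the seller name.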
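If you want to try the pattern without hitting the site, here is a minimal sketch using base R only. The `html` string is a made-up stand-in for the page source (the `window.__data` variable, the other keys, and the name "Jane Doe" are all assumptions for illustration, not what Kijiji actually returns); the only thing taken from the real page is the `"sellerName":"..."` shape.

```r
# Hypothetical stand-in for the page source: the seller name appears only
# inside a <script> tag as JSON, not in any rendered element.
html <- paste0(
  '<html><body><div id="app"></div>',
  '<script>window.__data = {"adId":"1","sellerName":"Jane Doe","price":20};</script>',
  '</body></html>'
)

# Lazily capture everything between the quotes that follow "sellerName":
pattern <- '"sellerName":"(.*?)"'
m <- regmatches(html, regexec(pattern, html, perl = TRUE))[[1]]

# m[1] is the full match, m[2] is the first capture group;
# fall back to NA when the key is not present at all.
seller_name <- if (length(m) >= 2) m[2] else NA_character_
print(seller_name)
```

The lazy `.*?` matters here: a greedy `.*` would run on to the last double quote in the text instead of stopping at the end of the name.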

Upvotes: 1
