Reputation: 149
I have been running this code for some time and it always worked for me, but it has suddenly started returning the following error:
Error in open.connection(con, "rb") : HTTP error 403
I haven't changed anything and I don't know why this is happening. Any suggestions? Thank you!
#Loading the rvest package
library(rvest)
library(magrittr) # for the '%>%' pipe symbols
library(RSelenium) # to get the loaded html of
library(purrr) # for 'map_chr' to get reply
url_google <- list('https://play.google.com/store/apps/details?id=eu.acsi.europa&hl=es&gl=US&showAllReviews=true')
for (apps in url_google) {
#Specifying the url for desired website to be scraped
url <- apps
# starting local RSelenium (this is the only way to start RSelenium that is working for me atm)
selCommand <- wdman::selenium(jvmargs = c("-Dwebdriver.chrome.verboseLogging=true"), retcommand = TRUE)
shell(selCommand, wait = FALSE, minimized = TRUE)
remDr <- remoteDriver(port = 4567L, browserName = "firefox")
remDr$open()
require(RSelenium)
# go to website
remDr$navigate(url)
# get page source and save it as an html object with rvest
html_obj <- remDr$getPageSource(header = TRUE)[[1]] %>% read_html()
# 1) App name
app <- html_obj %>% html_nodes(".AHFaub") %>% html_text()
# 2) name field (assuming that with 'name' you refer to the name of the reviewer)
names <- html_obj %>% html_nodes(".kx8XBd .X43Kjb") %>% html_text()
Upvotes: 3
Views: 807
Reputation: 536
What worked for me was that, instead of your
remDr <- remoteDriver(port = 4567L, browserName = "firefox")
remDr$open()
I used
rD <- rsDriver(browser = "firefox",
               check = FALSE)
remDr <- rD[["client"]]
The rsDriver command itself isn't the solution; the argument check = FALSE is.
At least for me, this was a curl issue: it was trying to download new versions of each of the browser drivers and ran into a problem doing so. Setting check to FALSE turns that download process off.
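To put this in the context of the question's script, here is a minimal sketch of the whole start/navigate/read/stop cycle using check = FALSE. It assumes RSelenium, rvest and a local Firefox are installed, and that the driver binaries are already on disk from an earlier run, since nothing is downloaded when the check is skipped. The helper name scrape_page_source is hypothetical and not part of RSelenium; the port is the one from the question.

```r
# Sketch only: assumes RSelenium, rvest and a local Firefox are installed.
# `scrape_page_source` is a hypothetical helper name, not part of RSelenium.
scrape_page_source <- function(url, port = 4567L) {
  # check = FALSE skips the driver version check/download step
  rD <- RSelenium::rsDriver(
    browser = "firefox",
    port    = port,
    check   = FALSE,
    verbose = FALSE
  )
  remDr <- rD[["client"]]

  # Close the browser and stop the server on exit, even if an error occurs
  on.exit({
    try(remDr$close(), silent = TRUE)
    try(rD[["server"]]$stop(), silent = TRUE)
  }, add = TRUE)

  # Load the page in the browser and hand the rendered HTML to rvest
  remDr$navigate(url)
  page_source <- remDr$getPageSource()[[1]]
  rvest::read_html(page_source)
}
```

Inside the loop from the question this would be called as html_obj <- scrape_page_source(url), after which the html_nodes() calls can stay as they are. Stopping the server on exit is a design choice for this sketch so that the port is free again for the next URL.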
Upvotes: 2