Jay1995

Reputation: 147

Webscraping In R - Getting "Error in records[[x]] .... more elements supplied than there are to replace"

Please note that I am very new to both R and web scraping in R, so please keep that in mind when explaining your answer.

I am trying to scrape the date of stay, the review title and the review text.

This is where I generate the list of URLs I want to use:

library(rvest)
#GENERATING THE URLS
webpage_list <- vector(mode = "list")
#creating empty list
webpage_list

for(n in seq(from=5, to=15, by=5)){
  webpage_list[[n]] <- glue::glue("https://www.sampleURL.com#REVIEWS")
}

#dropping the empty values
webpage_list[sapply(webpage_list,is.null)] <- NULL
webpage_list

Then I convert the list to a character vector and iterate through it, starting by identifying the area of the webpage I want scraped:

webpage_list2 <- unlist(webpage_list)
class(webpage_list2)

for(i in seq_along(webpage_list2)){
  webpage <- read_html(webpage_list2[i])

  results <- webpage %>% html_nodes(".oETBfkHU , ._3hDPbqWO")
  print(results)

  # Building the dataset
  records <- vector("character", length = (length(results)))
  print(records)
}

This seems to be working as I want (I think) up to this point.

for (x in seq_along(results)) {
    url <- read_html(webpage_list2[x])
    dateOfStay <- str_c(url %>% 
                          html_nodes("._34Xs-BQm") %>% 
                          html_text())
    reviewTitle <- str_sub(url %>%
                             html_nodes(".glasR4aX")%>%
                             html_text())
    review <- str_sub(url %>%
                        html_nodes(".IRsGHoPm") %>%
                        html_text())
    records[[x]] <- data_frame(dateOfStay = dateOfStay, reviewTitle = reviewTitle, review = review)#, reviewTitle = reviewTitle, review = review
  }
#Build DF
DF <- bind_rows(records)

From this I get the below error:

Error in records[[x]] <- data_frame(dateOfStay = dateOfStay, reviewTitle = reviewTitle,  :    more elements supplied than there are to replace

Any help would be greatly appreciated. Again, please keep in mind that I am very new to R and to web scraping in R.

Upvotes: 0

Views: 105

Answers (1)

Adam Sampson

Reputation: 2021

We can find your problem without scraping anything. You are trying to put a data frame into one slot of a character vector. A data frame is not a character string; it is a list of columns, so it has more elements than a single slot can hold, which is what the error message is telling you. You can fix it by making records a list, or by wrapping your data frame in a list so that it counts as a single item. I recommend making records a list.

records <- vector("character", length = (3))
records[[2]] <- data.frame(test = "A",test2 = "B")
# Error in records[[2]] <- data.frame(test = "A", test2 = "B") : 
#   more elements supplied than there are to replace

# Option 1:
records <- vector("list", length = 3)
records[[2]] <- data.frame(test = "A",test2 = "B")
records
# [[1]]
# NULL
# 
# [[2]]
#   test test2
# 1    A     B
# 
# [[3]]
# NULL


# Option 2:
records <- vector("character", length = (3))
records[[2]] <- list(data.frame(test = "A",test2 = "B"))
records
# [[1]]
# [1] ""
# 
# [[2]]
#   [[2]][[1]]
#   test test2
# 1    A     B
# 
# 
# [[3]]
# [1] ""
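
To show how Option 1 might fit into a loop like the one in the question, here is a minimal sketch that runs without any scraping. The `pages` object and its contents are made-up stand-ins for what `html_text()` would return for each page (they are not from the question), and base R's `do.call(rbind, ...)` is used for the final step so that no extra packages are needed.

```r
# Hypothetical stand-in for the scraped pages: each element mimics the
# character vectors that html_text() would return for one page.
pages <- list(
  list(dateOfStay  = c("May 2020", "June 2020"),
       reviewTitle = c("Great stay", "Nice view"),
       review      = c("Loved it.", "Would return.")),
  list(dateOfStay  = "July 2020",
       reviewTitle = "Average",
       review      = "It was fine.")
)

# Pre-allocate a list (not a character vector) with one slot per page.
records <- vector("list", length = length(pages))

for (x in seq_along(pages)) {
  page <- pages[[x]]
  # Each slot of a list can hold a whole data frame.
  records[[x]] <- data.frame(
    dateOfStay  = page$dateOfStay,
    reviewTitle = page$reviewTitle,
    review      = page$review,
    stringsAsFactors = FALSE
  )
}

# Stack the per-page data frames into one data frame.
DF <- do.call(rbind, records)
print(DF)
```

If dplyr is loaded, `bind_rows(records)` as used in the question should also work on this list of data frames.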

Upvotes: 1
