Jeisson

Reputation: 25

Error: Replacement has length zero R

I'm trying to scrape http://then.gasbuddy.com/.

I'm running the following code in R:

library(RCurl)
library(XML)
 doc <- htmlTreeParse('http://www.southcarolinagasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=0&area=All%20Areas&station=All%20Stations&tme_limit=4')
rootNode <- xmlRoot(doc)

((rootNode[[2]][4])[1][[1]])[[15]][[1]][[11]][[1]][[1]][[2]][[8]][[1]][[2]][[1]][[1]][[1]][[1]][[1]][[1]]

#<div class="p1"/>

x <- matrix(, nrow = 20, ncol = 4)

x[1,1] <- xmlValue(((rootNode[[2]][4])[1][[1]])[[15]][[1]][[11]][[1]][[1]][[2]][[8]][[1]][[2]][[1]][[1]][[1]][[1]][[1]][[1]])

But I get this error:

replacement has length zero

How can I extract the p1 value and put it in a matrix?

Upvotes: 0

Views: 924

Answers (2)

hrbrmstr

Reputation: 78832

You've come up with an interesting way to get around their price obfuscation. Since they didn't restrict scraping in their Terms of Service, here's one way you can scrape the prices:

library(xml2)

doc <- read_html('http://www.southcarolinagasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=0&area=All%20Areas&station=All%20Stations&tme_limit=4')

prices <- xml_find_all(doc, xpath="//div[@class='sp_p']")

sapply(prices, function(x) {
  as.numeric(paste(gsub("d", "\\.", 
                        gsub("^p", "", 
                             unlist(xml_attrs(xml_find_all(x, "./div"))))),
                   collapse=""))
})

##   [1] 1.65 1.65 1.65 1.65 1.65 1.65 1.65 1.65 1.65 1.67 1.68 1.69 1.69 1.69 1.69 1.69 1.69 1.69 1.69
##  [20] 1.70 1.71 1.72 1.72 1.73 1.73 1.73 1.73 1.73 1.73 1.73 1.73 1.73 1.74 1.74 1.74 1.74 1.74 1.74
##  [39] 1.74 1.74 1.74 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.75 1.76 1.76
##  [58] 1.76 1.76 1.76 1.76 1.76 1.76 1.76 1.76 1.76 1.76 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77
##  [77] 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77
##  [96] 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.77 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78
## [115] 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78
## [134] 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.78 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79
## [153] 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79
## [172] 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79
## [191] 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79 1.79
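
To make the decoding step easier to follow, here is a small offline sketch of what the two gsub() calls do to a single price cell. The class names in the vector are an assumption, inferred from the <div class="p1"/> in the question and the patterns in the code above, and are not copied from the live page.

```r
# Assumed class names of the child <div>s inside one "sp_p" cell
# (illustrative values, not scraped from the site).
classes <- c("p1", "pd", "p6", "p5")

# Drop the leading "p" so each class becomes one character of the price.
chars <- gsub("^p", "", classes)            # "1" "d" "6" "5"

# The character "d" stands for the decimal point.
chars <- gsub("d", ".", chars, fixed = TRUE)  # "1" "." "6" "5"

# Glue the characters together and convert to a number.
price <- as.numeric(paste(chars, collapse = ""))
print(price)
```

Under those assumed class names, the result is the numeric value 1.65, which matches the form of the values in the output above.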

Upvotes: 2

shadowtalker

Reputation: 13903

The error means what it says. Look at the return value from

xmlValue(((rootNode[[2]][4])[1][[1]])[[15]][[1]][[11]][[1]][[1]][[2]][[8]][[1]][[2]][[1]][[1]][[1]][[1]][[1]][[1]])

It's

character(0)

That is because <div class="p1"/> is a self-closing tag that contains no text, so xmlValue() returns a zero-length character vector. As the error message indicates, R does not allow replacing an element of a vector or matrix with something that has length zero. If you want these zero-length results to become something like NA or "", check the length with an if/else construct before assigning.
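
A minimal sketch of such a guard, using only base R. The helper name value_or_na is hypothetical (it is not part of the XML package), and character(0) stands in for the value that xmlValue() returned in the question.

```r
# Hypothetical helper: turn a zero-length result into NA so it can be assigned.
value_or_na <- function(v) {
  if (length(v) == 0) {
    NA_character_
  } else {
    v[1]
  }
}

x <- matrix(NA_character_, nrow = 20, ncol = 4)

empty <- character(0)             # stand-in for xmlValue() on an empty <div/>
x[1, 1] <- value_or_na(empty)     # stores NA instead of raising the error
x[2, 1] <- value_or_na("1.65")    # non-empty values pass through unchanged
```

Assigning x[1, 1] <- empty directly would reproduce the "replacement has length zero" error, whereas the guarded assignment leaves NA in that cell.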

Upvotes: 1
