Reputation: 4807
I am using the following code:
url = "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"
library(XML)
tabs = readHTMLTable(url, stringsAsFactors = F)
I get the following error:
Error: failed to load external entity "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"
When I open the URL in a browser it works fine. So, what am I doing wrong here?
Thanks
Upvotes: 8
Views: 17401
Reputation: 23
I just got the same "failed to load external entity" error when running:
url <- "http://www.cisco.com/c/en/us/products/a-to-z-series-index.html"
doc <- htmlTreeParse(url, useInternalNodes = TRUE)
I came across this and another post on the topic, but neither solved my problem. The code had worked before. I then realized that I was connected to a corporate VPN. I disconnected from the VPN, tried again, and it worked. So being on a VPN is another possible cause of this error, and disconnecting from it fixed the problem for me.
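To tell a network problem (such as the VPN case above) apart from a parsing problem, it can help to test the download step on its own. The following is only a minimal sketch using base R; `fetch_or_null` is a hypothetical helper name, not part of the XML package, and the local file paths are stand-ins so the example runs without network access.

```r
# Hypothetical helper: returns the page text as one string,
# or NULL if R cannot open the location at all.
fetch_or_null <- function(u) {
  tryCatch(
    paste(suppressWarnings(readLines(u, warn = FALSE)), collapse = "\n"),
    error = function(e) NULL
  )
}

# Stand-in for a reachable page: a small local HTML file
ok_path <- tempfile(fileext = ".html")
writeLines("<html><body><p>hello</p></body></html>", ok_path)

# Stand-in for an unreachable page: a path that does not exist
bad_path <- file.path(tempdir(), "no-such-page.html")

for (p in c(ok_path, bad_path)) {
  page <- fetch_or_null(p)
  if (is.null(page)) {
    message("Download failed: check VPN, proxy, or firewall settings first.")
  } else {
    message("Download succeeded, so any remaining error is in the parsing step.")
  }
}
```

Passing the real URL instead of the local paths should show whether the R session can reach the site at all before the parser gets involved.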
Upvotes: 0
Reputation: 3601
It's difficult to know for sure since I can't replicate your error, but according to the package's author (see http://comments.gmane.org/gmane.comp.lang.r.mac/2284), XML's methods for getting web content are pretty minimalistic. A workaround is to use RCurl to get the content and XML to parse it:
library(XML)
library(RCurl)
url <- "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"
# Download the page with RCurl, then pass the HTML text to XML for parsing
html <- getURL(url)
tabs <- readHTMLTable(html, stringsAsFactors = FALSE)
Or, if RCurl still throws an error, try the httr package:
library(httr)
# Fetch with httr and convert the raw response body to a character string
resp <- GET(url)
tabs <- readHTMLTable(rawToChar(resp$content), stringsAsFactors = FALSE)
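Both workarounds rely on the same idea: once the HTML is in memory as a string, the XML package only has to parse it and never touches the network. The sketch below illustrates just that parsing step; the inline HTML string and its column names are made up for illustration and stand in for whatever `getURL()` or `GET()` would return.

```r
library(XML)

# Made-up HTML standing in for a downloaded page
html <- paste0(
  "<html><body><table>",
  "<tr><th>Strike</th><th>Last</th></tr>",
  "<tr><td>150</td><td>1.25</td></tr>",
  "</table></body></html>"
)

# asText = TRUE tells the parser the string is content, not a file name or URL
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable returns a list with one data frame per <table> element
tabs <- readHTMLTable(doc, stringsAsFactors = FALSE)
str(tabs[[1]])
```

If this runs but the original call still fails, the problem is most likely in how the page is being fetched rather than in the table parsing.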
Upvotes: 16