I'm running this function:
require(XML)
require(plyr)
getKeyStats_xpath <- function(symbol) {
  yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
  html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")
  #search for <td> nodes anywhere that have class 'yfnc_tablehead1'
  nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")
  if(length(nodes) > 0 ) {
    measures <- sapply(nodes, xmlValue)
    #Clean up the column name
    measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))
    #Remove dups
    dups <- which(duplicated(measures))
    #print(dups)
    for(i in 1:length(dups))
      measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")
    #use siblings function to get value
    values <- sapply(nodes, function(x) xmlValue(getSibling(x)))
    df <- data.frame(t(values))
    colnames(df) <- measures
    return(df)
  } else {
    break
  }
}
As long as the page exists, it works fine. However, if one of my tickers does NOT have any data on that URL, it throws an error:
Error in FUN(X[[3L]], ...) : no loop for break/next, jumping to top level
I added a trace too, and things break down on ticker number 3.
tickers <- c("QLTI",
"RARE",
"RCPT",
"RDUS",
"REGN",
"RGEN",
"RGLS")
tryCatch({
stats <- ldply(tickers, getKeyStats_xpath)
}, finally={})
I'd like to call the function like this:
stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)
Basically, if a ticker has no data, I want to skip it.
Can someone please help me get this working?
Upvotes: 1
Views: 612
Reputation: 2743
Expanding on my comment: the issue is that you've enclosed the entire command stats <- ldply(tickers, getKeyStats_xpath) within a single tryCatch. R therefore treats the whole loop over tickers as one attempt, and the first failing ticker aborts it.
Instead, what you want is to try each ticker separately.
To do this, write a wrapper for getKeyStats_xpath that encloses it in tryCatch. You could do this within ldply with an anonymous function, for example ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={})). Note that finally executes regardless of exit condition, so finally={} executes nothing. (See Advanced R or How to write try catch in R from r-faq for more.)
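To illustrate the point about finally, here is a minimal sketch that does not touch the network. risky() is a hypothetical stand-in for a call that fails and is not part of the original code. It suggests that finally runs its cleanup but does not stop the error from propagating, whereas an error handler does.

```r
# Hypothetical stand-in for a call that fails (not from the original post)
risky <- function() stop("no data")

events <- character(0)

# With only `finally`, the cleanup code runs but the error still propagates,
# so an outer try() is used here just to keep the script going.
res1 <- try(
  tryCatch(risky(), finally = { events <- c(events, "finally ran") }),
  silent = TRUE
)
inherits(res1, "try-error")  # the error was not handled by tryCatch

# With an `error` handler, the error is caught and the handler's
# return value (NULL here) becomes the value of tryCatch().
res2 <- tryCatch(risky(), error = function(e) NULL)
is.null(res2)
```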
On an error, tryCatch calls the function provided in the error argument. So as is, this code still won't help, because the error is unhandled (thanks to rawr for pointing this out earlier). It is also easier to inspect the output if you use llply instead, since it returns one list element per ticker.
So a complete answer using this approach, with informative error handling, is below.
stats <- llply(tickers,
  function(t) tryCatch(getKeyStats_xpath(t),
    error=function(x) {
      cat("error occurred for:\n", t, "\n...skipping this ticker\n")
    }
  )
)
names(stats) <- tickers
lapply(stats, length)
#<snip>
#$RCPT
#[1] 0
# </snip>
As of now, this works for me, returning data for all tickers except the one listed in the code block above.
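If the end goal is still the single data frame and CSV from the question, the failed tickers can be dropped from the list before binding. Below is a rough sketch of that step. It uses a hypothetical fake_stats() in place of getKeyStats_xpath so that it runs offline, and it assumes every successful result has the same columns (if they differ, plyr's rbind.fill may be a better fit than rbind).

```r
# Hypothetical offline stand-in for getKeyStats_xpath():
# fails for one ticker, returns a one-row data frame otherwise.
fake_stats <- function(symbol) {
  if (symbol == "RCPT") stop("no data for ", symbol)
  data.frame(Symbol.Length = nchar(symbol), Beta = 1)
}

tickers <- c("QLTI", "RARE", "RCPT", "RDUS")

# One tryCatch per ticker: a failure yields NULL instead of aborting the loop
results <- lapply(tickers, function(tk) {
  tryCatch(fake_stats(tk), error = function(e) {
    message("skipping ", tk, ": ", conditionMessage(e))
    NULL
  })
})
names(results) <- tickers

# Drop the NULL entries, then bind the remaining one-row data frames
ok <- Filter(Negate(is.null), results)
stats <- do.call(rbind, ok)
rownames(stats) <- names(ok)
```

From here, write.csv(t(stats), "FinancialStats_updated.csv", row.names=TRUE) should work as in the question, with the skipped ticker simply absent.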
Upvotes: 2