user5656611

Using tryCatch within plyr

I'm running this function:

require(XML)
require(plyr)


getKeyStats_xpath <- function(symbol) {
  yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
  html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")

  #search for <td> nodes anywhere that have class 'yfnc_tablehead1'
  nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")

  if(length(nodes) > 0 ) {
    measures <- sapply(nodes, xmlValue)

    #Clean up the column name
    measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))   

    #Remove dups
    dups <- which(duplicated(measures))
    #print(dups) 
    for(i in 1:length(dups)) 
      measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")

    #use siblings function to get value
    values <- sapply(nodes, function(x)  xmlValue(getSibling(x)))

    df <- data.frame(t(values))
    colnames(df) <- measures
    return(df)
  } else {
    break
  }
}

As long as the page exists, it works fine. However, if one of my tickers does NOT have any data on that URL, it throws an error:

Error in FUN(X[[3L]], ...) : no loop for break/next, jumping to top level 

I added a trace too, and things break down on ticker number 3.

tickers <- c("QLTI",
"RARE",
"RCPT",
"RDUS",
"REGN",
"RGEN",
"RGLS")

tryCatch({
stats <- ldply(tickers, getKeyStats_xpath)
}, finally={})

I'd like to call the function like this:

stats <- ldply(tickers, getKeyStats_xpath)
rownames(stats) <- tickers
write.csv(t(stats), "FinancialStats_updated.csv",row.names=TRUE)

Basically, if a ticker has no data, I want to skip it.

Can someone please help me get this working?

Upvotes: 1

Views: 612

Answers (1)

jaimedash

Reputation: 2743

Expanding on my comment. The issue here is that you've enclosed the entire command stats <- ldply(tickers, getKeyStats_xpath) within a tryCatch. This means R treats the whole batch of tickers as a single attempt, so one failing ticker aborts the entire ldply call.

Instead, what you want is to try each ticker.

To do this, write a wrapper for getKeyStats_xpath that encloses it in tryCatch. You could do this within ldply with an anonymous function, for example ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={})). Note that the finally expression is evaluated regardless of how the call exits, so an empty finally={} does nothing at all. (See Advanced R or How to write try catch in R from r-faq for more.)

On an error, tryCatch calls the function provided in the error argument. As written, the code above still won't help, because no error handler is supplied and the error goes unhandled (thanks to rawr for pointing this out earlier). It is also easier to inspect the output if you use llply instead, since it returns a list with one element per ticker.
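To make the difference between finally and error concrete, here is a minimal base R sketch. The function risky is a hypothetical stand-in for any call that fails on some inputs; it is not part of the original code.

```r
# Hypothetical stand-in for a function that fails on some inputs.
risky <- function(x) {
  if (x == "bad") stop("no data for ", x)
  toupper(x)
}

# With only finally, the cleanup code runs but the error still propagates.
# The outer tryCatch is here only so the script can keep going and record that.
escaped <- tryCatch(
  tryCatch(risky("bad"), finally = cat("cleanup ran\n")),
  error = function(e) "error escaped the inner tryCatch"
)

# With an error handler, the handler's return value becomes the result.
handled <- tryCatch(risky("bad"), error = function(e) NULL)
ok <- tryCatch(risky("good"), error = function(e) NULL)
```

Here escaped holds the message from the outer handler, handled is NULL, and ok holds the normal return value.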

So a complete answer using this approach, and with informative error handling, is below.

stats <- llply(tickers, 
    function(t) tryCatch(getKeyStats_xpath(t), 
        error=function(x) {
            # cat() returns NULL, so a failed ticker ends up as a NULL (length-0) element
            cat("error occurred for:\n", t, "\n...skipping this ticker\n")
        }
    )
)
names(stats) <- tickers
lapply(stats, length)
#<snip>
#$RCPT
#[1] 0
# </snip>

As of now, this works for me, returning data for all tickers except the one listed in the code block above.
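If you then want the single data frame from the question, with row names only for the tickers that succeeded, one possible follow-up is sketched below. It uses base R only, so lapply stands in for llply, and fake_stats is a hypothetical stand-in for getKeyStats_xpath that avoids hitting the network. It also assumes every successful result has the same columns; if they differ, plyr's rbind.fill could replace do.call(rbind, ...).

```r
# Hypothetical stand-in for getKeyStats_xpath(): fails for one ticker.
fake_stats <- function(symbol) {
  if (symbol == "RCPT") stop("no data for ", symbol)
  data.frame(`Market Cap` = paste0(symbol, "-cap"), Beta = "1.0",
             check.names = FALSE, stringsAsFactors = FALSE)
}

tickers <- c("QLTI", "RARE", "RCPT", "RDUS")

# Per-ticker tryCatch: a failing ticker yields NULL instead of stopping the loop.
stats <- lapply(tickers, function(t) {
  tryCatch(fake_stats(t), error = function(e) NULL)
})
names(stats) <- tickers

# Drop the failed tickers, then stack the remaining one-row data frames.
ok <- Filter(Negate(is.null), stats)
stats_df <- do.call(rbind, ok)

# Label rows with the surviving tickers only, so the lengths always match.
rownames(stats_df) <- names(ok)
```

Taking the row names from names(ok) rather than from tickers avoids the length mismatch that rownames(stats) <- tickers would hit once a ticker has been skipped. The write.csv(t(...)) call from the question can then be applied to stats_df.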

Upvotes: 2
