Subsetting within user-defined function using dplyr

Question

I have financial data with multiple tickers and monthly returns data. I would like to create a function to calculate the Sharpe Ratio just by providing the ticker as a string.

library(dplyr)
library(reshape2)
library(lubridate)

# Define function to calculate sharpe ratio
sharpe_ratio <- function(ticker)
{
  # Create data frame containing annualized returns for each year for ticker 
  # of interest
  df1 %>% subset(ticker == ticker) %>%
     group_by(year(date)) %>% 
      summarize(annual.return = prod(1 + mret.excess) - 1) %>% 
      as.data.frame -> annualized_returns

  # Calculate Sharpe Ratio with annualized metrics
  mu <- mean(annualized_returns$annual.return)
  sigma <- sd(annualized_returns$annual.return)
  return (mu/sigma)
}

However, when I try:

sharpe_ratio("YACKX")

it does not subset the data frame (df1) at all, and I am left with the mean annualized return for all returns irrespective of ticker input. However, if in the function I replace

... %>% subset(ticker == ticker) %>% ...

with

... %>% subset(ticker == "YACKX") %>% ...

the function now properly subsets my dataframe. I am curious as to why when I try subsetting the dataframe with a formal argument it does not work, but when I "fix" what ticker to subset by typing the ticker name in quotes it behaves properly.

Here is an example dataset:

date <- as.Date(c("2000-01-31", "2000-02-29", "2000-03-31", "2000-04-30", "2000-05-31", 
                     "2000-06-30", "2000-07-31", "2000-08-31", "2000-09-30", "2000-10-31", 
                     "2000-11-30", "2000-12-31", "2001-01-31", "2001-02-28", "2001-03-31", 
                     "2001-04-30", "2001-05-31", "2001-06-30", "2001-07-31", "2001-08-31", 
                     "2001-09-30", "2001-10-31", "2001-11-30","2001-12-31", "2000-01-31", 
                     "2000-02-29", "2000-03-31", "2000-04-30", "2000-05-31", "2000-06-30", 
                     "2000-07-31", "2000-08-31", "2000-09-30", "2000-10-31", "2000-11-30", 
                     "2000-12-31", "2001-01-31", "2001-02-28", "2001-03-31", "2001-04-30",
                     "2001-05-31", "2001-06-30", "2001-07-31", "2001-08-31", "2001-09-30", 
                     "2001-10-31", "2001-11-30","2001-12-31"))
tickers <- c(rep("YACKX",24), rep("APIMX",24))
mret.excess <- c(-0.0743128, -0.0798149,  0.0571812, -0.0408150,  0.0277273,  0.0535117, 
                 -0.0181185,  0.0591170, -0.0019288, 0.0786993,  0.0017027,  0.0220814,  
                 0.0170490,  0.0061800, -0.0368087,  0.0216363,  0.0356446, -0.0066351,
                 0.0335736,  0.0006140, -0.0795808,  0.0238521, 0.1076750,  0.0290756, 
                 -0.0566304,  0.0328873,  0.0552739, -0.0458054, -0.0402790,  0.0265851, 
                 -0.0344774,  0.0860904, -0.0575071, -0.0814842, -0.0872155, 0.0028902,
                 0.0470691, -0.1203689, -0.0896772,  0.0995483, -0.0048447, -0.0242168, 
                 -0.0257273, -0.0711448, -0.1155542, 0.0540500,  0.0880436,  0.0202195)
df1 <- data.frame(date, tickers, mret.excess, stringsAsFactors = FALSE)

For YACKX my Sharpe ratio output should be [1] 1.997946 but I get [1] -1.186262. Similarly, for APIMX my Sharpe ratio output should be [1] -7.231879 but I get -1.186262. Thus, I know the data is not being subset correctly.

phiver · Accepted Answer

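Your function has two problems. First, the column in df1 is named tickers, not ticker, so the comparison has to be made against tickers. As written, ticker == ticker compares a value with itself, which is TRUE for every row, so nothing is dropped. Second, since you are already using dplyr, use filter rather than subset. See below.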

library(dplyr)
library(lubridate)
sharpe_ratio <- function(ticker)
{
  # Create data frame containing annualized returns for each year for ticker 
  # of interest
  df1 %>% filter(tickers == ticker) %>%
    group_by(year(date)) %>% 
    summarize(annual.return = prod(1 + mret.excess) - 1) %>% 
    as.data.frame -> annualized_returns

  # Calculate Sharpe Ratio with annualized metrics
  mu <- mean(annualized_returns$annual.return)
  sigma <- sd(annualized_returns$annual.return)
  return (mu/sigma)
}

sharpe_ratio("YACKX")
[1] 1.997946
sharpe_ratio("APIMX")
[1] -7.231879
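
To see why the original condition never removes anything, here is a minimal base R sketch. The data frame toy, its values, and the function names keep_same_name, keep_renamed and wanted are all made up for illustration. It assumes the column really is called ticker, as the question's wording suggests. With the example data, where the column is tickers, both sides of ticker == ticker would instead resolve to the function argument, and the result is the same: a condition that is always TRUE.

```r
# Toy data (hypothetical values, not the data set from the question)
toy <- data.frame(
  ticker = c("YACKX", "YACKX", "APIMX"),
  ret    = c(0.01, 0.02, -0.03),
  stringsAsFactors = FALSE
)

# Same pattern as in the question: the argument shares its name with the column
keep_same_name <- function(ticker) {
  # subset() looks names up in the data frame first, so both sides are the
  # column toy$ticker and every row compares equal to itself
  subset(toy, ticker == ticker)
}

# Giving the argument a different (hypothetical) name removes the ambiguity
keep_renamed <- function(wanted) {
  # `wanted` is not a column, so it is taken from the function's own arguments
  subset(toy, ticker == wanted)
}

nrow(keep_same_name("APIMX"))  # 3: nothing was filtered out
nrow(keep_renamed("APIMX"))    # 1: only the APIMX row remains
```

The same lookup order applies inside filter, so keeping the argument name distinct from every column name is the simplest way to avoid this.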
