Reputation: 5081
I am using quantmod to download option chains, which come in the form of nested lists.
However, for my purposes I would rather have the information in a single data frame, where the name of each list is stored in a column of the data frame (so two extra columns would be needed: one containing the date of the option, and the other the type of the option, call or put).
How can this be accomplished in R?
For a reproducible example:
library(quantmod)
AAPL.2015 <- getOptionChain("AAPL", "2019/2021")
And, if possible, what should I do to get the option dates in English?
Upvotes: 2
Views: 1430
Reputation: 107767
Consider Map to rbind the individual calls and puts data frames, adding the needed indicator columns. Because the individual data frames reside within a nested, named list, the extract function, [[, is used.
Then, run a final do.call + rbind across the resulting list of data frames. NOTE: rbind assumes the call and put data frames have exactly the same column names and the same number of columns.
# Stack one expiry's calls and puts, tagging each row with its type and date
call_put_func <- function(nm, call_df, put_df) {
  cbind(rbind(transform(call_df, option_type = "call"),
              transform(put_df, option_type = "put")
        ), date_of_strike = nm)
}
# Apply the helper to every expiry in the nested list
APPL_flat_df_list <- Map(call_put_func, nm = names(AAPL.2015),
                         call_df = lapply(AAPL.2015, "[[", "calls"),
                         put_df = lapply(AAPL.2015, "[[", "puts")
                     )
# Row-bind the per-expiry data frames into one
APPL_df <- do.call(rbind, unname(APPL_flat_df_list))
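To try the approach without a network call, here is a minimal sketch on a made-up nested list that mimics the shape of the getOptionChain result. The names toy_chain and stack_expiry and all the numbers are invented for illustration; only the structure (dates at the top level, calls and puts below) is taken from the question.

```r
# Toy stand-in for the list returned by getOptionChain(); values are invented.
toy_chain <- list(
  "Sep.27.2019" = list(
    calls = data.frame(Strike = c(140, 145), Last = c(79.50, 75.85)),
    puts  = data.frame(Strike = c(140, 145), Last = c(0.01, 0.02))
  ),
  "Oct.04.2019" = list(
    calls = data.frame(Strike = 150, Last = 72.22),
    puts  = data.frame(Strike = 150, Last = 0.03)
  )
)

# Same idea as call_put_func: stack calls and puts, then tag every row.
stack_expiry <- function(nm, call_df, put_df) {
  cbind(
    rbind(transform(call_df, option_type = "call"),
          transform(put_df,  option_type = "put")),
    date_of_strike = nm
  )
}

toy_list <- Map(stack_expiry,
                nm      = names(toy_chain),
                call_df = lapply(toy_chain, "[[", "calls"),
                put_df  = lapply(toy_chain, "[[", "puts"))

# One data frame with Strike, Last, option_type and date_of_strike columns
toy_df <- do.call(rbind, unname(toy_list))
print(toy_df)
```

Each row keeps its original columns and gains the two indicator columns, so filtering by date or by option type becomes an ordinary subset.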
Upvotes: 1
Reputation: 16871
It might be more flexible to work with a couple of functions from dplyr and purrr. dplyr::bind_rows can take a list of data frames, and their columns can differ (missing columns are filled with NA), whereas base rbind requires every data frame to have the same column names. bind_rows also has an argument .id that will create a column of list item names. purrr::map_dfr calls a function over a list and returns a data frame of the results row-bound together; because it wraps bind_rows, it also has an .id argument.
Having access to setting those IDs is helpful because you have 2 sets of IDs: one of dates, and one of calls vs puts. Setting one ID within the inner bind_rows and one within map_dfr gets both.
Written out with a function, to make it a little easier to see:
library(quantmod)
AAPL.2015 <- getOptionChain("AAPL", "2019/2021")
aapl_df <- purrr::map_dfr(AAPL.2015, function(d) {
  dplyr::bind_rows(d, .id = "type")
}, .id = "date")
head(aapl_df)
#> date type Strike Last Chg Bid Ask Vol OI
#> 1 Sep.27.2019 calls 140 79.50 9.50 77.60 77.90 10 30
#> 2 Sep.27.2019 calls 145 75.85 0.00 72.70 73.30 NA 28
#> 3 Sep.27.2019 calls 150 72.22 0.00 67.85 67.90 10 91
#> 4 Sep.27.2019 calls 155 52.53 0.00 65.80 69.90 NA 10
#> 5 Sep.27.2019 calls 160 60.10 0.00 57.85 58.15 2 11
#> 6 Sep.27.2019 calls 165 54.40 15.95 52.65 52.90 9 16
Or in more common dplyr piping with function shorthand notation:
library(dplyr)
aapl_df <- AAPL.2015 %>%
  purrr::map_dfr(~bind_rows(., .id = "type"), .id = "date")
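On the side question about English dates: the date column comes back as text such as Sep.27.2019, and parsing or printing month names in R depends on the LC_TIME locale. The following is a rough base R sketch, assuming the labels follow the pattern shown in the output above; the vector expiry_labels is a made-up stand-in for aapl_df$date.

```r
# Expiry labels in the format shown in the output above; toy values.
expiry_labels <- c("Sep.27.2019", "Oct.04.2019")

# %b and %B use month names from the current locale, so switch LC_TIME
# to the "C" locale (English names) and restore the old setting afterwards.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")

expiry_dates <- as.Date(expiry_labels, format = "%b.%d.%Y")  # proper Date class
english_text <- format(expiry_dates, "%d %B %Y")              # English month names

Sys.setlocale("LC_TIME", old_locale)

print(expiry_dates)
print(english_text)
```

Storing the column as a Date also makes sorting and date arithmetic work, which the text labels do not.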
Upvotes: 1
Reputation: 444
I couldn't reproduce your example, but what you're trying to do is simple. You could use do.call to call the rbind function on the list, and what you get at the end is a pretty dataframe.
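library(quantmod)
chain <- getOptionChain("AAPL", "2019/2021")  # avoid naming a variable `list`
# Flatten one level so rbind receives the calls/puts data frames themselves
data <- do.call(rbind, unlist(chain, recursive = FALSE))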
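One caveat worth checking: the chain is a list of lists, so calling rbind on it directly binds the inner lists rather than the data frames. A small sketch on invented toy data (toy_chain and its values are made up, shaped like the result in the question) shows the difference and a one-level flatten with unlist:

```r
# Toy stand-in for the nested getOptionChain() result; values are invented.
toy_chain <- list(
  "Sep.27.2019" = list(
    calls = data.frame(Strike = c(140, 145), Last = c(79.50, 75.85)),
    puts  = data.frame(Strike = c(140, 145), Last = c(0.01, 0.02))
  ),
  "Oct.04.2019" = list(
    calls = data.frame(Strike = 150, Last = 72.22),
    puts  = data.frame(Strike = 150, Last = 0.03)
  )
)

# rbind on the nested list yields a list-matrix, not a data frame
direct <- do.call(rbind, toy_chain)
print(is.data.frame(direct))

# Flatten one level: elements are now data frames named "<date>.<type>"
flat <- unlist(toy_chain, recursive = FALSE)
print(names(flat))

# Now rbind sees data frames and returns a single data frame
toy_df <- do.call(rbind, flat)
print(toy_df)
```

The date and option type survive only in the names of the flattened list, so this route suits a quick look more than further analysis.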
Upvotes: 2