Reputation: 73
Hi, I have a list of data frames called SPX400.
I have already used lapply to subset it: SPX400<- lapply(ALL400, function(x) x[x$ticker=="SPX",])
However, I now want to filter the data further and have tried the following, defining a function to be used by lapply:
Usedfilter <- function(x) {
mutate(fifteen = x[x$expirDate-x$trade_date-15]) %>% #making new variable
filter(fifteen >=0) %>% # filter on fifteen
mutate(Stockandexpricedif = x[x$stkPx-x$strike]) %>% #making new variable
filter(Stockandexpricedif < 10 & Stockandexpricedif > -10) # filter on Stockandexpricedif
}
SPXfilter <- lapply(SPX400, Usedfilter)
I get the following error:
Error in is.data.frame(.data) :
argument ".data" is missing, with no default
Hope you can help me out.
This is the structure of SPX400:
$ : tibble [2,498 × 37] (S3: tbl_df/tbl/data.frame)
..$ ticker : chr [1:2498] "SPX" "SPX" "SPX" "SPX" ...
..$ stkPx : num [1:2498] 1923 1923 1923 1923 1923 ...
..$ expirDate : Date[1:2498], format: "2014-08-01" "2014-08-01" "2014-08-01" "2014-08-01" ...
..$ yte : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ strike : num [1:2498] 1350 1375 1400 1425 1450 ...
..$ cVolu : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ cOi : num [1:2498] 0 0 0 0 30 0 0 0 0 0 ...
..$ pVolu : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ pOi : num [1:2498] 770 3573 2246 7984 20967 ...
..$ cBidPx : num [1:2498] 569 544 519 494 469 ...
..$ cValue : num [1:2498] 574 549 524 499 474 ...
..$ cAskPx : num [1:2498] 581 556 531 506 481 ...
..$ pBidPx : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ pValue : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ pAskPx : num [1:2498] 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 ...
..$ cBidIv : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ cMidIv : num [1:2498] 0.00542 0.00567 0.00567 0 0.00567 0.00564 0.00567 0.00567 0.00567 0.00567 ...
..$ cAskIv : num [1:2498] 0.0108 0.0113 0.0113 0 0.0113 ...
..$ smoothSmvVol : num [1:2498] 0.321 0.327 0.333 0.333 0.235 ...
..$ pBidIv : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ pMidIv : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ pAskIv : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ iRate : num [1:2498] 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 ...
..$ divRate : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ residualRateData: num [1:2498] -10 -10 -10 -10 -10 ...
..$ delta : num [1:2498] 1 1 1 1 1 1 1 1 1 1 ...
..$ gamma : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ theta : num [1:2498] -138 -141 -143 -146 -148 ...
..$ vega : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ rho : num [1:2498] 0.000257 0.000261 0.000266 0.000271 0.000276 ...
..$ phi : num [1:2498] -0.000366 -0.000366 -0.000366 -0.000366 -0.000366 ...
..$ driftlessTheta : num [1:2498] -0.000975 -0.000993 -0.001011 -0.00103 -0.001047 ...
..$ extVol : num [1:2498] 0.102 0.102 0.271 0.271 0.102 ...
..$ extCTheo : num [1:2498] 593 568 543 518 493 ...
..$ extPTheo : num [1:2498] 0 0 0 0 0 0 0 0 0 0 ...
..$ spot_px : logi [1:2498] NA NA NA NA NA NA ...
..$ trade_date : Date[1:2498], format: "2014-08-01" "2014-08-01" "2014-08-01" "2014-08-01" ...
..- attr(*, "problems")= tibble [9,071 × 5] (S3: tbl_df/tbl/data.frame)
.. ..$ row : int [1:9071] 39322 39323 39324 39325 39326 39327 39328 39329 39330 39331 ...
.. ..$ col : chr [1:9071] "spot_px" "spot_px" "spot_px" "spot_px" ...
.. ..$ expected: chr [1:9071] "1/0/T/F/TRUE/FALSE" "1/0/T/F/TRUE/FALSE" "1/0/T/F/TRUE/FALSE" "1/0/T/F/TRUE/FALSE" ...
.. ..$ actual : chr [1:9071] "68.38" "68.38" "68.38" "68.38" ...
.. ..$ file : chr [1:9071] "'OSMV-20140801.csv'" "'OSMV-20140801.csv'" "'OSMV-20140801.csv'" "'OSMV-20140801.csv'" ...
Upvotes: 0
Views: 194
Reputation: 6226
Since your datasets are "big", I recommend using data.table. The solution below is untested.
First, convert all your tables to data.table format:
library(data.table)
lapply(SPX400, setDT)
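Then, define a generic function:
Usedfilter <- function(df){
  # Work on a copy so the input table is not modified by reference
  df2 <- data.table::copy(df)
  df2[, fifteen := get('expirDate') - get('trade_date') - 15]
  # Filter the copy (df2), since only it holds the new column
  df2 <- df2[get('fifteen') > 0]
  df2[, ('Stockandexpricedif') := get('stkPx') - get('strike')]
  # Alternative return: keep every row with -10 < Stockandexpricedif < 10
  #return(
  # df2[between(get('Stockandexpricedif') , -10, 10, incbounds=FALSE)]
  #)
  # Returns only the row with the smallest value of fifteen
  return(df2[which.min(fifteen)])
}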
And then, apply on all your dataframes:
SPX400_filtered <- lapply(SPX400, Usedfilter)
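As a rough illustration of the commented-out between() variant, here is a minimal sketch on made-up data. The table toy_dt and the function near_money are hypothetical names used only for this example, and it keeps rows with fifteen >= 0 as in the question rather than > 0:

```r
library(data.table)

# Made-up rows standing in for one element of SPX400 (assumption, not real data).
toy_dt <- data.table(
  stkPx      = c(1923, 1923, 1923),
  strike     = c(1920, 1950, 1915),
  expirDate  = as.Date(c("2014-08-16", "2014-08-30", "2014-08-05")),
  trade_date = as.Date("2014-08-01")
)

# Hypothetical helper: add both columns on a copy, then filter in one step.
near_money <- function(dt) {
  out <- copy(dt)  # avoid modifying the input by reference
  out[, fifteen := as.numeric(expirDate - trade_date) - 15]
  out[, Stockandexpricedif := stkPx - strike]
  out[fifteen >= 0 & between(Stockandexpricedif, -10, 10, incbounds = FALSE)]
}

res <- near_money(toy_dt)
```

The same function could then be passed to lapply over the whole list, assuming every element has these four columns.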
Upvotes: 1
Reputation: 388817
Try using:
new_data <- Filter(length, SPX400)
SPXfilter <- lapply(new_data, function(df) subset(df,
abs(expirDate - trade_date) >= 15 & abs(stkPx - strike) < 10))
This selects rows where the absolute difference between expirDate and trade_date is greater than or equal to 15 days and the absolute difference between stkPx and strike is less than 10.
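If you would rather keep the dplyr verbs from the question: the ".data is missing" error arises because the pipeline inside Usedfilter starts with mutate() and never receives x as its data. A minimal sketch of a corrected version follows, assuming dplyr is installed. The list toy is made-up data for illustration, and the date difference is taken without abs(), as in the question:

```r
library(dplyr)

# Made-up stand-in for SPX400: a list holding one small data frame.
toy <- list(
  data.frame(
    stkPx      = c(1923, 1923, 1923),
    strike     = c(1920, 1950, 1915),
    expirDate  = as.Date(c("2014-08-16", "2014-08-30", "2014-08-05")),
    trade_date = as.Date("2014-08-01")
  )
)

Usedfilter <- function(x) {
  x %>%  # start the pipe with the data frame itself
    mutate(fifteen = as.numeric(expirDate - trade_date) - 15) %>%
    filter(fifteen >= 0) %>%
    mutate(Stockandexpricedif = stkPx - strike) %>%
    filter(Stockandexpricedif < 10, Stockandexpricedif > -10)
}

toy_filtered <- lapply(toy, Usedfilter)
```

Inside mutate() and filter() the columns are referred to by bare name, so the x[x$...] indexing from the question is not needed.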
Upvotes: 0