Reputation: 946
I am doing operations in two columns of a data.table in R. I have first filtered the datatable by DATE, by1min and by EX, but then I realized that I actually don't want the data within the 1 minute bin. I want it for everything before that bin.
What I had before was:
bid_ask <- new_file[,list(btail=tail(BID,n=1),atail=tail(ASK,n=1)), by=c("DATE","by1min","EX")]
And now I wanted to do something like:
bid_ask <- new_file[,list(btail=tail(BID,n=1),atail=tail(ASK,n=1)), by=c("DATE",TIME_M[TIME_M<by1min],"EX")]
But this does not seem to be the structure for data.table.
Does anyone know how to filter conditionally within "by"?
My data looks like this:
DATE TIME_M EX SYM_SUFFIX BID ASK by1min
1: 2016-03-01 14400.02 P NA 60.00 95.00 14400
2: 2016-03-01 24889.07 T NA 60.01 65.00 24840
3: 2016-03-01 24889.25 T NA 60.01 64.99 24840
4: 2016-03-01 25085.24 T NA 60.01 64.99 25080
5: 2016-03-01 25085.36 T NA 62.50 64.99 25080
6: 2016-03-01 25103.37 T NA 62.51 64.99 25080
Upvotes: 0
Views: 274
Reputation: 25225
The syntax is something like this:
bid_ask <- new_file[, list(btail=tail(BID, n=1),atail=tail(ASK, n=1)), by=.(DATE, TIME_M < by1min, EX)]
bid_ask
# DATE TIME_M EX btail atail
#1: 2016-03-01 FALSE P 60.00 95.00
#2: 2016-03-01 FALSE T 62.51 64.99
#3: 2016-03-01 TRUE T 62.51 64.99
#4: 2016-03-01 TRUE P 60.00 95.00
from ?data.table
documentation:
by accepts:
- a list() of expressions of column names: e.g., DT[, .(sa=sum(a)), by=.(x=x>0, y)]
sample data:
dat <- fread("DATE TIME_M EX SYM_SUFFIX BID ASK by1min
2016-03-01 14400.02 P NA 60.00 95.00 14400
2016-03-01 24889.07 T NA 60.01 65.00 24840
2016-03-01 24889.25 T NA 60.01 64.99 24840
2016-03-01 25085.24 T NA 60.01 64.99 25080
2016-03-01 25085.36 T NA 62.50 64.99 25080
2016-03-01 25103.37 T NA 62.51 64.99 25080
2016-03-01 25000 T NA 62.51 64.99 25080
2016-03-01 14399 P NA 60.00 95.00 14400")
Upvotes: 1