Irina
Irina

Reputation: 29

R data.table, how to get complete dt, when using .SDcols

consider following dt:

dt <- data.table(id=c(rep(1,11),rep(2,10)),col1=c(100:80),col2=c(95:75),col3=c(2,100:81)) 
changeCols <- c("col1","col2","col3")
newCols <- paste0(changeCols, "_diff")
dt[, (newCols) := .SD - shift(.SD), by=id, .SDcols=changeCols][] 

Now, I need to detect "jumps" on the second position (in that case remove the first position) in several columns, like it is the case in the col3_diff, I try do do it via:

dt[, if (.SD[2]>0) .SD[-1] else .SD, by=id, .SDcols=newCols] 

I get only the subset of the data.table, i.e. I lose col1,col2, col3, but I need the complete dt.

Any idea how to do it?

Thanks a lot in advance!

Upvotes: 0

Views: 135

Answers (2)

r2evans
r2evans

Reputation: 160597

library(data.table)
dt[, .SD[rowSums(!sapply(.SD[, newCols, with=FALSE],
                         function(z) z[2] < 0 | seq_along(z) != 1L)) == 0,],
   by = id]
#        id  col1  col2  col3 col1_diff col2_diff col3_diff
#     <num> <int> <int> <num>     <int>     <int>     <num>
#  1:     1    99    94   100        -1        -1        98
#  2:     1    98    93    99        -1        -1        -1
#  3:     1    97    92    98        -1        -1        -1
#  4:     1    96    91    97        -1        -1        -1
#  5:     1    95    90    96        -1        -1        -1
#  6:     1    94    89    95        -1        -1        -1
#  7:     1    93    88    94        -1        -1        -1
#  8:     1    92    87    93        -1        -1        -1
#  9:     1    91    86    92        -1        -1        -1
# 10:     1    90    85    91        -1        -1        -1
# 11:     2    89    84    90        NA        NA        NA
# 12:     2    88    83    89        -1        -1        -1
# 13:     2    87    82    88        -1        -1        -1
# 14:     2    86    81    87        -1        -1        -1
# 15:     2    85    80    86        -1        -1        -1
# 16:     2    84    79    85        -1        -1        -1
# 17:     2    83    78    84        -1        -1        -1
# 18:     2    82    77    83        -1        -1        -1
# 19:     2    81    76    82        -1        -1        -1
# 20:     2    80    75    81        -1        -1        -1
#        id  col1  col2  col3 col1_diff col2_diff col3_diff

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389135

You can use the following -

library(data.table)
if(any(dt[2, ..newCols] > 0)) dt <- dt[-2]

Upvotes: 0

Related Questions