newbie

Reputation: 907

How to apply a function to each row of a data.table

Suppose I have the following data.table:

library(data.table)
dt <- data.table(a = c(TRUE, TRUE, FALSE, FALSE), b = c(TRUE, FALSE, TRUE, FALSE))

which returns:

       a     b
1:  TRUE  TRUE
2:  TRUE FALSE
3: FALSE  TRUE
4: FALSE FALSE

I tried using function(x) min(which(x)) to find the position of the first TRUE in each row of dt, but it did not work. The result I expect is:

       a     b index
1:  TRUE  TRUE     1
2:  TRUE FALSE     1
3: FALSE  TRUE     2
4: FALSE FALSE  9999

where the index column gives the position of the first TRUE in the row, and 9999 is used when the row contains only FALSE.

FYI: In the real data, I have around 50 columns that contain TRUE and FALSE.

Could you please give me suggestions?

Upvotes: 6

Views: 2360

Answers (3)

alexis_laz

Reputation: 13122

Also,

ans = rep_len(9999L, nrow(dt))
for(i in length(dt):1L) ans[dt[[i]]] = i
ans
#[1]    1    1    2 9999
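
The loop above relies on assignment order: columns are visited from last to first, so an earlier column overwrites any index written by a later one, and rows with no TRUE keep the 9999 default. A minimal sketch of the same idea wrapped in a function follows; the name first_true_col and its none argument are hypothetical, and it assumes every column is logical.

```r
library(data.table)

# first_true_col() is a hypothetical helper name, not part of data.table.
# Assumes every column of dt is logical.
first_true_col <- function(dt, none = 9999L) {
  # Start with the "no TRUE found" value for every row
  ans <- rep_len(none, nrow(dt))
  # Visit columns from last to first so that earlier columns
  # overwrite later ones; the smallest matching index wins
  for (i in rev(seq_along(dt))) ans[dt[[i]]] <- i
  ans
}

dt <- data.table(a = c(TRUE, TRUE, FALSE, FALSE),
                 b = c(TRUE, FALSE, TRUE, FALSE))
res <- first_true_col(dt)
res
```

This makes one vectorized pass per column rather than one call per row, which may matter with many rows.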

Upvotes: 2

LyzandeR

Reputation: 37889

A bit late but this is a way:

#initial data.table - added a row id
dt <- data.table(a=c(T,T,F,F), b= c(T,F,T,F))[, id := .I]

#if the row sums equal 0 then 9999 else pick the first max i.e. the first TRUE
dt[, index := if(rowSums(.SD)==0) 9999 else as.double(which.max(.SD)), by=id]

Or as per @David's comment, in order to avoid the matrix conversion:

dt[, index := if(Reduce('+', .SD)==0) 9999 else as.double(which.max(.SD)), by=id]

Output:

> dt
       a     b id index
1:  TRUE  TRUE  1     1
2:  TRUE FALSE  2     1
3: FALSE  TRUE  3     2
4: FALSE FALSE  4  9999

Upvotes: 2

akrun

Reputation: 887831

For 50 columns, it is better to use max.col:

dt$index <- max.col(dt, 'first') *(!!rowSums(dt))

Or as @David Arenburg mentioned, more idiomatic code would be

dt[, index := max.col(.SD, ties.method = "first") * (!!rowSums(.SD))]

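If we need 9999 for the rows with no TRUE:

# ties.method = "first" is needed here too; the default ("random") picks an arbitrary TRUE
(max.col(dt, "first") * (!!rowSums(dt))) + (!rowSums(dt)) * 9999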
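
For reference, here is one way the max.col approach might be written out end to end. It is only a sketch: the names flag_cols, m and first_true are made up for illustration, and it assumes the index should be computed over the logical columns only, so it can be re-run after an index column has already been added.

```r
library(data.table)

dt <- data.table(a = c(TRUE, TRUE, FALSE, FALSE),
                 b = c(TRUE, FALSE, TRUE, FALSE))

# Hypothetical variable names; keep only the logical columns
flag_cols <- names(dt)[vapply(dt, is.logical, logical(1))]
m <- as.matrix(dt[, flag_cols, with = FALSE])

# Position of the first maximum in each row, i.e. the first TRUE
first_true <- max.col(m, ties.method = "first")

# Rows with no TRUE at all get the sentinel value
first_true[rowSums(m) == 0] <- 9999L

dt[, index := first_true]
dt
```

Overwriting the all-FALSE rows by position avoids the arithmetic with !!rowSums() and keeps the result an integer column.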

Upvotes: 6
