volkan g
volkan g

Reputation: 296

Finding the minimum of a column and adding a new column by a group

I need to find and add a column with value '1':

So for the following DT,

set.seed(2)
library(data.table)
test_surfaces<-data.table(
  x_axis=c(sample(1:10,20,replace = T)),
  y_axis=c(sample(1:10,20,replace = T)),
  z_axis=c(sample(1:3,20,replace=T)),
  avail=c(sample(0:1,20,replace=T))
)

I tried

test_surfaces[,.SD[which(y_axis==min(y_axis))][1,candidate:=1],by=z_axis]
test_surfaces[,.SD[which(x_axis==min(x_axis))][1,candidate:=1],by=z_axis]

which gives output

   z_axis x_axis y_axis avail cand
1:      3      2      2     0    1
2:      3      6      2     0   NA
3:      3      1      2     1   NA
4:      1      6      1     0    1
5:      2      9      4     0    1

which is ok for finding the minimum by group. But

  1. It does not compare x_axises and select the one with minimum y_axis (i.e. it should select row 3, instead of row 1 for z_axis=3
  2. Data table is not modified permanently (i.e. column 'cand' is not permanenet). For this, i know I can use join(), but it's not elegant.

Upvotes: 1

Views: 72

Answers (1)

Mike H.
Mike H.

Reputation: 14360

You could try to sort by x_axis and y_axis and then simply assign the first value to be 1 for each group. This will place the lowest x_axis/y_axis combination first for each group.

test_surfaces[order(z, x_axis, y_axis), 
              candidate:=rep(c(1,0), times = c(1, .N - 1)), by = z]
test_surfaces[order(z, y_axis, x_axis), candidate:=c(1, candidate[-1]), by = z]

head(test_surfaces[order(z, x_axis, y_axis)])
#   x_axis y_axis z avail candidate
#1:      2      2 1     0         1
#2:      3      2 1     0         0
#3:      5      6 1     0         0
#4:      6      1 1     0         1
#5:      6      9 1     1         0
#6:      8      4 1     1         0

Upvotes: 1

Related Questions