Sam

Reputation: 1482

R: perform a summary operation and subset the result by a data.table column

I want to use a list that lives outside my data.table to populate a new column in that data.table. In this case, the new column should hold the length of the list element that a data.table attribute points to:

library(data.table)

# dummy list. I am interested in extracting the vector length of each list element
l <- list(a=c(3,5,6,32,4), b=c(34,5,6,34,2,4,6,7), c = c(3,4,5))

# dummy dt; the number after the underscore in Attri2 is the position of the list element I want the length of
dt <- data.table(Attri1 = c("t","y","h","g","d","e","d"), 
                 Attri2 = c("fghd_1","sdafsf_3","ser_1","fggx_2","sada_2","sfesf_3","asdas_2"))

# extract that number to a new attribute, just for clarity
dt[, list_gp := tstrsplit(Attri2, "_", fixed=TRUE, keep=2)]

# then calculate the lengths of the vectors in the list, and attempt to subset by the index taken above
dt[,list_len := '[['(lapply(l, length),list_gp)]

Error in lapply(l, length)[[list_gp]] : no such index at level 1

I expected the list_len column to be 5, 3, 5, 8, 8, 3, 8.

Upvotes: 1

Views: 48

Answers (1)

Sirius

Reputation: 5429

A couple of things.

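  1. tstrsplit gives you a list of character vectors, so list_gp holds strings such as "1". A character index is matched against the list names (a, b, c), not positions, so convert it to a number first.
  2. [[ extracts a single element, so it cannot take a whole column of indices (a vector passed to [[ is treated as recursive indexing). Use [ to pick several elements and take the length of each. See the proposed solution: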

dt[, list_gp := as.numeric( tstrsplit(Attri2, "_", fixed=TRUE, keep=2)[[1]] )]

dt[, list_len := sapply( l[ list_gp ], length ) ]

Output:


> dt
   Attri1   Attri2 list_gp list_len
1:      t   fghd_1       1        5
2:      y sdafsf_3       3        3
3:      h    ser_1       1        5
4:      g   fggx_2       2        8
5:      d   sada_2       2        8
6:      e  sfesf_3       3        3
7:      d  asdas_2       2        8
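
As a possible alternative, here is a minimal sketch that swaps sapply(..., length) for base R's lengths(), which returns the length of every list element in one call. The result can then be indexed by position with the whole list_gp column at once. It reuses the l and Attri2 data from the question; using as.integer() rather than as.numeric() and dropping the names with unname() are my own choices, not something the question requires.

```r
library(data.table)

# Same dummy data as in the question
l <- list(a = c(3, 5, 6, 32, 4), b = c(34, 5, 6, 34, 2, 4, 6, 7), c = c(3, 4, 5))
dt <- data.table(Attri2 = c("fghd_1", "sdafsf_3", "ser_1", "fggx_2",
                            "sada_2", "sfesf_3", "asdas_2"))

# Take the part after the underscore and convert it to an integer position in l
dt[, list_gp := as.integer(tstrsplit(Attri2, "_", fixed = TRUE, keep = 2)[[1]])]

# lengths(l) is a named integer vector (a = 5, b = 8, c = 3);
# indexing it with the list_gp column is vectorized over all rows
dt[, list_len := unname(lengths(l)[list_gp])]

print(dt)
```

This should give the same list_len values as the sapply version above, without looping over the selected list elements row by row.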

Upvotes: 2
