aggregate data and list frequencies of a variable in a string

Question

I have the following datatable

dt<-data.table(string=c("A","A","A","B","B","C"), source=c("a","b","c","a","b","a"), N=rep(1,6))
    string source  N
1:      A      a   1
2:      A      b   1
3:      A      c   1
4:      B      a   1
5:      B      b   1
6:      C      a   1

What I would like to have, is an aggregated table of the following form:

   string   a_b_c
1:      A   1|1|1
2:      B   1|1|0
3:      C   1|0|0

That means, I would like to drop the column with the source of N, but keep this information in a new column that contains information whether we have information from source a|b|c. The problem is that I do not know how to add the missing 0 if we do not have this information in the original dt.

If I try the code dt <- dt[, paste(N, collapse="|"), by = string], I get something like this:

   string     V1
 1:      A  1|1|1
 2:      B    1|1
 3:      C      1

Thank you very much for your help!

Onyambu · Accepted Answer

library(data.table)
dcast(dt,string~source,fill=0,value.var = "N")[,paste0(.SD,collapse = "|"),by=string]
   string    V1
1:      A 1|1|1
2:      B 1|1|0
3:      C 1|0|0

aggregate data and list frequencies of a variable in a string

Answers (2)

Related Questions