Reputation: 15
I have the following datatable
dt<-data.table(string=c("A","A","A","B","B","C"), source=c("a","b","c","a","b","a"), N=rep(1,6))
string source N
1: A a 1
2: A b 1
3: A c 1
4: B a 1
5: B b 1
6: C a 1
What I would like to have, is an aggregated table of the following form:
string a_b_c
1: A 1|1|1
2: B 1|1|0
3: C 1|0|0
That means, I would like to drop the column with the source of N, but keep this information in a new column that contains information whether we have information from source a|b|c. The problem is that I do not know how to add the missing 0 if we do not have this information in the original dt.
If I try the code
dt <- dt[, paste(N, collapse="|"), by = string]
, I get something like this:
string V1
1: A 1|1|1
2: B 1|1
3: C 1
Thank you very much for your help!
Upvotes: 0
Views: 69
Reputation: 79228
library(data.table)
dcast(dt,string~source,fill=0,value.var = "N")[,paste0(.SD,collapse = "|"),by=string]
string V1
1: A 1|1|1
2: B 1|1|0
3: C 1|0|0
Upvotes: 1
Reputation: 2088
Something like that ? Not as neat as the one liner from the comments.. (with dplyr and data.table which you already use)
dt2 <- dt %>%
data.table::dcast(string ~ source) %>%
mutate(a_b_c = paste(a, b, c, sep="|")) %>%
select(string, a_b_c)
giving
string a_b_c
1 A 1|1|1
2 B 1|1|0
3 C 1|0|0
Upvotes: 1