Nneka
Nneka

Reputation: 1870

sum by condition in data.table in R

example data.table:

example <- data.table(name=c('black','black','black','red','red'),
                 type=c('chair','chair','sofa','sofa','plate'),
                 num=c(4,5,12,4,3), 
                 cost = c(20,22,219,17,4))

I want to summarize this data.table. For every name i want to know how many item are there. Then i am also interested to know the cost of chairs, sofas and plates for each color. So i would get:

example <- data.table(name=c('black','red'),
                      count=c(3,2),
                      chair = c(44,0), plate = c(0,4), NOsofa = c(219,17))

i can get the count:

example[,.(count = .N), by="name"]

but struggle how to create thee remaining columns?

Upvotes: 0

Views: 72

Answers (2)

akrun
akrun

Reputation: 887981

We can make use of the fun.aggregate in dcast and do a join

library(data.table)
dcast(example, name ~ type, value.var = 'cost', sum)[example[,
     .(count = .N), name], on = .(name)]
#    name chair plate sofa count
#1: black    42     0  219     3
#2:   red     0     4   17     2

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 389335

You could first add count for each name.

library(data.table)
example[,count := .N, name]

Then sum cost and reshape

dcast(example[, .(cost  = sum(cost)), .(name, type, count)], 
          name + count~type, value.var = 'cost', fill = 0)


#    name count chair plate sofa
#1: black     3    42     0  219
#2:   red     2     0     4   17

Using tidyverse this could be done as :

library(dplyr)

example %>%
  group_by(name) %>%
  mutate(count = n()) %>%
  group_by(type, count, add = TRUE) %>%
  summarise(cost = sum(cost)) %>%
  tidyr::pivot_wider(names_from = type, values_from = cost, 
         names_prefix = 'NO', values_fill = list(cost = 0))

Upvotes: 2

Related Questions