Reputation: 518
I have a dataset in which I want to create a barplot by grouping columns, but I want the count of each group to reflect a unique column value for that group.
DT <- data.table(id=c(1,1,1,1,2,2,2,3,3,3,4,4,4,5,5),
time=c(0,1,2,3,0,1,2,0,1,2,0,1,2,0,1),
exposure=c(1,2,0,3,2,3,1,0,3,3,0,0,3,3,0),
tmin=c(rep(2,15)),
rule=c(rep("a",7),rep("b",6),rep("c",2)))
tmin <- 2
DT[time<=tmin,exposure.sum:=sum(exposure),by=rule]
I want to group the rule
column and have the counts for rule a,b,c to be 9,9,3, receptively. As seen in the exposure.sum
column
> DT[time<=tmin,]
id time exposure tmin rule exposure.sum
1: 1 0 1 2 a 9
2: 1 1 2 2 a 9
3: 1 2 0 2 a 9
4: 2 0 2 2 a 9
5: 2 1 3 2 a 9
6: 2 2 1 2 a 9
7: 3 0 0 2 b 9
8: 3 1 3 2 b 9
9: 3 2 3 2 b 9
10: 4 0 0 2 b 9
11: 4 1 0 2 b 9
12: 4 2 3 2 b 9
13: 5 0 3 2 c 3
14: 5 1 0 2 c 3
Now, I've created a simple barplot but the counts for each rule turns out to be the entire sum of exposure.sum
for each respective rule; I end up with a barplot with counts of 54,54,6 when I actually want them to be 9,9,3 (i.e. the unique value of exposure.sum
for that rule)
ggplot(DT[time<=tmin,]) +
geom_bar(aes(x=rule,y=exposure.sum),stat="identity")
Upvotes: 0
Views: 107
Reputation: 679
This is a solution using dplyr.
library(tidyverse)
DT <- data.table(id=c(1,1,1,1,2,2,2,3,3,3,4,4,4,5,5),
time=c(0,1,2,3,0,1,2,0,1,2,0,1,2,0,1),
exposure=c(1,2,0,3,2,3,1,0,3,3,0,0,3,3,0),
tmin=c(rep(2,15)),
rule=c(rep("a",7),rep("b",6),rep("c",2)))
tmin <- 2
DT[time<=tmin,exposure.sum:=sum(exposure),by=rule]
DT[time <= tmin] %>% group_by(rule) %>% summarise(exposure.sum = sum(exposure)) %>%
ggplot(aes(x = rule, y = exposure.sum)) + geom_bar(stat = 'identity') +
scale_y_continuous(breaks = (1:9)) + geom_text(aes(label = exposure.sum, y = exposure.sum - 0.6), size = 10, color = 'white')
which produces the following output:
Upvotes: 2