psychcoder

Reputation: 675

Calculate proportion of values within subgroup

I am trying to calculate, within each cond, the proportion of the total value that each type accounts for, but I am having trouble computing the sum of value by type within cond first. Does anyone have advice? Thank you!

toy dataset

cond  type  value
x      A     2
x      A     4
x      B     1
y      C     7
y      D     2
y      D     3
y      E     5
...    ...   ...

Desired output:
For example, the proportion of A within cond x would be 6/(6+1) = .857

cond type sum  proportion
x     A    6   .857
x     B    1   .143
y     C    7   .412
y     D    5   .294
y     E    5   .294
...   ...   ...

Upvotes: 1

Views: 704

Answers (3)

Ronak Shah

Reputation: 389155

For completeness, here is a data.table solution:

library(data.table)
setDT(df)[, sum(value), .(cond, type)][, proportion := V1/sum(V1), cond][]
#OR using prop.table
#setDT(df)[, sum(value), .(cond, type)][, proportion := prop.table(V1), cond][]

#   cond type V1 proportion
#1:    x    A  6  0.8571429
#2:    x    B  1  0.1428571
#3:    y    C  7  0.4117647
#4:    y    D  5  0.2941176
#5:    y    E  5  0.2941176

Upvotes: -1

Onyambu

Reputation: 79288

Another base R option is:

transform(aggregate(value~.,df,sum), prop = ave(value, cond,FUN = prop.table))

  cond type value      prop
1    x    A     6 0.8571429
2    x    B     1 0.1428571
3    y    C     7 0.4117647
4    y    D     5 0.2941176
5    y    E     5 0.2941176
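
For readers who find the one-liner dense, here is a rough step-by-step sketch of the same idea. It assumes the toy data from the question is stored in a data frame named `df`; the intermediate name `sums` is only illustrative and not part of the original answer.

```r
# Toy data from the question (assumed to live in `df`)
df <- data.frame(
  cond  = c("x", "x", "x", "y", "y", "y", "y"),
  type  = c("A", "A", "B", "C", "D", "D", "E"),
  value = c(2, 4, 1, 7, 2, 3, 5)
)

# Step 1: sum `value` within each cond/type combination
sums <- aggregate(value ~ cond + type, data = df, FUN = sum)

# Step 2: divide each sum by the total of its cond group;
# ave() applies FUN per group and returns a vector aligned with the rows
sums$prop <- ave(sums$value, sums$cond, FUN = function(v) v / sum(v))

sums
```

Splitting the steps makes it easier to inspect the grouped sums before the proportions are added.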

Upvotes: 1

akrun

Reputation: 887531

We can compute the grouped sum with summarise. By default, summarise drops the last grouping variable (here 'type'), so the result is still grouped by 'cond'. A subsequent mutate can therefore divide 'Sum' by the sum of the 'Sum' column within each 'cond'.

library(dplyr)
df1 %>%
    group_by(cond, type) %>%
    summarise(Sum = sum(value)) %>%
    mutate(proportion = Sum/sum(Sum))
# A tibble: 5 x 4
# Groups:   cond [2]
#  cond  type    Sum proportion
#  <chr> <chr> <int>      <dbl>
#1 x     A         6      0.857
#2 x     B         1      0.143
#3 y     C         7      0.412
#4 y     D         5      0.294
#5 y     E         5      0.294

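Or use prop.table from base R on a cross-tabulation. This returns a wide cond-by-type table of row proportions rather than one row per cond/type pair: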

prop.table(xtabs(value ~ cond + type, df1), 1)
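
If the long layout from the question is preferred, the cross-tabulation can be reshaped afterwards. The following is a minimal sketch, not part of the original answer; the names `tab` and `long` are illustrative, and the final filter assumes that a proportion of zero only arises from cond/type pairs that never occur together.

```r
# Same toy data as in the answer
df1 <- data.frame(
  cond  = c("x", "x", "x", "y", "y", "y", "y"),
  type  = c("A", "A", "B", "C", "D", "D", "E"),
  value = c(2L, 4L, 1L, 7L, 2L, 3L, 5L)
)

# margin = 1 normalises each row (each cond) to sum to 1
tab <- prop.table(xtabs(value ~ cond + type, df1), 1)

# Convert the table to one row per cond/type pair
long <- as.data.frame(tab, responseName = "proportion")

# Drop pairs that do not occur in the data (assumption: zero means "absent")
long <- long[long$proportion > 0, ]

long
```

Note that the filter would also drop a genuine cond/type pair whose values sum to zero, so it may need adjusting for real data.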

data

df1 <- structure(list(cond = c("x", "x", "x", "y", "y", "y", "y"), type = c("A", 
"A", "B", "C", "D", "D", "E"), value = c(2L, 4L, 1L, 7L, 2L, 
3L, 5L)), class = "data.frame", row.names = c(NA, -7L))

Upvotes: 0
