Leonhardt Guass

Reputation: 793

How to divide between groups of rows using dplyr with multiple columns?

This is an extension of an earlier question. I want to figure out how to divide between groups of rows using dplyr across multiple columns, rather than for a single variable.

I have this dataframe:

x <- data.frame(
    name = rep(letters[1:4], each = 2),
    condition = rep(c("A", "B"), times = 4),
    value1 = c(2,10,4,20,8,40,20,100),
    value2 = c(2,10,4,20,8,40,20,100)
)
#   name condition value1 value2
# 1    a         A      2      2
# 2    a         B     10     10
# 3    b         A      4      4
# 4    b         B     20     20
# 5    c         A      8      8
# 6    c         B     40     40
# 7    d         A     20     20
# 8    d         B    100    100

I want to group by name and divide the values in rows where condition == "B" by the values in rows where condition == "A", to get this:

data.frame(
    name = letters[1:4],
    value1 = c(5,5,5,5),
    value2 = c(5,5,5,5)
)
#   name value1 value2
# 1    a      5      5
# 2    b      5      5
# 3    c      5      5
# 4    d      5      5

The most upvoted answer to the original question, by Steven Beaupré, handles a single variable:

x %>%
  group_by(name) %>%
  summarise(value = value[condition == "B"] / value[condition == "A"])

But that answer only covers the single-value case, and I don't know how to extend it to summarise_at or summarise_all. I tried using the dot (.) placeholder, but could not figure out the correct syntax.

Upvotes: 1

Views: 431

Answers (2)

Leonhardt Guass

Reputation: 793

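I figured out a way to do it: reshape the value columns to long format with gather, compute the ratio for each variable and name, then spread back to wide format.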

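library(dplyr)
library(tidyr)

x %>%
  gather(variable, value, -(name:condition)) %>%  # long format: one row per name/condition/variable
  group_by(variable, name) %>%
  summarise(value = value[condition == "B"] / value[condition == "A"]) %>%
  spread(variable, value)  # back to one column per variable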

#   name  value1 value2
#   <fct>  <dbl>  <dbl>
# 1 a          5      5
# 2 b          5      5
# 3 c          5      5
# 4 d          5      5
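
For readers on a newer tidyr, the same reshape-then-divide idea can be sketched with pivot_longer and pivot_wider, which tidyr (1.0.0 and later) documents as the replacements for gather and spread. This is only a sketch under a few assumptions: dplyr 1.0.0 or later (for the .groups argument), exactly one "A" row and one "B" row per name, and the intermediate names variable, value, and result, which are just illustrative labels.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value1 = c(2, 10, 4, 20, 8, 40, 20, 100),
  value2 = c(2, 10, 4, 20, 8, 40, 20, 100)
)

result <- x %>%
  # Stack value1 and value2 into a single "value" column
  pivot_longer(c(value1, value2), names_to = "variable", values_to = "value") %>%
  group_by(name, variable) %>%
  # Assumes exactly one A row and one B row in each group
  summarise(value = value[condition == "B"] / value[condition == "A"],
            .groups = "drop") %>%
  # Spread the ratios back out to one column per variable
  pivot_wider(names_from = variable, values_from = value)

print(result)
```

With the example data every ratio is 5, matching the output above.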

Upvotes: 1

Trent

Reputation: 813

I'm not sure whether there is a way to extend this automatically to every variable. I think you need to write out the summary expression for each value column.

x %>%
  group_by(name) %>%
  summarise(value1 = value1[condition == "B"] / value1[condition == "A"],
            value2 = value2[condition == "B"] / value2[condition == "A"])

#   name  value1 value2
#   <fct>  <dbl>  <dbl>
# 1 a          5      5
# 2 b          5      5
# 3 c          5      5
# 4 d          5      5
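
That said, one way to avoid repeating the expression, sketched here rather than offered as the definitive approach, is to pass a lambda to across() inside summarise(). This assumes dplyr 1.0.0 or later and exactly one "A" row and one "B" row per name; the name ratios is only an illustrative label. Inside the lambda, .x stands for the current column, while condition is looked up in the grouped data.

```r
library(dplyr)

# Same example data as in the question
x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value1 = c(2, 10, 4, 20, 8, 40, 20, 100),
  value2 = c(2, 10, 4, 20, 8, 40, 20, 100)
)

ratios <- x %>%
  group_by(name) %>%
  # Apply the same B / A division to each selected column;
  # assumes one A row and one B row per name
  summarise(across(c(value1, value2),
                   ~ .x[condition == "B"] / .x[condition == "A"]))

print(ratios)
```

On older dplyr versions, summarise_at(vars(value1, value2), ...) should accept the same lambda, though I have not checked that against every release.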

Upvotes: 0
