Reputation: 1763
I would like to create a new difference column in a data.frame based on a condition. For example, I have this data.frame:
df <- structure(list(ID = c(1, 1, 2, 2), Condition = c("a", "b", "a",
"b"), Value = c(20, 30, 50, 45)), class = "data.frame", row.names = c(NA,
-4L))
  ID Condition Value
1  1         a    20
2  1         b    30
3  2         a    50
4  2         b    45
Then, for each ID, I would like a new column that holds Value when Condition is a, and the difference in Value between b and a (b - a) when Condition is b. In other words, I would like to obtain the following, but I'm struggling:
  ID Condition Value Diff
1  1         a    20   20
2  1         b    30   10
3  2         a    50   50
4  2         b    45   -5
How would you do this? Thanks.
Upvotes: 0
Views: 75
Reputation: 11546
Will this work?
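library(dplyr)
df %>%
  # sort so that each ID's 'a' row comes directly before its 'b' row
  arrange(ID, Condition) %>%
  # lag() is not grouped by ID here, so this assumes every ID has an 'a' row
  mutate(Diff = case_when(Condition == 'a' ~ Value,
                          TRUE ~ Value - lag(Value)))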
  ID Condition Value Diff
1  1         a    20   20
2  1         b    30   10
3  2         a    50   50
4  2         b    45   -5
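A hedged variant, in case some IDs in the real data lack an 'a' row: grouping by ID before calling lag() stops the subtraction from reaching into the previous ID, so such rows get NA instead of a misleading difference. The data frame df2 and the name res below are hypothetical and not from the question.

```r
library(dplyr)

# Hypothetical data (not from the question): ID 2 has no 'a' row
df2 <- data.frame(
  ID = c(1, 1, 2, 3, 3),
  Condition = c("a", "b", "b", "a", "b"),
  Value = c(20, 30, 45, 50, 45)
)

res <- df2 %>%
  arrange(ID, Condition) %>%
  group_by(ID) %>%  # lag() now only looks back within the same ID
  mutate(Diff = case_when(Condition == "a" ~ Value,
                          TRUE ~ Value - lag(Value))) %>%
  ungroup()

res$Diff
```

Here the lone 'b' row of ID 2 has no previous row in its group, so its Diff is NA rather than 45 - 30.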
Upvotes: 2
Reputation: 389265
You can do -
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(Diff = replace(Value, Condition == 'b',
                        Value[Condition == 'b'] - Value[Condition == 'a'])) %>%
  # Can also use ifelse if it is easier to understand
  # mutate(Diff = ifelse(Condition == 'b', Value[Condition == 'b'] - Value[Condition == 'a'], Value)) %>%
  ungroup
#      ID Condition Value  Diff
#   <dbl> <chr>     <dbl> <dbl>
# 1     1 a            20    20
# 2     1 b            30    10
# 3     2 a            50    50
# 4     2 b            45    -5
If your real data has only two conditions and you want to subtract the first value from the second, this can also be reduced to -
df %>%
  arrange(ID, Condition) %>%
  group_by(ID) %>%
  # assumes exactly two rows per ID: the last row is replaced by diff(Value)
  mutate(Diff = replace(Value, n(), diff(Value)))
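For comparison, the same "first value, then differences" idea can be sketched in base R with ave(). This is an alternative I am adding, not part of the answer above, and it assumes that after sorting, each ID's 'a' row comes first.

```r
# Data from the question
df <- data.frame(
  ID = c(1, 1, 2, 2),
  Condition = c("a", "b", "a", "b"),
  Value = c(20, 30, 50, 45)
)

# Sort so that 'a' precedes 'b' within each ID
df <- df[order(df$ID, df$Condition), ]

# Within each ID: keep the first Value, then append successive differences
df$Diff <- ave(df$Value, df$ID, FUN = function(v) c(v[1], diff(v)))

df$Diff
# 20 10 50 -5
```

Because diff() returns one element fewer than its input, prepending v[1] keeps the result the same length as the group, which also covers an ID with a single row.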
Upvotes: 1