Reputation: 153
I have a dataframe:
set.seed(42)
ID <- sample(1:15, 100, replace = TRUE)
value <- sample(1:4, 100, replace = TRUE)
d <- data.frame(ID, value)
I want to group by ID and create a new column in which each value has all the other values in its group subtracted from it.
Just as sum() adds all the values in a group together, how do I subtract them?
library(dplyr)
d %>%
  group_by(ID) %>%
  # what's the "-" equivalent of sum()?
  mutate(value_c = sub(value))
Thanks
J
Upvotes: 0
Views: 1023
Reputation: 887951
An option with data.table:
library(data.table)
# setDT() converts d to a data.table by reference; := adds value_c within each ID group
setDT(d)[, value_c := 2 * value - sum(value), by = ID]
Upvotes: 2
Reputation: 102880
Here is a base R option using ave:
transform(
  d,
  value_c = 2 * value - ave(value, ID, FUN = sum)
)
Upvotes: 2
Reputation: 5429
Well, it's a somewhat odd calculation, but slightly to my own surprise, the following seems to do what you describe:
library(dplyr)

set.seed(42)
ID <- sample(1:15, 100, replace = TRUE)
value <- sample(1:4, 100, replace = TRUE)
d <- data.frame(ID, value)

d %>%
  group_by(ID) %>%
  # value minus the sum of the other values in the group
  mutate(value_c = value * 2 - sum(value)) %>%
  arrange(ID) %>%
  head(n = 20)
Produces:
# A tibble: 20 x 3
# Groups:   ID [3]
      ID value value_c
   <int> <int>   <dbl>
 1     1     1     -12
 2     1     1     -12
 3     1     4      -6
 4     1     1     -12
 5     1     1     -12
 6     1     2     -10
 7     1     4      -6
 8     2     4     -21
 9     2     3     -23
10     2     3     -23
11     2     2     -25
12     2     1     -27
13     2     1     -27
14     2     3     -23
15     2     3     -23
16     2     1     -27
17     2     4     -21
18     2     4     -21
19     3     4      -8
20     3     4      -8
You multiply value by 2 because value is already included in sum(value), which you didn't want; adding it back on the left-hand side cancels it out.
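To spell out the algebra: this assumes "subtracted from all others" means each value minus the sum of the remaining values in its group. The symbols are only illustrative and not from the question: v_i is one value in a group, c_i is the resulting value_c, and the sums run over that group.

```latex
\begin{align*}
c_i &= v_i - \sum_{j \neq i} v_j \\
    &= v_i - \Bigl(\sum_{j} v_j - v_i\Bigr) \\
    &= 2\,v_i - \sum_{j} v_j
\end{align*}
```

As a check against the output above, the ID 1 group has values 1, 1, 4, 1, 1, 2, 4, which sum to 14. For a value of 4, the others sum to 10, so 4 - 10 = -6, which matches 2 * 4 - 14 = -6. For a value of 1, the others sum to 13, so 1 - 13 = -12, which matches 2 * 1 - 14 = -12.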
Upvotes: 3