empetrum
empetrum

Reputation: 115

How can I subtract values within one column based on values in mutliple other columns?

I have a dataframe like this:

dat <- data.frame(c = c(rep(0, 3), rep(5, 3), rep(10, 3)),
                  id = c(rep(c("A","B","C"), 3)),
                  measurement = c(1:8, 1))

dat
#    c id measurement
# 1  0  A           1
# 2  0  B           2
# 3  0  C           3
# 4  5  A           4
# 5  5  B           5
# 6  5  C           6
# 7 10  A           7
# 8 10  B           8
# 9 10  C           1

I want to subtract the values in the column "measurement" where c is 0 from all other values in this column. This should happen separately based on the info given in the column "id". E.g. the value where c is 0 and "id" is A should be subtracted from all values where c is > 0 and "id" is A. The value where c is 0 and "id" is B should be subtracted from all values where c is > 0 and "id" is B and so on.

If the difference would be negative the result should be 0.

The result should look like this:

result <- data.frame(c = c(rep(0, 3), rep(5, 3), rep(10, 3)),
                 id = c(rep(c("A","B","C"), 3)),
                 measurement = c(1:8, 1),
                 difference = c(0,0,0,3,3,3,6,6,0))

result
#    c id measurement difference
# 1  0  A           1          0
# 2  0  B           2          0
# 3  0  C           3          0
# 4  5  A           4          3
# 5  5  B           5          3
# 6  5  C           6          3
# 7 10  A           7          6
# 8 10  B           8          6
# 9 10  C           1          0

I used dplyr to select the values of "measurement" based on the info from the other columns, but unfortunately I don't know how to do the calculations. So any suggestions are welcome!

Upvotes: 2

Views: 888

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388817

For each id you can subtract measurement values with the value where c = 0. Using pmax we replace negative values with 0.

library(dplyr)

dat %>%
  group_by(id) %>%
  mutate(difference = pmax(measurement - measurement[c == 0], 0))

#     c id    measurement difference
#  <dbl> <chr>       <dbl>      <dbl>
#1     0 A               1          0
#2     0 B               2          0
#3     0 C               3          0
#4     5 A               4          3
#5     5 B               5          3
#6     5 C               6          3
#7    10 A               7          6
#8    10 B               8          6
#9    10 C               1          0

Upvotes: 2

Duck
Duck

Reputation: 39595

Try this. You can use a join and filter the data for you defined filter. After that dplyr verbs are useful to reach the expected output:

library(dplyr)
#Code
new <- dat %>%
  left_join(
    dat %>% filter(c==0) %>% select(-c) %>% rename(Var=measurement)
  ) %>%
  mutate(measurement=measurement-Var) %>%
  replace(.<=0,0) %>% select(-Var)

Output:

   c id measurement
1  0  A           0
2  0  B           0
3  0  C           0
4  5  A           3
5  5  B           3
6  5  C           3
7 10  A           6
8 10  B           6
9 10  C           0

Upvotes: 0

Related Questions