Fatih Ekici
Fatih Ekici

Reputation: 133

How to do subtraction multiple column by group in r

I have survey data from patients and relatives of the patient. That data looks like this.

X1  X2  X3  X4  X5
a   11  2   3   4
b   11  2   4   5
a   12  4   4   5
b   12  5   2   3
a   13  1   1   0
b   13  3   2   4

I want to subtract by different x1 and same x2 condition So the result should look like

X2  X3  X4  X5
11  0  -1  -1
12  -1  2   2
13  -2 -1  -4

I can solve that question using for cycle but I prefer a dplyr solution for readable code. Also, I get to experience dplyr style code

Thanks to all viewers.

Upvotes: 1

Views: 1402

Answers (3)

Peace Wang
Peace Wang

Reputation: 2419

data.table version

library(data.table)
dt <- fread("
X1  X2  X3  X4  X5
a   11  2   3   4
b   11  2   4   5
a   12  4   4   5
b   12  5   2   3
a   13  1   1   0
b   13  3   2   4")

dt[,.SD[X1 == "a"] - .SD[X1 == "b"],.SDcols = 3:5,by = X2]
# or
dt[, lapply(.SD[.N:1], diff), .SDcols = 3:5, by = X2]

Result

   X2 X3 X4 X5
1: 11  0 -1 -1
2: 12 -1  2  2
3: 13 -2 -1 -4

Upvotes: 1

Joel Buursma
Joel Buursma

Reputation: 118

Try this:

library(dplyr)
df <- data.frame( X1 = c("a","b","a","b","a","b"),
                  X2 = c(11,11,12,12,13,13),
                  X3 = c(2,2,4,5,1,3),
                  X4 = c(3,4,4,2,1,2),
                  X5 = c(4,5,5,3,0,4))
df$XMult <-with(df, case_when(X1=="a"~1, X1=="b"~-1))

df %>% 
  group_by( X2) %>%
  summarize(across(.cols = c("X3", "X4", "X5"), ~sum(.x*XMult)))

Or, alternately, without modifying df:

df %>% 
  group_by( X2) %>%
  summarize(across(.cols = c("X3", "X4", "X5"), 
                   ~sum(.x*case_when(X1=="a"~1, X1=="b"~-1))))

Output:

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 4
     X2    X3    X4    X5
  <dbl> <dbl> <dbl> <dbl>
1    11     0    -1    -1
2    12    -1     2     2
3    13    -2    -1    -4

One more slightly-clumsy alternative that uses lag instead of the diff in Ben's answer:

df %>% 
  group_by( X2) %>%
  mutate(across(.cols = X3:X5, ~(lag(.x) - .x))) %>%
  filter( X1 == "b") %>%
  select( -"X1")

Upvotes: 2

Desmond
Desmond

Reputation: 1147

This is @Ben's answer but modifying it slightly to get the result you specified:

Data:

library(dplyr)

df <- tibble::tribble(~ X1..X2..X3..X4..X5,
                      "a   11  2   3   4",
                      "b   11  2   4   5",
                      "a   12  4   4   5",
                      "b   12  5   2   3",
                      "a   13  1   1   0",
                      "b   13  3   2   4") %>%
  separate(col = "X1..X2..X3..X4..X5",
           into = c("x1", "x2", "x3", "x4", "x5")) %>%
  mutate(across(2:5, as.numeric))

Wrangling with dplyr:

df %>% 
  group_by(x2) %>% 
  summarise(-across(x3:x5, diff))

Output:

# A tibble: 3 x 4
     x2    x3    x4    x5
* <dbl> <dbl> <dbl> <dbl>
1    11     0    -1    -1
2    12    -1     2     2
3    13    -2    -1    -4

Upvotes: 3

Related Questions