Reputation: 35

create new column using differences of rows

I have a dataset as below.

How can I create a new column B using the difference of values in A with matching ID. Apologies if this has been asked before. Thanks

Upvotes: 0

Answers (3)

Callum Savage

Reputation: 341

Another approach could be to pivot the table so that the two 'A' values are in separate columns.

library(tidyverse)

df %>% 
  mutate(name = if_else(duplicated(ID), "A_additional", "A")) %>% 
  pivot_wider(id_cols = ID, values_from = A, names_from = name) %>% 
  mutate(B = A - A_additional)

# # A tibble: 4 x 4
#   ID        A A_additional     B
#   <fct> <dbl>        <dbl> <dbl>
# 1 aa        2            3    -1
# 2 bb        4            4     0
# 3 cc        3            2     1
# 4 dd        1            1     0

This solution doesn't require grouping, so should scale well to larger data sets.

Upvotes: 0

akrun

Reputation: 887251

We can use data.table methods

library(data.table)
setDT(df)[, .(B = first(A) - last(A), A = first(A)), .(ID)]

data

df <- structure(list(ID = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L
), .Label = c("aa", "bb", "cc", "dd"), class = "factor"), A = c(2, 
4, 3, 1, 3, 4, 2, 1)), class = "data.frame", row.names = c(NA, -8L))

Upvotes: 0

Ronak Shah

Reputation: 389055

Using dplyr, we can group_by ID and subtract first and last values of A.

library(dplyr)

df %>%
 group_by(ID) %>%
 summarise(B = first(A) - last(A), A = first(A)) %>%
 select(names(df), B)

# A tibble: 4 x 3
#  ID        A     B
#  <fct> <dbl> <dbl>
#1 aa        2    -1
#2 bb        4     0
#3 cc        3     1
#4 dd        1     0

data

df <- structure(list(ID = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L
), .Label = c("aa", "bb", "cc", "dd"), class = "factor"), A = c(2, 
4, 3, 1, 3, 4, 2, 1)), class = "data.frame", row.names = c(NA, -8L))

Upvotes: 4

create new column using differences of rows

Answers (3)

Related Questions