Reputation: 35
I have a dataset as below.
How can I create a new column B using the difference of values in A with matching ID. Apologies if this has been asked before. Thanks
Upvotes: 0
Views: 67
Reputation: 341
Another approach could be to pivot the table so that the two 'A' values are in separate columns.
library(tidyverse)
df %>%
mutate(name = if_else(duplicated(ID), "A_additional", "A")) %>%
pivot_wider(id_cols = ID, values_from = A, names_from = name) %>%
mutate(B = A - A_additional)
# # A tibble: 4 x 4
# ID A A_additional B
# <fct> <dbl> <dbl> <dbl>
# 1 aa 2 3 -1
# 2 bb 4 4 0
# 3 cc 3 2 1
# 4 dd 1 1 0
This solution doesn't require grouping, so should scale well to larger data sets.
Upvotes: 0
Reputation: 887251
We can use data.table
methods
library(data.table)
setDT(df)[, .(B = first(A) - last(A), A = first(A)), .(ID)]
data
df <- structure(list(ID = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L
), .Label = c("aa", "bb", "cc", "dd"), class = "factor"), A = c(2,
4, 3, 1, 3, 4, 2, 1)), class = "data.frame", row.names = c(NA, -8L))
Upvotes: 0
Reputation: 389055
Using dplyr
, we can group_by
ID
and subtract first
and last
values of A
.
library(dplyr)
df %>%
group_by(ID) %>%
summarise(B = first(A) - last(A), A = first(A)) %>%
select(names(df), B)
# A tibble: 4 x 3
# ID A B
# <fct> <dbl> <dbl>
#1 aa 2 -1
#2 bb 4 0
#3 cc 3 1
#4 dd 1 0
data
df <- structure(list(ID = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L
), .Label = c("aa", "bb", "cc", "dd"), class = "factor"), A = c(2,
4, 3, 1, 3, 4, 2, 1)), class = "data.frame", row.names = c(NA, -8L))
Upvotes: 4