AZhao

Reputation: 14405

Calculate a column conditional on other columns in R

I'm trying to create a new column that is conditionally based on several other columns. Here is my data. I am trying to create a year over year difference column.

person <- c(rep("A", 4), rep("B", 1), rep("C", 3), rep("D", 1))
score <- c(1, 1, 2, 4, 1, 1, 2, 2, 3)
year <- c(2017, 2016, 2015, 2014, 2015, 2017, 2015, 2014, 2017)
df <- data.frame(person, score, year)

The function should look up the previous year's score for the same person and subtract it from that person's current score. If there is no previous-year data, it returns NA. For my data, the new column "difference" would have the values 0, -1, -2, NA, NA, NA, 0, NA, NA.

I would love to see a dplyr answer, but base R solutions are welcome too.

Upvotes: 0

Views: 310

Answers (2)

BENY

Reputation: 323226

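Using dplyr: sort each person's rows by year, then compare each row with the one before it via lag(), keeping the difference only when the two years are consecutive.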

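library(dplyr)
df %>%
  arrange(person, year) %>%   # oldest year first, so lag() looks at the earlier row
  group_by(person) %>%        # lag() never crosses from one person to the next
  # keep the difference only when the earlier row is exactly one year back
  mutate(per = ifelse(year - lag(year) == 1, score - lag(score), NA)) %>%
  arrange(person, -year)      # back to newest-first within each person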

# A tibble: 9 x 4
# Groups:   person [4]
  person score  year   per
  <fctr> <dbl> <dbl> <dbl>
1      A     1  2017     0
2      A     1  2016    -1
3      A     2  2015    -2
4      A     4  2014    NA
5      B     1  2015    NA
6      C     1  2017    NA
7      C     2  2015     0
8      C     2  2014    NA
9      D     3  2017    NA
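
Since base R solutions were also welcome, here is a minimal sketch of the same idea without dplyr. It assumes the data frame is called df (as in the code above) and that each person has at most one row per year; the helper name prev is only illustrative. It looks up, for every row, the row holding the same person's previous year, and match() returns NA when there is none.

```r
# Rebuild the question's data; the name `df` is an assumption taken from the answer above
person <- c(rep("A", 4), rep("B", 1), rep("C", 3), rep("D", 1))
score  <- c(1, 1, 2, 4, 1, 1, 2, 2, 3)
year   <- c(2017, 2016, 2015, 2014, 2015, 2017, 2015, 2014, 2017)
df <- data.frame(person, score, year)

# Build a "person year" key per row and look up the key for the previous year.
# match() gives the row index of the first match, or NA if no such row exists.
prev <- match(paste(df$person, df$year - 1), paste(df$person, df$year))

# Indexing with NA yields NA, so rows without a previous year get NA automatically
df$difference <- df$score - df$score[prev]

print(df)
```

Unlike the lag() approach, this does not need the rows sorted first, but it does rely on the one-row-per-person-per-year assumption, because match() only returns the first hit.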

Upvotes: 2

Yin

Reputation: 161

To answer the question you asked under Wen's answer: you can check out chapter 5 of this book (http://r4ds.had.co.nz/transform.html) to figure out every function and symbol used in Wen's answer. You can also read this post (http://varianceexplained.org/r/teach-tidyverse/) to get a basic sense of base R versus the tidyverse.

Upvotes: 1
