emmecicubo

Reputation: 37

Creating a new column using scores from past years (which are in the same data frame)

I'm sorry if this question has already been answered, but I don't really know how to phrase my question.

I have a data frame structured in this way:

country year score
France 2020 10
France 2019 9
Germany 2020 15
Germany 2019 14

I would like to add a new column called previous_year_score that, for each row, looks up the "score" of the same country for "year - 1" in the same data frame. In this case France 2020 would have a previous_year_score of 9, while France 2019 would have NA.

Upvotes: 1

Views: 37

Answers (3)

akrun

Reputation: 887641

Using case_when

library(dplyr)
df1 %>%
  arrange(country, year) %>%
  group_by(country) %>%
  mutate(prev_val = case_when(year - lag(year) == 1 ~ lag(score)))
# A tibble: 4 x 4
# Groups:   country [2]
  country  year score prev_val
  <chr>   <int> <int>    <int>
1 France   2019     9       NA
2 France   2020    10        9
3 Germany  2019    14       NA
4 Germany  2020    15       14
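
To see why the year - lag(year) == 1 check is there, here is a small sketch on made-up data (the gap data frame is hypothetical, not from the question) in which France has no 2019 row. A plain lag(score) would then report the 2018 score as the "previous year" of 2020, while the checked version gives NA.

```r
library(dplyr)

# Hypothetical data with a gap: there is no 2019 row for France
gap <- data.frame(
  country = c("France", "France", "France"),
  year    = c(2020L, 2018L, 2017L),
  score   = c(10L, 8L, 7L)
)

res <- gap %>%
  arrange(country, year) %>%
  group_by(country) %>%
  mutate(
    naive    = lag(score),  # previous row, whatever its year is
    prev_val = case_when(year - lag(year) == 1 ~ lag(score))  # only if exactly one year earlier
  ) %>%
  ungroup()

res$naive     # NA  7  8  (2020 wrongly picks up the 2018 score)
res$prev_val  # NA  7 NA
```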

Upvotes: 2

Brian Davis

Reputation: 992

You can use match() for this. I imagine there are plenty of other solutions too.

Data:

df <- structure(list(country = c("France", "France", "Germany", "Germany"
), year = c(2020L, 2019L, 2020L, 2019L), score = c(10L, 9L, 15L, 
14L)), row.names = c(NA, -4L), class = "data.frame")

Solution:

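# For each row, find the row with the same country and the previous year
i <- match(paste(df$country, df$year - 1), paste(df$country, df$year))
# Rows without a match get NA
df$prev_score <- df$score[i]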
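
In case the one-liner is hard to follow, here is the same idea unpacked step by step on the sample data from the question. The names wanted and have are only illustrative.

```r
# Sample data from the question
df <- data.frame(
  country = c("France", "France", "Germany", "Germany"),
  year    = c(2020L, 2019L, 2020L, 2019L),
  score   = c(10L, 9L, 15L, 14L)
)

# Key each row is looking for: same country, one year earlier
wanted <- paste(df$country, df$year - 1)
# Key each row actually has
have <- paste(df$country, df$year)

# Position of the first matching row, NA when there is none
i <- match(wanted, have)
i              # 2 NA  4 NA

# Indexing with NA yields NA, which is what we want for the earliest year
df$prev_score <- df$score[i]
df$prev_score  # 9 NA 14 NA
```

Because the lookup is done on the country/year key, this does not depend on the row order, and a missing year simply results in NA.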

Upvotes: 3

Anoushiravan R

Reputation: 21938

You can use the following solution:

library(dplyr)

df %>%
  group_by(country) %>%
  arrange(year) %>%
  mutate(prev_val = ifelse(year - lag(year) == 1, lag(score), NA))

# A tibble: 4 x 4
# Groups:   country [2]
  country  year score prev_val
  <chr>   <int> <int>    <int>
1 France   2019     9       NA
2 Germany  2019    14       NA
3 France   2020    10        9
4 Germany  2020    15       14
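
Note that arrange(year) changes the row order of the result. If the original order should be kept, one possible variant is to use the order_by argument of lag() instead of sorting first. This is only a sketch on the sample data from the question, and the if_else()/NA_integer_ combination is my own choice here.

```r
library(dplyr)

# Sample data from the question, in its original row order
df <- data.frame(
  country = c("France", "France", "Germany", "Germany"),
  year    = c(2020L, 2019L, 2020L, 2019L),
  score   = c(10L, 9L, 15L, 14L)
)

res <- df %>%
  group_by(country) %>%
  mutate(
    # order_by = year makes lag() look at the previous year within the group
    # without physically reordering the rows
    prev_val = if_else(
      year - lag(year, order_by = year) == 1L,
      lag(score, order_by = year),
      NA_integer_
    )
  ) %>%
  ungroup()

res$prev_val  # 9 NA 14 NA, rows still in the original order
```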

Upvotes: 2
