How to populate a new variable based on a condition of two other variables

Question

I would like to create a variable that takes a certain value in a certain year, by country.

Country Year Price Price_2018
   A    2016  1      4
   A    2017  3      4
   A    2018  4      4
   B    2016  1      5
   B    2017  7      5
   B    2018  5      5
   C    2016  1      3
   C    2017  6      3
   C    2018  3      3

As above, I'd like to create the variable Price_2018 that, for each country, take the price in 2018 and populates each observation with this value. Would anyone be able to help here? Many thanks in advance.

Duck · Accepted Answer

It looks like you want to create a variable for each year, so you can try this approach reshaping data and then merging using tidyverse functions:

library(tidyverse)
#Code
newdf <- df %>% left_join(df %>%
  mutate(Year=paste0('Price_',Year)) %>%
  pivot_wider(names_from = Year,values_from=Price))

Output:

  Country Year Price Price_2016 Price_2017 Price_2018
1       A 2016     1          1          3          4
2       A 2017     3          1          3          4
3       A 2018     4          1          3          4
4       B 2016     1          1          7          5
5       B 2017     7          1          7          5
6       B 2018     5          1          7          5
7       C 2016     1          1          6          3
8       C 2017     6          1          6          3
9       C 2018     3          1          6          3

And if you only want 2018:

#Code 2
newdf <- df %>% left_join(df %>% filter(Year==2018) %>%
  mutate(Year=paste0('Price_',Year)) %>%
  pivot_wider(names_from = Year,values_from=Price))

Output:

  Country Year Price Price_2018
1       A 2016     1          4
2       A 2017     3          4
3       A 2018     4          4
4       B 2016     1          5
5       B 2017     7          5
6       B 2018     5          5
7       C 2016     1          3
8       C 2017     6          3
9       C 2018     3          3

Some data used:

#Data
df <- structure(list(Country = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C"), Year = c(2016L, 2017L, 2018L, 2016L, 2017L, 2018L, 
2016L, 2017L, 2018L), Price = c(1L, 3L, 4L, 1L, 7L, 5L, 1L, 6L, 
3L)), row.names = c(NA, -9L), class = "data.frame")

How to populate a new variable based on a condition of two other variables

Answers (2)

data

Related Questions