Reputation: 142
I would like to create a variable that takes a certain value in a certain year, by country.
Country Year Price Price_2018
A 2016 1 4
A 2017 3 4
A 2018 4 4
B 2016 1 5
B 2017 7 5
B 2018 5 5
C 2016 1 3
C 2017 6 3
C 2018 3 3
As above, I'd like to create the variable Price_2018 that, for each country, take the price in 2018 and populates each observation with this value. Would anyone be able to help here? Many thanks in advance.
Upvotes: 0
Views: 68
Reputation: 887231
We can either use ==
(assuming there are only unique 'Year' for each 'Country')
library(dplyr)
df1 %>%
group_by(Country) %>%
mutate(Price_2018 = Price[Year == 2018])
-output
# A tibble: 9 x 4
# Groups: Country [3]
# Country Year Price Price_2018
# <chr> <int> <int> <int>
#1 A 2016 1 4
#2 A 2017 3 4
#3 A 2018 4 4
#4 B 2016 1 5
#5 B 2017 7 5
#6 B 2018 5 5
#7 C 2016 1 3
#8 C 2017 6 3
#9 C 2018 3 3
Or match
df1 %>%
group_by(Country) %>%
mutate(Price_2018 = Price[match(2018, Year)])
If we need to create multiple 'Year' column, an easier option is map
library(purrr)
map_dfc(unique(df1$Year), ~ df1 %>% group_by(Country) %>%
transmute(!! str_c('Price_', .x) := Price[Year == .x]) %>%
ungroup %>%
select(-Country)) %>%
mutate(df1, .)
# Country Year Price Price_2016 Price_2017 Price_2018
#1 A 2016 1 1 3 4
#2 A 2017 3 1 3 4
#3 A 2018 4 1 3 4
#4 B 2016 1 1 7 5
#5 B 2017 7 1 7 5
#6 B 2018 5 1 7 5
#7 C 2016 1 1 6 3
#8 C 2017 6 1 6 3
#9 C 2018 3 1 6 3
df1 <- structure(list(Country = c("A", "A", "A", "B", "B", "B", "C",
"C", "C"), Year = c(2016L, 2017L, 2018L, 2016L, 2017L, 2018L,
2016L, 2017L, 2018L), Price = c(1L, 3L, 4L, 1L, 7L, 5L, 1L, 6L,
3L)), row.names = c(NA, -9L), class = "data.frame")
Upvotes: 1
Reputation: 39605
It looks like you want to create a variable for each year, so you can try this approach reshaping data and then merging using tidyverse
functions:
library(tidyverse)
#Code
newdf <- df %>% left_join(df %>%
mutate(Year=paste0('Price_',Year)) %>%
pivot_wider(names_from = Year,values_from=Price))
Output:
Country Year Price Price_2016 Price_2017 Price_2018
1 A 2016 1 1 3 4
2 A 2017 3 1 3 4
3 A 2018 4 1 3 4
4 B 2016 1 1 7 5
5 B 2017 7 1 7 5
6 B 2018 5 1 7 5
7 C 2016 1 1 6 3
8 C 2017 6 1 6 3
9 C 2018 3 1 6 3
And if you only want 2018
:
#Code 2
newdf <- df %>% left_join(df %>% filter(Year==2018) %>%
mutate(Year=paste0('Price_',Year)) %>%
pivot_wider(names_from = Year,values_from=Price))
Output:
Country Year Price Price_2018
1 A 2016 1 4
2 A 2017 3 4
3 A 2018 4 4
4 B 2016 1 5
5 B 2017 7 5
6 B 2018 5 5
7 C 2016 1 3
8 C 2017 6 3
9 C 2018 3 3
Some data used:
#Data
df <- structure(list(Country = c("A", "A", "A", "B", "B", "B", "C",
"C", "C"), Year = c(2016L, 2017L, 2018L, 2016L, 2017L, 2018L,
2016L, 2017L, 2018L), Price = c(1L, 3L, 4L, 1L, 7L, 5L, 1L, 6L,
3L)), row.names = c(NA, -9L), class = "data.frame")
Upvotes: 2