R pivot longer with both annual and monthly data

Question

I'm unsure how to structure my pivot longer command when I have both annual and monthly data. For example I have:

wide <- data.frame(region_name = character(),    # Create empty data frame
                    total_population_2019 = numeric(),
                    total_population_2020 = numeric(),
                   mean_temperature_2019_1 = numeric(),
                   mean_temperature_2019_2 = numeric(),
                   mean_temperature_2020_1 = numeric(),
                   mean_temperature_2020_2 = numeric(),
                    stringsAsFactors = FALSE)

wide[1, ] <- list("funville", 50000, 51250, 26.3, 24.6, 25.7, 24.9)

region_name total_population_2019 total_population_2020 mean_temperature_2019_1 mean_temperature_2019_2 mean_temperature_2020_1 mean_temperature_2020_2
 funville                50000             51250              26.3              24.6              25.7              24.9

I'm able to pivot on the monthly columns using spread:

long <- pivot_longer(wide, cols = 4:7, names_to = c("layer" ,"year", "month"),
                     names_pattern = "(.*)_(.*)_?_(.*)") %>%
  group_by(layer) %>%
  mutate(n = 1:n()) %>%
  spread(layer, value) %>%
  select(-n)

which gives

  region_name total_population_2019 total_population_2020 year  month mean_temperature
1 funville                    50000                 51250 2019  1                 26.3
2 funville                    50000                 51250 2019  2                 24.6
3 funville                    50000                 51250 2020  1                 25.7
4 funville                    50000                 51250 2020  2                 24.9

I'd like to now have a population column where the values are attributed for each row/month that falls in that year, ideally would look like:

desired.df <- data.frame(region_name = c("funville", "funville", "funville", "funville"),
                         year = c("2019", "2019", "2020", "2020"),
                         month = c("1", "2", "1", "2"),
                         population = c("50000", "50000", "51250", "51250"),
                         mean_temperature = c("26.3", "24.6", "25.7", "24.9"))

which gives

  region_name year  month population mean_temperature           
1 funville    2019  1     50000      26.3            
2 funville    2019  2     50000      24.6            
3 funville    2020  1     51250      25.7            
4 funville    2020  2     51250      24.9

Does anyone have a solution? Thanks in advance

R pivot longer with both annual and monthly data

Answers (1)

Related Questions