Reputation: 23
I have a data frame of country GDP and population from 2010 to 2100. I have left-joined GDP growth for 3 years (2020-2022) for each country. I would like to repeat these 3 values over the later years, with a gap of NAs in between each repetition. My attempt is below. It appeared to work, but on closer inspection the sequence (the NAs followed by the 3 GDP growth values) was laid out starting from 2010 and only displayed for the range that I required, so it does not begin at the right point after 2022.
DF <- DF %>%
  group_by(Country) %>%
  mutate(gdp_forecast = ifelse(Year > 2022,
                               rep(c(rep(NA, 2), gdp_forecast[!is.na(gdp_forecast)]), length.out = 78),
                               gdp_forecast))
This is an example of the current results.
structure(list(Country = c("Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan"), gdp = c(15936.80064, 16873.912078,
17811.023516, 18748.134954, 19685.246392, 20622.35783, 21648.660914,
22674.963998, 23701.267082, 24727.570166, 25753.87325, 27258.147678,
28762.422106, 30266.696534, 31770.970962, 33275.24539, 35342.146188,
37409.046986, 39475.947784, 41542.848582), gdp_forecast = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, -5.5, 2.5, 3.3, NA, NA, NA,
NA, NA, NA, NA), Year = c(2010, 2011, 2012, 2013, 2014, 2015,
2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026,
2027, 2028, 2029), gdp_forecast1 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 2.5, 3.3, NA, NA, -5.5, 2.5, 3.3)), row.names = c(NA,
-20L), groups = structure(list(Country = "Afghanistan", .rows = structure(list(
1:20), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr",
"list"))), row.names = 1L, class = c("tbl_df", "tbl", "data.frame"
), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
))
This is the expected output
| | Country | gdp | gdp_forecast | Year | gdp_forecast1 |
---|---|---|---|---|---|
8 | Afghanistan | 22675. | NA | 2017 | NA |
9 | Afghanistan | 23701. | NA | 2018 | NA |
10 | Afghanistan | 24728. | NA | 2019 | NA |
11 | Afghanistan | 25754. | -5.5 | 2020 | NA |
12 | Afghanistan | 27258. | 2.5 | 2021 | NA |
13 | Afghanistan | 28762. | 3.3 | 2022 | NA |
14 | Afghanistan | 30267. | NA | 2023 | NA |
15 | Afghanistan | 31771. | NA | 2024 | NA |
16 | Afghanistan | 33275. | NA | 2025 | -5.5 |
17 | Afghanistan | 35342. | NA | 2026 | 2.5 |
18 | Afghanistan | 37409. | NA | 2027 | 3.3 |
19 | Afghanistan | 39476. | NA | 2028 | NA |
20 | Afghanistan | 41543. | NA | 2029 | NA |
Upvotes: 1
Views: 46
Reputation: 886928
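We can use rep with length.out set to the number of rows after 2022, so that the pattern starts at 2023 rather than at the first row of each group.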
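library(dplyr)
DF %>%
  group_by(Country) %>%
  mutate(
    # start from an all-NA numeric column
    gdp_forecast1 = NA_real_,
    # fill only the rows after 2022: two NAs, then the known growth rates,
    # recycled to exactly the number of those rows
    gdp_forecast1 = replace(gdp_forecast1, Year > 2022,
                            rep(c(NA, NA, gdp_forecast[!is.na(gdp_forecast)]),
                                length.out = sum(Year > 2022))))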
-output
# A tibble: 20 x 5
# Groups: Country [1]
Country gdp gdp_forecast Year gdp_forecast1
<chr> <dbl> <dbl> <dbl> <dbl>
1 Afghanistan 15937. NA 2010 NA
2 Afghanistan 16874. NA 2011 NA
3 Afghanistan 17811. NA 2012 NA
4 Afghanistan 18748. NA 2013 NA
5 Afghanistan 19685. NA 2014 NA
6 Afghanistan 20622. NA 2015 NA
7 Afghanistan 21649. NA 2016 NA
8 Afghanistan 22675. NA 2017 NA
9 Afghanistan 23701. NA 2018 NA
10 Afghanistan 24728. NA 2019 NA
11 Afghanistan 25754. -5.5 2020 NA
12 Afghanistan 27258. 2.5 2021 NA
13 Afghanistan 28762. 3.3 2022 NA
14 Afghanistan 30267. NA 2023 NA
15 Afghanistan 31771. NA 2024 NA
16 Afghanistan 33275. NA 2025 -5.5
17 Afghanistan 35342. NA 2026 2.5
18 Afghanistan 37409. NA 2027 3.3
19 Afghanistan 39476. NA 2028 NA
20 Afghanistan 41543. NA 2029 NA
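To see why the original attempt was shifted, here is a minimal base R sketch for a single country. It is only an illustration: the vector names `pattern`, `misaligned` and `aligned` are made up for this example, and the `ifelse()` call uses `NA_real_` as its "no" value (an assumption, chosen so the result matches the `gdp_forecast1` column in the question's data). Because `ifelse()` works element by element, the recycled pattern is laid out from the first row (2010) and only the positions where `Year > 2022` are kept, whereas `replace()` writes the pattern into the selected rows in order.

```r
# Toy single-country data mirroring the question (years 2010-2029).
Year <- 2010:2029
gdp_forecast <- rep(NA_real_, length(Year))
gdp_forecast[Year %in% 2020:2022] <- c(-5.5, 2.5, 3.3)

# Two NAs followed by the three known growth rates.
pattern <- c(NA, NA, gdp_forecast[!is.na(gdp_forecast)])

# ifelse() is elementwise: the pattern is recycled from row 1 (2010),
# and only the elements at positions where Year > 2022 survive.
misaligned <- ifelse(Year > 2022,
                     rep(pattern, length.out = length(Year)),
                     NA_real_)

# replace() fills just the selected rows, in order, so the pattern
# begins at the first row with Year > 2022.
aligned <- replace(rep(NA_real_, length(Year)), Year > 2022,
                   rep(pattern, length.out = sum(Year > 2022)))

# Compare the two results side by side for the forecast years.
print(data.frame(Year, misaligned, aligned)[Year > 2022, ])
```

For 2023-2029, `misaligned` gives 2.5, 3.3, NA, NA, -5.5, 2.5, 3.3 (the shifted result shown in the question), while `aligned` gives NA, NA, -5.5, 2.5, 3.3, NA, NA.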
Upvotes: 1