Reputation: 754
I'm trying to select all yearly values with their corresponding month beyond years 2011, and if the rows do not match in size, then filling them in with NAs.
I have tried:
rank_data %>% filter(across(starts_with('year'))>2011)
Which only returns the output for a single column:
jan year feb year2 mar year3 apr year4 may year5 jun year6 jul year7 aug year8 sep year9 oct year10 nov year11 dec
1: 205.3 2014 188.2 2014 167.8 1920 122.1 1882 134.2 1906 152.0 1948 173.6 1882 191.3 1985 196.5 1957 234.6 1998 224.2 1931 232.6
2: 203.2 2016 179.0 1950 162.9 1868 119.5 1993 133.4 1925 147.3 1980 156.4 1932 181.1 1900 193.6 1927 211.6 1862 222.5 1928 230.1
year12 win year13 spr year14 sum year15 aut year16 ann year17
1: 1979 529.4 1960 348.9 1983 438.8 1928 527.8 1944 1547.8 2008
2: 1994 522.5 1994 340.4 1903 426.6 2007 519.6 1923 1530.2 2015
reproducible code:
structure(list(jan = c(268.1, 263.1, 235.2, 223.3, 219.2, 218.3
), year = c(1928, 1948, 2008, 1877, 1995, 1990), feb = c(287.6,
241.9, 213.7, 205.1, 191.9, 191.2), year2 = c(2020, 2002, 1997,
1990, 1958, 1923), mar = c(225.3, 190.7, 187.8, 187.2, 175.9,
173.9), year3 = c(1981, 1903, 2019, 1947, 1994, 1912), apr = c(147.4,
147.1, 143.1, 138.1, 132.4, 128.4), year4 = c(1920, 1867, 1970,
1913, 1947, 2012), may = c(166.5, 161.1, 153.4, 142.1, 141.8,
139.1), year5 = c(1967, 1924, 1886, 1920, 2015, 1993), jun = c(205.6,
173.7, 172.5, 170.6, 161.1, 161.1), year6 = c(2012, 1928, 2007,
1907, 1872, 1998), jul = c(196.1, 191.4, 183.6, 180.7, 180.3,
178.9), year7 = c(1939, 1888, 1920, 1988, 2009, 1880), aug = c(260.9,
231, 209.7, 205.4, 203.9, 197.7), year8 = c(1956, 1917, 1891,
1927, 1879, 2004), sep = c(287.1, 238.1, 223.3, 208.8, 207.6,
206.2), year9 = c(1918, 1950, 1869, 1935, 1866, 1872), oct = c(286.1,
278.8, 263.8, 259.6, 249.9, 248.4), year10 = c(1967, 1903, 1954,
2000, 1938, 1870), nov = c(306.7, 260.9, 253.6, 252.6, 242.3,
235.8), year11 = c(2009, 2015, 1929, 2000, 1951, 1954), dec = c(343.1,
266, 246.8, 246.5, 238.9, 238), year12 = c(2015, 1993, 1868,
1986, 1929, 2006), win = c(688.7, 625.9, 582.1, 560, 558.4, 546.6
), year13 = c(2016, 1995, 2014, 1990, 2020, 1877), spr = c(457.3,
408.9, 375.8, 372.8, 371.8, 371.8), year14 = c(1920, 1947, 1979,
2006, 1981, 1913), sum = c(499.7, 489.5, 483.8, 468.6, 452.1,
446.9), year15 = c(1879, 2012, 1956, 1912, 1927, 2020), aut = c(673,
668.7, 580.8, 567.9, 560.8, 554.5), year16 = c(2000, 1954, 1872,
1935, 1903, 1981), ann = c(1758.2, 1691.1, 1690, 1660.7, 1648.5,
1624.6), year17 = c(1872, 1954, 2000, 1877, 1903, 2012)), row.names = c(NA,
6L), class = "data.frame")
Expected output:
jan year feb year2 mar year3 ...
205.3 2014 287.6 2020 187.8 2019 ...
203.2 2016 188.2 2014 NA NA ...
Upvotes: 1
Views: 67
Reputation: 887118
Perhaps, we can use indexing to replace the values where the corresponding 'year' value is less than or equal to 2011
i1 <- grep("^year", names(rank_data))
i2 <- i1 -1
tmp <- NA^(rank_data[i1] <= 2011)
rank_data[i2] <- rank_data[i2] * tmp
rank_data[i1] <- rank_data[i1] * tmp
If we want to remove columns that are all NA
along with the 'year'
i3 <- rep(sapply(rank_data[i2], function(x) any(!is.na(x))), each = 2)
lst1 <- lapply(rank_data[i3], function(x) x[complete.cases(x)])
mx <- max(lengths(lst1))
do.call(cbind, lapply(lst1, `length<-`, mx))
-output
# feb year2 mar year3 apr year4 may year5 jun year6 nov year11 dec year12 win year13 sum year15 ann year17
#[1,] 287.6 2020 187.8 2019 128.4 2012 141.8 2015 205.6 2012 260.9 2015 343.1 2015 688.7 2016 489.5 2012 1624.6 2012
#[2,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA 582.1 2014 446.9 2020 NA NA
#[3,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA 558.4 2020 NA NA NA NA
Upvotes: 3