mehmo

Reputation: 485

Repeating each value of a column within a group so it pairs with every row of the group's other columns

Suppose I have these data:

library(tibble)

year <- c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002)
H <- c(1.5, 2.5, 3, 4.7, 5.7, 6.5, 3.2, 2.1, 1.9)
a <- 11:19
b <- 21:29

# tibble() replaces the deprecated data_frame()
df <- tibble(year, H, a, b)
df
# A tibble: 9 × 4
   year     H     a     b
  <dbl> <dbl> <int> <int>
1  2000   1.5    11    21
2  2000   2.5    12    22
3  2000   3      13    23
4  2001   4.7    14    24
5  2001   5.7    15    25
6  2001   6.5    16    26
7  2002   3.2    17    27
8  2002   2.1    18    28
9  2002   1.9    19    29

In R, how can I repeat each value of H within its year so that, for every H value, that year's whole group of a and b values is repeated? My expected output looks like this:

    year     H     a     b
   <dbl> <dbl> <dbl> <dbl>
 1  2000   1.5    11    21
 2  2000   1.5    12    22
 3  2000   1.5    13    23
 4  2000   2.5    11    21
 5  2000   2.5    12    22
 6  2000   2.5    13    23
 7  2000   3      11    21
 8  2000   3      12    22
 9  2000   3      13    23
10  2001   4.7    14    24
11  2001   4.7    15    25
12  2001   4.7    16    26
13  2001   5.7    14    24
14  2001   5.7    15    25
15  2001   5.7    16    26
16  2001   6.5    14    24
17  2001   6.5    15    25
18  2001   6.5    16    26
19  2002   3.2    17    27
20  2002   3.2    18    28
21  2002   3.2    19    29
22  2002   2.1    17    27
23  2002   2.1    18    28
24  2002   2.1    19    29
25  2002   1.9    17    27
26  2002   1.9    18    28
27  2002   1.9    19    29

Upvotes: 1

Views: 53

Answers (2)

PaulS

Reputation: 25473

Another solution:

library(dplyr)

year <- c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002)
H <- c(1.5, 2.5, 3, 4.7, 5.7, 6.5, 3.2, 2.1, 1.9)
a <- 11:19
b <- 21:29

df <- data.frame(year, H, a, b)

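# Within each year: repeat every H value n() times in a row, and repeat the
# remaining columns (a, b) as whole blocks n() times.
# Note: since dplyr 1.1.0, summarise() warns when a group returns more than
# one row; reframe() is the recommended verb for that case.
df %>% 
  group_by(year) %>% 
  summarise(H = rep(H, each = n()), across(-1, ~ rep(.x, n())), .groups = "drop")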

#> # A tibble: 27 × 4
#>     year     H     a     b
#>    <dbl> <dbl> <int> <int>
#>  1  2000   1.5    11    21
#>  2  2000   1.5    12    22
#>  3  2000   1.5    13    23
#>  4  2000   2.5    11    21
#>  5  2000   2.5    12    22
#>  6  2000   2.5    13    23
#>  7  2000   3      11    21
#>  8  2000   3      12    22
#>  9  2000   3      13    23
#> 10  2001   4.7    14    24
#> # … with 17 more rows
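
The two rep() calls above do different jobs, and a small base R sketch on the year-2000 group may make that clearer. The names H_2000, a_2000 and n are made up for this illustration and are not part of the answer.

```r
# Values for the year-2000 group, copied from the question's data
H_2000 <- c(1.5, 2.5, 3)
a_2000 <- 11:13
n <- length(H_2000)  # the group size, which is what n() gives inside a group

# each = n repeats every element n times in a row, so H varies slowest
H_rep <- rep(H_2000, each = n)

# times = n repeats the whole vector n times; an unnamed second argument,
# as in rep(.x, n()), is treated as times
a_rep <- rep(a_2000, times = n)

print(data.frame(H = H_rep, a = a_rep))
```

Binding the two results side by side pairs each H value with every a value of the group, which is the pattern the expected output asks for.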

Upvotes: 1

lroha

Reputation: 34586

You can use tidyr::expand_grid(), which accepts data frames. Here, group by year and then iterate over the groups with group_modify().

library(dplyr)
library(tidyr)

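# Inside group_modify(), .x holds one group's rows without the grouping
# column, so .x[1] is H and .x[-1] is a and b.
df %>%
  group_by(year) %>%
  group_modify(~ expand_grid(.x[1], .x[-1]))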

# A tibble: 27 x 4
# Groups:   year [3]
    year     H     a     b
   <dbl> <dbl> <int> <int>
 1  2000   1.5    11    21
 2  2000   1.5    12    22
 3  2000   1.5    13    23
 4  2000   2.5    11    21
 5  2000   2.5    12    22
 6  2000   2.5    13    23
 7  2000   3      11    21
 8  2000   3      12    22
 9  2000   3      13    23
10  2001   4.7    14    24
# ... with 17 more rows
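
To see what expand_grid() does for a single group, here is a minimal sketch on the year-2000 rows only. The object names h_part, ab_part and one_group are hypothetical and chosen just for this example.

```r
library(tibble)
library(tidyr)

# One group's worth of data (year 2000), split into the two pieces
h_part  <- tibble(H = c(1.5, 2.5, 3))
ab_part <- tibble(a = 11:13, b = 21:23)

# Every row of h_part is paired with every row of ab_part;
# the first argument varies slowest, so H changes every three rows
one_group <- expand_grid(h_part, ab_part)
print(one_group)
```

group_modify() applies this same pairing to each year and then adds the year column back.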

Or the same idea without group_modify(), which is an experimental function:

library(purrr)

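# split() keeps the year column in each piece, so columns 1:2 are year and H
# and columns 3:4 are a and b. The formula interface to split() needs R >= 4.1.0.
df %>%
  split(~ year) %>%
  map_df(~ expand_grid(.x[1:2], .x[3:4]))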
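
For comparison, the same within-year pairing can probably be sketched in base R with no packages, by joining the (year, H) columns to the (year, a, b) columns on year. This is an alternative technique, not what the answer above uses, and the row order within each year may differ from the expected output in the question.

```r
# Data from the question, built as a plain data.frame
year <- c(2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002, 2002)
H <- c(1.5, 2.5, 3, 4.7, 5.7, 6.5, 3.2, 2.1, 1.9)
df <- data.frame(year, H, a = 11:19, b = 21:29)

# Because year is not unique on either side, merge() returns every
# combination of an H row and an (a, b) row that share the same year
res <- merge(df[c("year", "H")], df[c("year", "a", "b")], by = "year")

nrow(res)  # 27: 3 years x 3 H values x 3 (a, b) rows
```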

Upvotes: 1
