Dan

Reputation: 1778

Add row in each group using dplyr and add_row()

If I add a new row to the iris dataset with:

iris <- as_tibble(iris)

> iris %>% 
    add_row(.before=0)

# A tibble: 151 × 5
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl>   <chr>
1            NA          NA           NA          NA    <NA> <--- Good!
2           5.1         3.5          1.4         0.2  setosa
3           4.9         3.0          1.4         0.2  setosa

It works. So, why can't I add a new row on top of each "subset" with:

iris %>% 
 group_by(Species) %>% 
 add_row(.before=0)

Error: is.data.frame(df) is not TRUE

Upvotes: 38

Views: 25838

Answers (4)

stallingOne

Reputation: 4006

In newer versions of dplyr (1.1.0 and later), this is how to do it using reframe().
Replace 0 with any value you might need.

iris %>%
  reframe(
    Sepal.Length = c(Sepal.Length, 0),
    Sepal.Width = c(Sepal.Width, 0), 
    Petal.Length = c(Petal.Length, 0),
    Petal.Width = c(Petal.Width, 0),
    .by = c(Species) # reframe() always returns an ungrouped data frame, so state the grouping here to keep this call reproducible
  )

This approach avoids the following warning:

Warning message: Returning more (or less) than 1 row per summarise() group was deprecated in dplyr 1.1.0.
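
As a rough sketch of the same idea without listing every column, and with the new row placed on top of each group instead of at the bottom: the combination of across() and reframe() below is my own illustration rather than part of the answer above, and it assumes dplyr 1.1.0 or later. Note that Species is filled in on the new rows, because it is the grouping key.

```r
library(dplyr)

# Prepend one NA to every non-grouping column, separately for each Species.
# across(everything()) skips the grouping column given in .by.
res <- iris %>%
  reframe(
    across(everything(), ~ c(NA, .x)),
    .by = Species
  )

nrow(res)  # 3 groups x (50 + 1) rows
```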

Upvotes: 1

Anoushiravan R

Reputation: 21908

With a slight variation, this could also be done:

library(dplyr)   # group_split()
library(purrr)   # map_dfr()
library(tibble)  # add_row()

iris %>%
  group_split(Species) %>%
  map_dfr(~ .x %>%
            add_row(.before = 1))

# A tibble: 153 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1         NA          NA           NA          NA   NA     
 2          5.1         3.5          1.4         0.2 setosa 
 3          4.9         3            1.4         0.2 setosa 
 4          4.7         3.2          1.3         0.2 setosa 
 5          4.6         3.1          1.5         0.2 setosa 
 6          5           3.6          1.4         0.2 setosa 
 7          5.4         3.9          1.7         0.4 setosa 
 8          4.6         3.4          1.4         0.3 setosa 
 9          5           3.4          1.5         0.2 setosa 
10          4.4         2.9          1.4         0.2 setosa 
# ... with 143 more rows

This can also be done on a grouped data frame, although it is a bit verbose:

library(dplyr)

iris %>%
  group_by(Species) %>%
  summarise(Sepal.Length = c(NA, Sepal.Length), 
            Sepal.Width = c(NA, Sepal.Width), 
            Petal.Length = c(NA, Petal.Length),
            Petal.Width = c(NA, Petal.Width), 
            Species = c(NA, Species))

Upvotes: 6

Alexlok

Reputation: 3134

A more recent approach is to use group_modify() instead of do().

iris %>%
  as_tibble() %>%
  group_by(Species) %>% 
  group_modify(~ add_row(.x,.before=0))
#> # A tibble: 153 x 5
#> # Groups:   Species [3]
#>    Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#>    <fct>          <dbl>       <dbl>        <dbl>       <dbl>
#>  1 setosa          NA          NA           NA          NA  
#>  2 setosa           5.1         3.5          1.4         0.2
#>  3 setosa           4.9         3            1.4         0.2

Upvotes: 38

konvas

Reputation: 14346

If you want to use a grouped operation, you need do(), as JasonWang described in his comment, because other functions such as mutate() or summarise() expect a result with either the same number of rows as the group (in your case, 50) or a single row (e.g. when summarising).

As you probably know, do() can be slow in general and should be a last resort when you cannot achieve your result any other way. Your task is quite simple because it only involves adding extra rows to your data frame, which can be done by simple indexing, e.g. look at the output of iris[NA, ].

What you want is essentially to create a vector

indices <- c(NA, 1:50, NA, 51:100, NA, 101:150)

(since the first group is in rows 1 to 50, the second one in 51 to 100 and the third one in 101 to 150).

The result is then iris[indices, ].
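
To make that concrete, here is a minimal sketch of the hard-coded version in base R. The variable name res and the checks at the end are only for illustration.

```r
# An NA row index produces a row filled with NA, so each NA in the
# vector becomes an empty row on top of the group that follows it.
indices <- c(NA, 1:50, NA, 51:100, NA, 101:150)
res <- iris[indices, ]

nrow(res)                  # 150 original rows + 3 inserted rows
which(is.na(res$Species))  # positions of the inserted rows
```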

A more general way of building this vector uses group_indices.

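library(dplyr)
library(purrr)

indices <- seq(nrow(iris)) %>% 
    split(group_indices(group_by(iris, Species))) %>%  # group id of every row; group_by() first, as dplyr >= 1.0.0 expects
    map(~c(NA, .x)) %>%                                # prepend NA to each group's row numbers
    unlist()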

(map comes from purrr which I assume you have loaded as you have tagged this with tidyverse).
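
If you would rather not depend on purrr, the same indexing idea can be sketched in base R alone. The helper name add_na_row_per_group is hypothetical, and the sketch assumes the grouping column can be passed as a string. Because split() orders the pieces by group level, the output comes back sorted by group, which makes no difference for iris.

```r
# Hypothetical helper: put one all-NA row on top of each group.
add_na_row_per_group <- function(df, group_col) {
  # Row numbers of df, split into one integer vector per group
  idx <- split(seq_len(nrow(df)), df[[group_col]])
  # Prepend NA to each group's row numbers, then flatten
  idx <- unlist(lapply(idx, function(i) c(NA, i)), use.names = FALSE)
  df[idx, ]
}

res <- add_na_row_per_group(iris, "Species")
nrow(res)
```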

Upvotes: 20
