Display name

Reputation: 4501

Does dplyr `rowwise()` group in the same way `group_by()` groups?

library(tidyverse)
mtcars %>% group_by(cyl) %>% is_grouped_df()
#> [1] TRUE

I can group a data frame by a variable and confirm that it is grouped with the is_grouped_df() function (shown above).

When I run the same check on the dplyr rowwise() function, it appears that rowwise() does not group the data set by row. The help page (?rowwise) does not clearly answer my question:

Group input by rows

Description: rowwise() allows you to compute on a data frame a row-at-a-time. This is most useful when a vectorised function doesn't exist.

A row-wise tibble maintains its row-wise status until explicitly removed by group_by(), ungroup(), or as_tibble().

My question: After calling the rowwise() function, do I need to call the ungroup() function later in my pipe to ungroup my data set? Or is this done by default? The following pipe suggests that a data frame passed through rowwise() is not grouped:

mtcars %>% rowwise() %>% is_grouped_df()
#> [1] FALSE

This sentence confuses me: "A row-wise tibble maintains its row-wise status until explicitly removed by... ungroup()...". Why would I need to ungroup() a tibble that is already ungrouped?

Upvotes: 1

Views: 1074

Answers (1)

Amit Kohli

Reputation: 2950

Interesting observation. This might be a bug in is_grouped_df(), unless it is a feature that I don't know about. Either way, I do think it's important to ungroup, given the tests below (see the comments):

library(tidyverse)

mtcars %>% select(1:3) %>% rowwise() %>% head(2)
#> Source: local data frame [2 x 3]
#> Groups: <by row>
##### ^ THIS DOES HAVE A GROUP ####
#> 
#> # A tibble: 2 x 3
#>     mpg   cyl  disp
#>   <dbl> <dbl> <dbl>
#> 1    21     6   160
#> 2    21     6   160

mtcars %>% select(1:3) %>% rowwise() %>% mutate(n()) %>% head(2)
#> Source: local data frame [2 x 4]
#> Groups: <by row>
#> 
#> # A tibble: 2 x 4
#>     mpg   cyl  disp `n()`
#>   <dbl> <dbl> <dbl> <int>
#> 1    21     6   160     1
#> 2    21     6   160     1
mtcars %>% select(1:3) %>% mutate(n()) %>% head(2)
#>   mpg cyl disp n()
#> 1  21   6  160  32
#> 2  21   6  160  32

##### ^ THIS IS EXPECTED AND THE n BEHAVES DIFFERENTLY WHEN THE ROWWISE() IS APPLIED ####

##### IF WE WANT TO RESTORE "NORMAL" BEHAVIOR, IT'S PROBABLY WISE TO UNGROUP IN ORDER TO LOSE THE ROWWISE OPERATIONS #####
mtcars %>% select(1:3) %>% rowwise() %>% ungroup() %>% mutate(n()) %>% head(2)
#> # A tibble: 2 x 4
#>     mpg   cyl  disp `n()`
#>   <dbl> <dbl> <dbl> <int>
#> 1    21     6   160    32
#> 2    21     6   160    32

## ^ NORMAL AFTER UNGROUP
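
A likely explanation for the FALSE result, offered as a sketch rather than a definitive account: rowwise() tags the data with the class "rowwise_df", while is_grouped_df() only tests for the class "grouped_df". Checking the class directly with inherits() shows the row-wise status and confirms that ungroup() removes it. The names rw and ung are illustrative only.

```r
library(dplyr)

# Row-wise version of the data (rw is an illustrative name)
rw <- mtcars %>% rowwise()

# The row-wise status is stored as the class "rowwise_df" ...
class(rw)
inherits(rw, "rowwise_df")

# ... which is not the class that is_grouped_df() looks for
is_grouped_df(rw)

# ungroup() drops the row-wise class, so later verbs see the whole table again
ung <- rw %>% ungroup()
inherits(ung, "rowwise_df")
```

So inherits(x, "rowwise_df") is a more direct test than is_grouped_df() when deciding whether a pipe still needs an ungroup().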

Upvotes: 2
