Reputation: 4501
library(tidyverse)
mtcars %>% group_by(cyl) %>% is_grouped_df()
#> [1] TRUE
I can group a data frame by a variable and confirm that it is grouped using the is_grouped_df() function (shown above).
I can run the same check on the dplyr rowwise() function, and it appears that rowwise() does not group data sets by row. I have a question that a reading of the help page (?rowwise) does not clearly answer for me.
Group input by rows
Description: rowwise() allows you to compute on a data frame a row-at-a-time. This is most useful when a vectorised function doesn't exist.
A row-wise tibble maintains its row-wise status until explicitly removed by group_by(), ungroup(), or as_tibble().
My question: After calling the rowwise() function, do I need to call the ungroup() function later in my pipe to ungroup my data set, or is this done by default? The following pipe suggests that a data frame passed through rowwise() is not grouped:
mtcars %>% rowwise() %>% is_grouped_df()
#> [1] FALSE
This sentence is confusing me: "A row-wise tibble maintains its row-wise status until explicitly removed by... ungroup()...". Why would I need to ungroup() a tibble that is already ungrouped?
Upvotes: 1
Views: 1074
Reputation: 2950
Interesting observation. This might be a bug in is_grouped_df(), unless it's somehow a feature that I don't know about. But I DO think it's important to call ungroup(), considering the testing done below (see comments):
library(tidyverse)
mtcars %>% select(1:3) %>% rowwise() %>% head(2)
#> Source: local data frame [2 x 3]
#> Groups: <by row>
##### ^ THIS DOES HAVE A GROUP ####
#>
#> # A tibble: 2 x 3
#> mpg cyl disp
#> <dbl> <dbl> <dbl>
#> 1 21 6 160
#> 2 21 6 160
mtcars %>% select(1:3) %>% rowwise() %>% mutate(n()) %>% head(2)
#> Source: local data frame [2 x 4]
#> Groups: <by row>
#>
#> # A tibble: 2 x 4
#> mpg cyl disp `n()`
#> <dbl> <dbl> <dbl> <int>
#> 1 21 6 160 1
#> 2 21 6 160 1
mtcars %>% select(1:3) %>% mutate(n()) %>% head(2)
#> mpg cyl disp n()
#> 1 21 6 160 32
#> 2 21 6 160 32
##### ^ THIS IS EXPECTED AND THE n BEHAVES DIFFERENTLY WHEN THE ROWWISE() IS APPLIED ####
##### IF WE WANT TO RESTORE "NORMAL" BEHAVIOR, IT'S PROBABLY WISE TO UNGROUP IN ORDER TO LOSE THE ROWWISE OPERATIONS #####
mtcars %>% select(1:3) %>% rowwise() %>% ungroup() %>% mutate(n()) %>% head(2)
#> # A tibble: 2 x 4
#> mpg cyl disp `n()`
#> <dbl> <dbl> <dbl> <int>
#> 1 21 6 160 32
#> 2 21 6 160 32
## ^ NORMAL AFTER UNGROUP
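For what it's worth, here is a minimal sketch of one possible explanation for the FALSE result. It assumes (I believe this is the case, but I have not checked the source for every dplyr version) that is_grouped_df() only tests for the grouped_df class, while rowwise() marks the data with a separate rowwise_df class. The variable name rw is just for illustration.

```r
library(dplyr)

# rw is a hypothetical name for the row-wise version of mtcars
rw <- mtcars %>% rowwise()

# Inspect the classes attached by rowwise()
class(rw)

# Row-wise status is carried by the "rowwise_df" class ...
inherits(rw, "rowwise_df")

# ... which is distinct from "grouped_df", the class I assume is_grouped_df() looks for
inherits(rw, "grouped_df")

# ungroup() drops the row-wise class again
rw %>% ungroup() %>% inherits("rowwise_df")
```

If that assumption holds, a row-wise tibble can report FALSE from is_grouped_df() and still need ungroup() to get back to "normal" behavior, which matches what the n() tests above show.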
Upvotes: 2