Display name

Reputation: 4501

Does dplyr's `c_across()` function require `rowwise()`?

packageVersion("dplyr")
#[1] ‘0.8.99.9002’

Please note that this question uses dplyr's new c_across() function. To install the latest dev version of dplyr, run remotes::install_github("tidyverse/dplyr"). To restore the released version of dplyr, run install.packages("dplyr"). If you are reading this at some point in the future and are already on dplyr 1.0 or later, you can ignore this note.

library(tidyverse)
WorldPhones %>% 
  as.data.frame() %>% 
  rowwise() %>% 
  mutate(mean = mean(c_across(N.Amer:Mid.Amer), na.rm = TRUE))
#> # A tibble: 7 x 8
#> # Rowwise: 
#>   N.Amer Europe  Asia S.Amer Oceania Africa Mid.Amer   mean
#>    <dbl>  <dbl> <dbl>  <dbl>   <dbl>  <dbl>    <dbl>  <dbl>
#> 1  45939  21574  2876   1815    1646     89      555 10642 
#> 2  60423  29990  4708   2568    2366   1411      733 14600.
#> 3  64721  32510  5230   2695    2526   1546      773 15714.
#> 4  68484  35218  6662   2845    2691   1663      836 16914.
#> 5  71799  37598  6856   3000    2868   1769      911 17829.
#> 6  76036  40341  8220   3145    3054   1905     1008 19101.
#> 7  79831  43173  9053   3338    3224   2005     1076 20243.

This article by Dr Keith McNulty provides a good example (shown above) of working with dplyr's new c_across() function. R goes across each row and calculates the mean of the selected columns.

Let's do the same thing with the mtcars data frame, instead selecting the max value across columns for each row. We'll only select the "drat" and "wt" variables to keep things simple.

mtcars %>% 
  select(drat, wt) %>% 
  as_tibble() %>% 
  mutate(max = max(c_across(drat:wt), na.rm = TRUE))
#> # A tibble: 32 x 3
#>     drat    wt   max
#>    <dbl> <dbl> <dbl>
#>  1  3.9   2.62  5.42
#>  2  3.9   2.88  5.42
#>  3  3.85  2.32  5.42
#>  4  3.08  3.22  5.42
#>  5  3.15  3.44  5.42
#>  6  2.76  3.46  5.42
#>  7  3.21  3.57  5.42
#>  8  3.69  3.19  5.42
#>  9  3.92  3.15  5.42
#> 10  3.92  3.44  5.42
#> # ... with 22 more rows

Why isn't dplyr selecting the max value across each row, and displaying that in the max column? What I want would look like this.

#> # A tibble: 32 x 3
#>     drat    wt   max 
#>    <dbl> <dbl> <dbl>
#>  1  3.9   2.62   3.9
#>  2  3.9   2.88   3.9
#>  3  3.85  2.32  3.85
#>  4  3.08  3.22  3.22
#>  5  3.15  3.44  3.44
#>  6  2.76  3.46  3.46
#>  7  3.21  3.57  3.57
#>  8  3.69  3.19  3.69
#>  9  3.92  3.15  3.92
#> 10  3.92  3.44  3.92
#> # ... with 22 more rows

How can I do this? c_across() worked on WorldPhones but isn't working on mtcars, where "working" means "doing what I want".

Upvotes: 1

Views: 109

Answers (1)

Sergio Romero

Reputation: 368

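You're missing the rowwise() step. c_across() is designed for row-wise operations (which is what you want here). Without rowwise(), it combines the full drat and wt columns into a single vector, so max() returns the overall maximum (5.42) and mutate() repeats it in every row.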

mtcars %>% 
    select(drat, wt) %>% 
    rowwise() %>% 
    mutate(max = max(c_across(drat:wt), na.rm = TRUE))

#> # A tibble: 32 x 3
#> # Rowwise: 
#>     drat    wt   max
#>    <dbl> <dbl> <dbl>
#>  1  3.9   2.62  3.9 
#>  2  3.9   2.88  3.9 
#>  3  3.85  2.32  3.85
#>  4  3.08  3.22  3.22
#>  5  3.15  3.44  3.44
#>  6  2.76  3.46  3.46
#>  7  3.21  3.57  3.57
#>  8  3.69  3.19  3.69
#>  9  3.92  3.15  3.92
#> 10  3.92  3.44  3.92
#> # ... with 22 more rows
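
If it helps to see the difference side by side, here is a minimal sketch comparing the call with and without rowwise(). The object names (df, no_rowwise, with_rowwise, with_pmax) are made up for illustration, and the pmax() variant is an alternative I'm suggesting, not something from the question. It assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical helper object: just the two columns from the question.
df <- mtcars %>%
  select(drat, wt) %>%
  as_tibble()

# Without rowwise(): c_across() concatenates the whole drat and wt columns,
# so max() sees all 64 values and returns one number, recycled to every row.
no_rowwise <- df %>%
  mutate(max = max(c_across(drat:wt), na.rm = TRUE))

# With rowwise(): each row is treated as its own group,
# so max() sees only that row's drat and wt.
with_rowwise <- df %>%
  rowwise() %>%
  mutate(max = max(c_across(drat:wt), na.rm = TRUE)) %>%
  ungroup()  # drop the rowwise grouping once the row-wise step is done

# Possible alternative (my assumption about what suits this case):
# base R's pmax() is already vectorised, so no rowwise() is needed.
with_pmax <- df %>%
  mutate(max = pmax(drat, wt, na.rm = TRUE))
```

For a simple max over a couple of columns, pmax() should give the same result as the rowwise() version and is usually faster on large data; rowwise() plus c_across() is the more general tool when the per-row function isn't vectorised.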

Upvotes: 2
