Reputation: 553
I'm wondering if there is a more elegant way to do this.
Right now, I group all observations by Species and then summarise the median of each measurement:
median <- iris %>%
  group_by(Species) %>%
  summarise(medianSL = median(Sepal.Length),
            medianSW = median(Sepal.Width),
            medianPL = median(Petal.Length),
            medianPW = median(Petal.Width))
I also want a column (n) that shows the number of flowers in each group:
median_n <- iris %>%
  group_by(Species) %>%
  tally()
Can I combine these two code chunks, so that a single pipeline produces one table with the median values AND the total n for each species?
Upvotes: 0
Views: 294
Reputation: 886938
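We can use across() inside summarise() to loop over the numeric columns and get the median of each, and create a frequency count with n() outside the across() call. The .names glue pattern strips the lowercase letters and dots from each column name, so Sepal.Length becomes SL and the result is named medianSL.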
library(dplyr)
library(stringr)
iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric),
                   ~ median(.x, na.rm = TRUE),
                   .names = "median{str_remove_all(.col, '[a-z.]+')}"),
            n = n(), .groups = "drop")
Output:
# A tibble: 3 × 6
  Species    medianSL medianSW medianPL medianPW     n
  <fct>         <dbl>    <dbl>    <dbl>    <dbl> <int>
1 setosa          5        3.4     1.5       0.2    50
2 versicolor      5.9      2.8     4.35      1.3    50
3 virginica       6.5      3       5.55      2      50
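If you would rather keep the explicit column names from the question, a minimal sketch is to add n = n() as one more argument to the original summarise() call. The result name median_n is borrowed from the question, and .groups = "drop" is only there to return an ungrouped tibble.

```r
library(dplyr)

# Same medians as in the question, plus a per-species row count.
median_n <- iris %>%
  group_by(Species) %>%
  summarise(medianSL = median(Sepal.Length),
            medianSW = median(Sepal.Width),
            medianPL = median(Petal.Length),
            medianPW = median(Petal.Width),
            n = n(),            # number of rows in each Species group
            .groups = "drop")   # return an ungrouped tibble

median_n
```

This avoids the stringr dependency, at the cost of listing every column by hand, so across() is likely the better fit once there are more than a few columns.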
Upvotes: 4