Reputation: 757
I split a data frame and recombined it using the ddply function. I applied the fivenum function so that I could see the minimum, first quartile, median, third quartile, and maximum of each variable.
d <- ddply(sara_data_gathered, "Variable", summarise, fivenum = fivenum(Percent))
I'm wondering now how I can spread this data frame so that each value (min, first, median...) is presented as its own variable. So I'm looking for a table with six total columns. I thought tidyr might be a good place to look but I don't think I have a labeled column for this. So first I'm trying to label a new column...
I tried using mutate and the rep command but you can see from the output it's not working :/
d <- d %>%
  mutate(Position = rep(c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum"), each = 5))
d
Variable fivenum Position
Aromatics 1.0 Minimum
Aromatics 19.0 Minimum
Aromatics 28.0 Minimum
Aromatics 41.0 Minimum
Aromatics 67.0 Minimum
Asphaltenes 0.0 First Quartile
Asphaltenes 1.0 First Quartile
Asphaltenes 8.0 First Quartile
Asphaltenes 30.5 First Quartile
Asphaltenes 93.0 First Quartile
Upvotes: 1
Views: 417
Reputation: 3269
An alternative would be to simply use the tapply function from base R:
do.call(rbind, tapply(mtcars$mpg, mtcars$cyl, summary))
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 4 21.4 22.80 26.0 26.66364 30.40 33.9
# 6 17.8 18.65 19.7 19.74286 21.00 21.4
# 8 10.4 14.40 15.2 15.10000 16.25 19.2
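Note that summary adds a Mean column and computes quartiles with quantile, so it is not quite the same as fivenum, which returns Tukey's hinges. If you want exactly the five numbers from the question, the same pattern should work with fivenum. This is only a sketch on mtcars; the `labels` vector and the `five` name are my own choices, since fivenum returns an unnamed vector.

```r
# Five-number summary of mpg per cylinder group, one row per group (base R only).
# `labels` is an illustrative naming choice; fivenum() returns unnamed values.
labels <- c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")

# tapply() gives one fivenum() vector per group; rbind stacks them into a matrix
five <- do.call(rbind, tapply(mtcars$mpg, mtcars$cyl, fivenum))
colnames(five) <- labels

five
```

The result is a matrix with the group values as row names. Wrap it in as.data.frame if you need a data frame, and swap in your own columns (Percent and Variable in the question) for mpg and cyl.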
Upvotes: 2
Reputation: 389235
plyr has been retired; you can use dplyr instead, and if you are on dplyr 1.0.0 or later you can return multiple rows per group in summarise. We can then get the data in wide format using pivot_wider.
library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  summarise(fivenum = fivenum(mpg),
            Position = c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")) %>%
  tidyr::pivot_wider(names_from = Position, values_from = fivenum)
# cyl Minimum `First Quartile` Median `Third Quartile` Maximum
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 21.4 22.8 26 30.4 33.9
#2 6 17.8 18.6 19.7 21 21.4
#3 8 10.4 14.3 15.2 16.4 19.2
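On newer dplyr (1.1.0 and later), returning more than one row per group from summarise is deprecated and reframe is the suggested replacement. A sketch of the same idea with reframe, assuming dplyr >= 1.1.0 and tidyr are installed; the `labels` and `wide` names are just for illustration:

```r
library(dplyr)
library(tidyr)

# Illustrative labels, in the order fivenum() returns its values
labels <- c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")

wide <- mtcars %>%
  group_by(cyl) %>%
  # reframe() may return any number of rows per group (here, five)
  reframe(Position = labels, fivenum = fivenum(mpg)) %>%
  # one column per label, one row per cyl
  pivot_wider(names_from = Position, values_from = fivenum)

wide
```

reframe always returns an ungrouped result, so there is no need for ungroup before pivot_wider.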
Upvotes: 4