hachiko

Reputation: 757

R spread ddply fivenum result

I split a data frame and recombined it using the ddply function. I applied the fivenum function so that I could see the minimum, first quartile, median, third quartile, and maximum of each variable.

d <- ddply(sara_data_gathered, "Variable", summarise, fivenum = fivenum(Percent))

I'm wondering now how I can spread this data frame so that each value (min, first, median...) is presented as its own variable. So I'm looking for a table with six total columns. I thought tidyr might be a good place to look but I don't think I have a labeled column for this. So first I'm trying to label a new column...

I tried using mutate and the rep command but you can see from the output it's not working :/

d <- d %>% 
  mutate(Position = rep(c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum"), each = 5))
d

Variable fivenum Position
Aromatics 1.0 Minimum
Aromatics 19.0 Minimum
Aromatics 28.0 Minimum
Aromatics 41.0 Minimum
Aromatics 67.0 Minimum
Asphaltenes 0.0 First Quartile
Asphaltenes 1.0 First Quartile
Asphaltenes 8.0 First Quartile
Asphaltenes 30.5 First Quartile
Asphaltenes 93.0 First Quartile

Upvotes: 1

Views: 417

Answers (2)

AlexB

Reputation: 3269

An alternative is to use the tapply function from base R with summary, which also reports the mean:

do.call(rbind, tapply(mtcars$mpg, mtcars$cyl, summary))

#    Min. 1st Qu. Median     Mean 3rd Qu. Max.
# 4 21.4   22.80   26.0 26.66364   30.40 33.9
# 6 17.8   18.65   19.7 19.74286   21.00 21.4
# 8 10.4   14.40   15.2 15.10000   16.25 19.2
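
If exactly the five fivenum values from the question are wanted rather than the six from summary, the same pattern should work with fivenum swapped in. This is only a sketch on mtcars: fivenum returns an unnamed vector, so the column labels below are added by hand, and `res` is just an illustrative name.

```r
# Same tapply/rbind pattern as above, but with fivenum() instead of summary().
res <- do.call(rbind, tapply(mtcars$mpg, mtcars$cyl, fivenum))

# fivenum() does not name its output, so label the columns manually.
colnames(res) <- c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")
res
```

The quartile columns can differ slightly from the summary output because fivenum uses Tukey's hinges, while summary uses the default quantile method.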

Upvotes: 2

Ronak Shah

Reputation: 389235

plyr has been retired, so you can use dplyr instead. If you are on dplyr 1.0.0 or later, summarise can return multiple rows per group. We can then get the data in wide format using pivot_wider.

library(dplyr)

mtcars %>%
  group_by(cyl) %>%
  summarise(fivenum = fivenum(mpg), 
            Position = c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")) %>%
  tidyr::pivot_wider(names_from = Position, values_from = fivenum)

#    cyl Minimum `First Quartile` Median `Third Quartile` Maximum
#  <dbl>   <dbl>            <dbl>  <dbl>            <dbl>   <dbl>
#1     4    21.4             22.8   26               30.4    33.9
#2     6    17.8             18.6   19.7             21      21.4
#3     8    10.4             14.3   15.2             16.4    19.2
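
On dplyr 1.1.0 and later, returning several rows per group from summarise is deprecated in favour of reframe, so a variant along these lines may be preferable there. It is a sketch, assuming dplyr >= 1.1.0 and tidyr are installed; `labels` and `wide` are illustrative names only.

```r
library(dplyr)
library(tidyr)

# Labels in the order fivenum() returns its values.
labels <- c("Minimum", "First Quartile", "Median", "Third Quartile", "Maximum")

wide <- mtcars %>%
  group_by(cyl) %>%
  # reframe() allows any number of rows per group (here five).
  reframe(fivenum = fivenum(mpg), Position = labels) %>%
  # One column per label, one row per cyl.
  pivot_wider(names_from = Position, values_from = fivenum)

wide
```

For the data in the question, the same pipeline should apply with sara_data_gathered, Variable and Percent in place of mtcars, cyl and mpg.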

Upvotes: 4
