user2821333

Reputation: 41

Code for 5 number summary

Age vs. Medal Plot

I made a box plot comparing the ages of male Olympic swimmers, grouped by whether or not they earned a medal. How can I get a five-number summary for each box, one for the no-medal group and one for the medal group? I converted medal to a factor (medal.f). I tried summary(age,medal.f) and summary(age~medal.f), but neither gives a per-group summary, and I don't know how to separate the two groups. Any thoughts on how to do this?

Upvotes: 0

Views: 1409

Answers (1)

Ben Bolker

Reputation: 226332

The easiest way to get this information is to save the result of your boxplot() call and extract the $stats component. Using the built-in ToothGrowth data set,

b <- boxplot(len~supp,data=ToothGrowth)
b$stats
##      [,1] [,2]
## [1,]  8.2  4.2
## [2,] 15.2 11.2
## [3,] 22.7 16.5
## [4,] 25.8 23.3
## [5,] 30.9 33.9
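
For reading this matrix: each column is one group (in the order given by b$names), and the rows are, top to bottom, the lower whisker end, lower hinge, median, upper hinge, and upper whisker end. A minimal sketch that attaches those labels, assuming you don't need the plot itself (plot = FALSE only suppresses drawing; the row labels are illustrative and not something boxplot() supplies):

```r
# Compute the box plot statistics without drawing anything
b <- boxplot(len ~ supp, data = ToothGrowth, plot = FALSE)

# Copy the stats matrix and label it; the row names are illustrative labels
s <- b$stats
dimnames(s) <- list(
  c("lower whisker", "lower hinge", "median", "upper hinge", "upper whisker"),
  b$names  # group labels, here "OJ" and "VC"
)
s
```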

More generally, you can do this by hand with something like the following (here data stands for your data frame and medal for your grouping factor):

with(data,lapply(split(age,medal),boxplot.stats))

There are many other solutions involving by() or the plyr, dplyr, data.table packages ...

Again using ToothGrowth:

(bps <- with(ToothGrowth,lapply(split(len,supp),boxplot.stats)))
$OJ
$OJ$stats
[1]  8.2 15.2 22.7 25.8 30.9

$OJ$n
[1] 30

$OJ$conf
[1] 19.64225 25.75775

$OJ$out
numeric(0)


$VC
$VC$stats
[1]  4.2 11.2 16.5 23.3 33.9

$VC$n
[1] 30

$VC$conf
[1] 13.00955 19.99045

$VC$out
numeric(0)

If you just want the 5-number summaries, you can extract them as follows:

 sapply(bps,"[[","stats")
       OJ   VC
[1,]  8.2  4.2
[2,] 15.2 11.2
[3,] 22.7 16.5
[4,] 25.8 23.3
[5,] 30.9 33.9
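
As one instance of the other base R routes mentioned above, fivenum() can also be applied per group directly. This is a sketch rather than a drop-in replacement: fivenum() reports the actual minimum and maximum, while boxplot.stats() reports the whisker ends, so rows 1 and 5 agree only when no points are flagged as outliers (as is the case for ToothGrowth).

```r
# Five-number summary (min, lower hinge, median, upper hinge, max) per group
five <- with(ToothGrowth, sapply(split(len, supp), fivenum))

# Row names are illustrative labels, added for readability
rownames(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
five
```

With your own data, the same pattern would be with(data, sapply(split(age, medal), fivenum)), where data and medal are placeholders for your data frame and grouping factor.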

Upvotes: 6
