Reputation: 41
I made a box plot comparing the ages of male Olympic swimmers by whether or not they earned a medal. How do I get a five-number summary for each box, one for "no medal" and one for "medal"? (I converted medal to a factor, medal.f.) I tried summary(age, medal.f) and summary(age ~ medal.f), but neither works, and I don't know how to separate the two groups. Any thoughts on how to do this?
Upvotes: 0
Views: 1409
Reputation: 226332
The easiest way to get this information is to save the result of your boxplot() call and extract the $stats component. Using the built-in ToothGrowth data set:
b <- boxplot(len ~ supp, data = ToothGrowth)
b$stats
##      [,1] [,2]
## [1,]  8.2  4.2
## [2,] 15.2 11.2
## [3,] 22.7 16.5
## [4,] 25.8 23.3
## [5,] 30.9 33.9
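If you only want the numbers, a minimal sketch along the same lines is to pass plot = FALSE so nothing is drawn, and to label the matrix using the $names component of the result. The row labels below are my own wording, not something boxplot() returns. Note that the first and last rows are the whisker ends, which equal the minimum and maximum only when a group has no outliers.

```r
# Compute the box plot statistics without drawing the plot
b <- boxplot(len ~ supp, data = ToothGrowth, plot = FALSE)

# Copy the statistics and label them: columns get the group names
# stored in b$names; the row labels are illustrative, chosen here
stats <- b$stats
dimnames(stats) <- list(
  c("lower whisker", "lower hinge", "median", "upper hinge", "upper whisker"),
  b$names
)
stats
```

Any outlying points, if present, are reported separately in b$out, with their group index in b$group.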
More generally, you can do this by hand with something like:
with(data, lapply(split(age, medal), boxplot.stats))
There are many other solutions involving by() or the plyr, dplyr, or data.table packages.
Again using ToothGrowth:
(bps <- with(ToothGrowth, lapply(split(len, supp), boxplot.stats)))
## $OJ
## $OJ$stats
## [1]  8.2 15.2 22.7 25.8 30.9
##
## $OJ$n
## [1] 30
##
## $OJ$conf
## [1] 19.64225 25.75775
##
## $OJ$out
## numeric(0)
##
##
## $VC
## $VC$stats
## [1]  4.2 11.2 16.5 23.3 33.9
##
## $VC$n
## [1] 30
##
## $VC$conf
## [1] 13.00955 19.99045
##
## $VC$out
## numeric(0)
If you just want the 5-number summaries, you can extract them as follows:
sapply(bps, "[[", "stats")
##        OJ   VC
## [1,]  8.2  4.2
## [2,] 15.2 11.2
## [3,] 22.7 16.5
## [4,] 25.8 23.3
## [5,] 30.9 33.9
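As one more base-R variant of the split-and-apply idea above, here is a sketch using fivenum() in place of boxplot.stats(). fivenum() returns the true minimum and maximum rather than the whisker ends, so the two approaches can differ when a group has outliers; for ToothGrowth there are none. The name fn and the row labels are illustrative choices of mine.

```r
# Five-number summary (min, lower hinge, median, upper hinge, max)
# for each level of supp; sapply() binds the results into a matrix
fn <- with(ToothGrowth, sapply(split(len, supp), fivenum))

# Label the rows for readability (labels chosen here, not returned by fivenum)
rownames(fn) <- c("min", "lower hinge", "median", "upper hinge", "max")
fn
```

For the data in the question, the analogous call would presumably be sapply(split(age, medal.f), fivenum), assuming age and medal.f are vectors available in your workspace as the question suggests.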
Upvotes: 6