Reputation: 8626
I am wondering what the best way would be to add, for example, columns of quantiles to a dataset. I was thinking of using the ave() function for that, something like ave(iris$Sepal.Length, iris$Species, FUN=quantile), but in this case ave() merges the values returned by quantile() (which here returns 5 values per subset) and cuts them to the length of iris...
Thanks in advance for suggestions!
Upvotes: 3
Views: 349
Reputation: 226192
There are a lot of SO questions on this general topic, recommending various uses of ave(), aggregate(), plyr, reshape2::dcast(), or data.table, depending on personal preference, readability, compactness, flexibility, speed ... Here's a simple solution with aggregate() that seems to do what you want:
(aa <- aggregate(Sepal.Length~Species,data=iris,quantile))
##      Species Sepal.Length.0% Sepal.Length.25% Sepal.Length.50% Sepal.Length.75%
## 1     setosa           4.300            4.800            5.000            5.200
## 2 versicolor           4.900            5.600            5.900            6.300
## 3  virginica           4.900            6.225            6.500            6.900
##   Sepal.Length.100%
## 1             5.800
## 2             7.000
## 3             7.900
edit: re-reading/based on comment, this is not what you want: you need the summarized values replicated for each row, not just once per group.
Perhaps
merge(iris,aa,by="Species")
although that gives a slightly weird data frame (the last "column" is actually a matrix).
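To see what "the last column is actually a matrix" means, here is a quick inspection. This is only a sketch; aa is rebuilt so the snippet runs on its own.

```r
# Rebuild the aggregate() result from above.
aa <- aggregate(Sepal.Length ~ Species, data = iris, FUN = quantile)

# aa has only two columns: Species, plus one matrix-valued column
# that holds all five quantiles for each group.
ncol(aa)                    # 2
is.matrix(aa$Sepal.Length)  # TRUE
dim(aa$Sepal.Length)        # 3 5  (3 species by 5 quantiles)
colnames(aa$Sepal.Length)   # "0%" "25%" "50%" "75%" "100%"
```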
It's a little bit magical, but
merge(iris,with(aa,data.frame(Species,Sepal.Length)))
is better -- it unpacks the weird data frame returned by aggregate() a bit more (the names are still a bit wonky).
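If you would rather stay with ave() as in the question, one possible workaround is to call it once per quantile, so that FUN returns a single value per group and ave() can recycle it across that group's rows. This is a rough sketch, not necessarily the best way; the iris_q data frame and the q0 ... q100 column names are just illustrative choices.

```r
# Probabilities to compute (these match the defaults of quantile()).
probs <- c(0, 0.25, 0.5, 0.75, 1)

# Work on a copy so the built-in iris data set is left untouched.
iris_q <- iris

for (p in probs) {
  # Ask for one quantile at a time: FUN then returns a single number
  # per Species, which ave() repeats for every row of that group.
  iris_q[[paste0("q", 100 * p)]] <- ave(
    iris$Sepal.Length, iris$Species,
    FUN = function(x) quantile(x, probs = p, names = FALSE)
  )
}

head(iris_q)
```

Each row then carries the five quantiles of its own species, with plain column names and no merge step.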
Upvotes: 4
Reputation: 162321
With the data.table package:
library(data.table)
dt <- data.table(iris)
dt[, paste0("q", 25*(0:4)) := as.list(quantile(Sepal.Length)), by="Species"]
Upvotes: 3