Reputation: 2533
I have a dataframe (df) that looks as follows:
Date Model Color Value Samples
6/19/17 Gold Blue 0.50 500
6/19/17 Gold Red 1.25 449
6/19/17 Silver Blue 0.75 1320
6/19/17 Silver Blue 1.50 103
6/19/17 Gold Red 0.70 891
6/19/17 Gold Blue 0.41 18103
6/19/17 Copper Blue 0.83 564
I can output the percentile data for each Color
variable using the following:
df_subset <- subset(df, df$Color == 'Blue')
quantile(df_subset$Value, c(0.50, 0.99, 0.999, 0.9999))
But, how would I do the following?
a) add the Samples
column to the output
b) add multiple rows to the output (i.e. one row for each distinct value of the Model
variable)
An example would be as follows:
| # Samples | 50th percentile | 99th percentile | 99.9th percentile | 99.99th percentile
Gold
Silver
Copper
Thanks in advance for your help!
Upvotes: 0
Views: 1262
Reputation: 3230
library(data.table)
dat <- data.table(Date = "6/19/17",
Model = c("Gold", "Gold", "Silver", "Silver", "Gold", "Gold", "Copper"),
Color = c("Blue", "Red", "Blue", "Blue", "Red", "Blue", "Blue"),
Value = c(0.5, 1.25, .75, 1.5, .7, .41, .83),
Samples = c(500, 449, 1320, 103, 891, 18103, 564))
dat[, .(Samples = sum(Samples),
`50th percentile` = quantile(Value, probs = c(0.5)),
`99th percentile` = quantile(Value, probs = c(0.99)),
`99.9th percentile` = quantile(Value, probs = c(0.999)),
`99.99th percentile` = quantile(Value, probs = c(0.9999))),
by = Model]
Results in:
Model Samples 50th percentile 99th percentile 99.9th percentile 99.99th percentile
1: Gold 19943 0.600 1.2335 1.24835 1.249835
2: Silver 1423 1.125 1.4925 1.49925 1.499925
3: Copper 564 0.830 0.8300 0.83000 0.830000
Upvotes: 1
Reputation: 39154
A solution using dplyr
. dt2
is the final output.
dt <- read.table(text = "Date Model Color Value Samples
6/19/17 Gold Blue 0.50 500
6/19/17 Gold Red 1.25 449
6/19/17 Silver Blue 0.75 1320
6/19/17 Silver Blue 1.50 103
6/19/17 Gold Red 0.70 891
6/19/17 Gold Blue 0.41 18103
6/19/17 Copper Blue 0.83 564",
header = TRUE, stringsAsFactors = FALSE)
library(dplyr)
dt2 <- dt %>%
group_by(Model) %>%
summarise(Samples = sum(Samples),
`50th percentile` = quantile(Value, 0.5),
`99th percentile` = quantile(Value, 0.99),
`99.9th percentile` = quantile(Value, 0.999),
`99.99th percentile` = quantile(Value, 0.9999))
Upvotes: 0