Reputation: 1875
I'm interested in calculating a quantile
of a column in a data frame for only a subset of rows, selected by the value of another column.
For example, I have a new_user_indicator
column with "Y" or "N", and want to know the quantile for the "Y" group. Currently I am doing
subset_df <- subset(carddata, new_user_indicator == "Y")
quantile(subset_df$limit_amount, .25)
Is there a way to do this in one command rather than creating a subsetted data frame?
I looked at this to see if it could help but wasn't able to decipher part of the code.
Thanks
Upvotes: 0
Views: 815
Reputation: 73285
The quantile function itself does not let you operate on a subset, so you do need some way to extract the subset data.
However, extracting a whole subset data frame, as you did, is not recommended. quantile
accepts a vector, so you only need to subset the column rather than the whole data frame.
quantile(with(carddata, limit_amount[new_user_indicator == "Y"]), 0.25)
The with
function lets you refer to the columns without repeating the data frame name; without it you need
quantile(carddata$limit_amount[carddata$new_user_indicator == "Y"], 0.25)
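To see that the two forms agree, here is a minimal sketch on a toy data frame. The values are made up for illustration and only stand in for the real carddata; is_new, q_with and q_dollar are hypothetical names.

```r
# Toy data standing in for the asker's carddata (values are invented)
carddata <- data.frame(
  new_user_indicator = c("Y", "N", "Y", "Y", "N", "Y"),
  limit_amount       = c(100, 500, 200, 300, 700, 400)
)

# Logical index that is TRUE for the "Y" rows
is_new <- carddata$new_user_indicator == "Y"

# Both forms subset only the column, never the whole data frame
q_with   <- quantile(with(carddata, limit_amount[new_user_indicator == "Y"]), 0.25)
q_dollar <- quantile(carddata$limit_amount[is_new], 0.25)

print(q_with)
print(q_dollar)
```

With the default quantile type, both calls give 175 for this toy data, since the "Y" values are 100, 200, 300 and 400.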
update
If you need to do this repeatedly, write a function (change the function name foo
to whatever you like)
foo <- function(df, out_var, in_var, in_level, prob) {
quantile(df[[out_var]][df[[in_var]] == in_level], prob)
}
Then you can do:
foo(carddata, "limit_amount", "new_user_indicator", "Y", 0.25)
I am assuming you have another level "N", so for that level you can do
foo(carddata, "limit_amount", "new_user_indicator", "N", 0.25)
Here, out_var
and in_var
are column names (hence strings) for the output variable and the input variable. in_level
is the level of the input variable to select, and prob
is the quantile probability passed on to quantile.
a more powerful way
If you want the 0.25 quantile for all levels of the input variable, calling my function once per level is clumsy. Use tapply
tapply(carddata$limit_amount, carddata$new_user_indicator, FUN = quantile, probs = 0.25)
tapply(x1, x2, FUN, ...)
splits x1 into groups according to the levels of x2
and applies FUN(group, ...)
to each group. If you have 10 levels in x2
, then you get the 0.25 quantile for each of them.
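As a rough illustration of the grouped version, the sketch below runs tapply on the same kind of toy data (invented values, not the real carddata; q_by_group is a hypothetical name).

```r
# Toy data standing in for the asker's carddata (values are invented)
carddata <- data.frame(
  new_user_indicator = c("Y", "N", "Y", "Y", "N", "Y"),
  limit_amount       = c(100, 500, 200, 300, 700, 400)
)

# One 0.25 quantile per level of new_user_indicator;
# probs is forwarded to quantile() through tapply's "..."
q_by_group <- tapply(carddata$limit_amount, carddata$new_user_indicator,
                     FUN = quantile, probs = 0.25)

print(q_by_group)
```

The result is indexed by level, so q_by_group["Y"] picks out the value for the "Y" group.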
Upvotes: 1