Reputation: 1740
I have a dataframe that looks like this:
Var1 Var2 Var3
100 B 15
200 A 16
700 A 13
500 C 10
This is just a preview; the actual data has 10,000+ rows.
I am doing the following:
data %>%
  group_by(Var2) %>%
  mutate(Tercile = fabricatr::split_quantile(Var3, 3)) %>%
  group_by(Var2, Tercile) %>%
  summarise(Var1 = mean(Var1))
This results in the following error message:
The `x` argument provided to quantile split must be non-null and length at least 2.
As far as I understand, this means that for some values of Var2 there is only 1 unique value of Var3, so the tercile split cannot be accomplished. My first question is: Is this interpretation correct? I am confused by the part that says "length at least 2", because I would expect the length to be at least 3 to perform a tercile split, right?
If the interpretation is correct, my second question is: How can I automate the exclusion of such cases? I don't have nearly enough time to go through some 300 values of Var2 and examine the values of Var3. I need a coding solution that excludes such levels of Var2, so that the error above doesn't appear.
Upvotes: 1
Views: 95
Reputation: 388982
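As the error message says, split_quantile needs a vector of length at least 2. The message refers to the length of the vector, i.e. the number of rows in the group, rather than the number of unique values of Var3. So we can remove the groups that have fewer than 2 rows and then apply the function: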
library(dplyr)
data %>%
  group_by(Var2) %>%
  # keep only the Var2 groups that have at least 2 rows
  filter(n() >= 2) %>%
  mutate(Tercile = fabricatr::split_quantile(Var3, 3)) %>%
  group_by(Var2, Tercile) %>%
  summarise(Var1 = mean(Var1))
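To see which levels of Var2 the filter would drop before running the full pipeline, a quick check of group sizes may help. This is only a minimal sketch on toy data copied from the preview in the question; the names `group_sizes`, `dropped`, `kept` and `n_rows` are illustrative and not from the original post.

```r
library(dplyr)

# Toy data mirroring the preview in the question (not the real 10,000+ row data)
data <- data.frame(
  Var1 = c(100, 200, 700, 500),
  Var2 = c("B", "A", "A", "C"),
  Var3 = c(15, 16, 13, 10)
)

# Number of rows per level of Var2
group_sizes <- count(data, Var2, name = "n_rows")

# Levels of Var2 that filter(n() >= 2) would remove
dropped <- filter(group_sizes, n_rows < 2)
print(dropped)

# Rows that survive the filter and would be passed on to split_quantile
kept <- data %>%
  group_by(Var2) %>%
  filter(n() >= 2) %>%
  ungroup()
print(kept)
```

On the preview data only the group A has two rows, so B and C show up in `dropped`. On the real data the same check lists every level of Var2 that is excluded, without having to inspect the 300 levels by hand.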
Upvotes: 1