Reputation: 113
I am trying to break a dataset into quantiles within each group.
The code below works fine when I cut using seq(0, 1, 0.5),
but when I change the probabilities to seq(0, 1, 0.2)
it fails with:
Error in cut.default(x = fwd_quarts$v, breaks = quantile(fwd_quarts$v, : 'breaks' are not unique
I have tried different code but can't get away from the error. How do I adjust this so that, when it is applied to larger data sets, the quantiles are created without the error?
library(plyr)  # provides ddply()
df <- data.frame(g = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3),
                 v = c(1, 4, 4, 5, NA, 2, 6, NA, 7, 8))
df <- df[complete.cases(df), ]  # drop rows with NA
ddf <- ddply(df, "g", function(fwd_quarts) {
  # na.rm is an argument of quantile(), not of cut()
  eps_quartile <- cut(x = fwd_quarts$v,
                      breaks = quantile(fwd_quarts$v, probs = seq(0, 1, 0.5), na.rm = TRUE),
                      labels = FALSE, include.lowest = TRUE)
  data.frame(eps_quartile = eps_quartile)
})
df <- cbind(df, fwde_quart = ddf$eps_quartile)
Upvotes: 5
Views: 10120
Reputation: 2859
I ran into the same problem with leaflet: if there are not enough observations to build the map, it gives the same error. As a workaround, I combine the clusters that have few observations.
Upvotes: 0
Reputation: 31
This has nothing to do with ddply.
If your data does not produce unique breaks, you can make them unique by wrapping the breaks in a call to unique():
breaks = unique(quantile(fwd_quarts$v, probs = seq(0, 1, 0.2)))
However, this will lower the number of levels from what you originally desired.
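To see the effect concretely, here is a minimal base R sketch using the first group of the question's data. The variable names (v, brk, brk_u, bins) are only illustrative.

```r
# Values of group 1 from the question
v <- c(1, 4, 4, 5)

# The quintile breaks contain the value 4 more than once,
# which is what cut() complains about
brk <- quantile(v, probs = seq(0, 1, 0.2))
anyDuplicated(brk) > 0

# Dropping the duplicate breaks lets cut() run,
# but leaves 4 bins rather than the 5 requested
brk_u <- unique(brk)
bins <- cut(v, breaks = brk_u, labels = FALSE, include.lowest = TRUE)
length(brk_u) - 1
```

Note that the bin labels are then numbered relative to the reduced set of breaks, so a label of 4 in this group is its top bin even though 5 was the intended maximum.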
Generally speaking, if you have data like c(1,1,1,2) you can't break it into 3 quantile groups, because several of the quantile breaks fall on the same value. The number of groups should be less than or equal to the number of unique values in your data. HTH.
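If this needs to run unattended on larger data, one option is to wrap the idea in a small helper. The sketch below is only an assumption about how that might look: safe_quantile_bins is a hypothetical name, not a function from base R or plyr, and it uses ave() from base R instead of ddply to apply the binning per group.

```r
# Hypothetical helper: bin x into at most n_groups quantile bins,
# silently using fewer bins when the quantile breaks are not unique.
safe_quantile_bins <- function(x, n_groups) {
  probs <- seq(0, 1, length.out = n_groups + 1)
  brk <- unique(quantile(x, probs = probs, na.rm = TRUE))
  if (length(brk) < 2) {
    # All non-missing values are identical: one bin is the only option
    return(ifelse(is.na(x), NA_integer_, 1L))
  }
  cut(x, breaks = brk, labels = FALSE, include.lowest = TRUE)
}

# The example above: 2 distinct values, 3 groups requested
x <- c(1, 1, 1, 2)
res <- safe_quantile_bins(x, 3)
length(unique(res)) <= length(unique(x))

# Per-group use on the question's data (after dropping NA rows)
df <- data.frame(g = c(1, 1, 1, 1, 2, 2, 3, 3),
                 v = c(1, 4, 4, 5, 2, 6, 7, 8))
df$fwde_quart <- ave(df$v, df$g, FUN = function(x) safe_quantile_bins(x, 5))
```

The trade-off is the one described above: groups with many tied values end up with fewer bins than requested, so bin numbers are not directly comparable across groups.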
Upvotes: 3