Reputation: 163
I'm fitting my data to several distributions in R. The goal is to see which distribution fits my data best. The code I'm using is based on: http://www.di.fc.ul.pt/~jpn/r/distributions/fitting.html
library(fitdistrplus)  # provides plotdist, descdist, fitdist and the *comp plots
my_data <- EP1sh
plotdist(my_data, histo = TRUE, demp = TRUE)
descdist(my_data, discrete=FALSE, boot=500)
fit_w <- fitdist(my_data, "weibull")
fit_g <- fitdist(my_data, "gamma")
fit_ln <- fitdist(my_data, "lnorm")
summary(fit_ln)
par(mfrow=c(2,2))
plot.legend <- c("Weibull", "gamma", "lognormal")  # same order as the fits in the lists below
denscomp(list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
cdfcomp (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
qqcomp (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
ppcomp (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
fit <- MASS::fitdistr(my_data, densfun = "lognormal")
My dataframe is a single vector, EP1sh, with around 80 entries, each between 1 and 6.
I keep getting the following errors. At first I thought they were caused by the several NA values in my dataframe, but I believe I have removed those and the problem remains. This is how I removed the NAs from EP1sh:
EP1sh <- na.omit(EP1$Number_share)
EP1sh <- data.frame(EP1sh)
The errors are:
my_data <- EP1sh
plotdist(my_data, histo = TRUE, demp = TRUE)
Error in plotdist(my_data, histo = TRUE, demp = TRUE) : data must be a numeric vector
descdist(my_data, discrete=FALSE, boot=500)
Error in descdist(my_data, discrete = FALSE, boot = 500) : data must be a numeric vector
fit_w <- fitdist(my_data, "weibull")
Error in fitdist(my_data, "weibull") : data must be a numeric vector of length greater than 1
fit_g <- fitdist(my_data, "gamma")
Error in fitdist(my_data, "gamma") : data must be a numeric vector of length greater than 1
fit_ln <- fitdist(my_data, "lnorm")
Error in fitdist(my_data, "lnorm") : data must be a numeric vector of length greater than 1
summary(fit_ln)
Error in summary(fit_ln) : object 'fit_ln' not found
Any ideas would be great!
Upvotes: 0
Views: 2369
Reputation: 296
In the plotdist(), descdist(), fitdist(), and fitdistr() functions you need to specify a vector, for example my_data$Number_share instead of the entire data frame my_data; that's assuming your "column" name is Number_share, so modify accordingly. This should hopefully fix your issue!
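To illustrate, here is a minimal sketch of the fix. It assumes made-up data in place of EP1$Number_share (the number_share vector and the EP1sh_df name are hypothetical), and it only runs the fit if fitdistrplus is installed. Note that with the code as posted in the question, data.frame(EP1sh) names the column EP1sh rather than Number_share, so that is the name used to extract the vector here.

```r
# Hypothetical stand-in for EP1$Number_share: values between 1 and 6 plus some NAs
set.seed(1)
number_share <- c(sample(1:6, 80, replace = TRUE), NA, NA)

# Same steps as in the question: drop the NAs, then wrap the result in a data frame
EP1sh <- na.omit(number_share)
EP1sh_df <- data.frame(EP1sh)

is.numeric(EP1sh_df)  # FALSE: a data frame is a list of columns, not a numeric vector
names(EP1sh_df)       # "EP1sh": the column is named after the variable

# Extract the column as a plain numeric vector before passing it to the fitting functions
my_data <- as.numeric(EP1sh_df$EP1sh)
stopifnot(is.numeric(my_data), is.vector(my_data), !anyNA(my_data))

# Attempt the fit only if fitdistrplus is available
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  fit_ln <- fitdistrplus::fitdist(my_data, "lnorm")
  print(summary(fit_ln))
}
```

Alternatively, skipping the data.frame() step altogether and passing the result of na.omit() through as.numeric() should give the same vector.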
Upvotes: 1