maria118code

Reputation: 163

Fitting data to distributions in R: errors

I'm fitting my data to several distributions in R. The goal is to see which distribution fits my data best. The code I'm using is based on: http://www.di.fc.ul.pt/~jpn/r/distributions/fitting.html

my_data <- EP1sh

plotdist(my_data, histo = TRUE, demp = TRUE)

descdist(my_data, discrete=FALSE, boot=500)
fit_w  <- fitdist(my_data, "weibull")
fit_g  <- fitdist(my_data, "gamma")
fit_ln <- fitdist(my_data, "lnorm")
summary(fit_ln)

par(mfrow=c(2,2))
plot.legend <- c("Weibull", "gamma", "lognormal")  # same order as list(fit_w, fit_g, fit_ln)
denscomp(list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
cdfcomp (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
qqcomp  (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)
ppcomp  (list(fit_w, fit_g, fit_ln), legendtext = plot.legend)


fit = fitdistr(my_data, densfun="lognormal")

My data frame EP1sh has a single column with around 80 entries, each with a value between 1 and 6.

I keep getting the following errors. I first thought it was because I had several NA values in my data, but I think I solved that and the problem remains. This is how I removed the NA values from EP1sh:

EP1sh <- na.omit(EP1$Number_share)
EP1sh <- data.frame(EP1sh)

The errors are:

my_data <- EP1sh
plotdist(my_data, histo = TRUE, demp = TRUE)

Error in plotdist(my_data, histo = TRUE, demp = TRUE) : data must be a numeric vector

descdist(my_data, discrete=FALSE, boot=500)

Error in descdist(my_data, discrete = FALSE, boot = 500) : data must be a numeric vector

fit_w <- fitdist(my_data, "weibull")

Error in fitdist(my_data, "weibull") : data must be a numeric vector of length greater than 1

fit_g <- fitdist(my_data, "gamma")

Error in fitdist(my_data, "gamma") : data must be a numeric vector of length greater than 1

fit_ln <- fitdist(my_data, "lnorm")

Error in fitdist(my_data, "lnorm") : data must be a numeric vector of length greater than 1

summary(fit_ln)

Error in summary(fit_ln) : object 'fit_ln' not found

Any ideas would be great !

Upvotes: 0

Views: 2369

Answers (1)

Supertasty

Reputation: 296

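plotdist(), descdist(), fitdist(), and fitdistr() all expect a numeric vector, not a data frame. Pass the column rather than the entire data frame, for example my_data$Number_share instead of my_data. That assumes your column is named Number_share; because you built the data frame with data.frame(EP1sh), the column is most likely named EP1sh, so check names(my_data) and modify accordingly. The final "object 'fit_ln' not found" error is just a consequence of the earlier fitdist() call failing. This should hopefully fix your issue!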
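As a minimal sketch of the difference, the snippet below builds a made-up stand-in for EP1$Number_share (the values, `number_share`, `as_frame`, and `as_vector` are all hypothetical, not the asker's data) and checks what the fitting functions would see. The fitdist() call is guarded so it only runs if fitdistrplus is installed.

```r
# Hypothetical stand-in for EP1$Number_share: 80 values between 1 and 6 plus two NAs
set.seed(1)
number_share <- c(runif(80, min = 1, max = 6), NA, NA)

# Wrapping the NA-free values in data.frame() is what produces an object
# that is no longer a numeric vector
as_frame <- data.frame(EP1sh = na.omit(number_share))

# Keeping the values as a plain numeric vector (as.numeric() also drops
# the "na.action" attribute that na.omit() attaches)
as_vector <- as.numeric(na.omit(number_share))

is.numeric(as_frame)        # a data frame is a list of columns, not a numeric vector
is.numeric(as_frame$EP1sh)  # the column itself is a numeric vector
is.numeric(as_vector)

# Fit only when fitdistrplus is available
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  fit_ln <- fitdistrplus::fitdist(as_vector, "lnorm")
  print(summary(fit_ln))
}
```

Either `as_frame$EP1sh` or `as_vector` can be passed to the fitting functions; skipping the data.frame() step altogether is probably the simpler option here.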

Upvotes: 1
