Reputation: 165
This question is about R packages for the Pareto distribution. I have come across two packages so far, and both give results in more or less the same range. Is there any difference between using one package or the other?
An example code snippet for Pareto:
rPareto(n, t, alpha, truncation = NULL)
An example code snippet for rpareto from the second package:
rpareto(n, location, shape = 1)
Upvotes: 1
Views: 300
Reputation: 3335
We can test whether samples from rpareto (from EnvStats) or rPareto (from Pareto) give a closer approximation to the true Pareto distribution. The true Pareto distribution has cdf F(x) = 1 - (xm/x)^a for x >= xm, where xm is the scale and a is the shape. Here t in rPareto and location in rpareto both play the role of xm, while alpha and shape play the role of a. A Kolmogorov-Smirnov test can be used to compare a sample from either function to this true cdf. Based on the p-value, we can say which sample is 'closer' to the true cdf (a higher p-value meaning closer). (Of course, this is just one way to go about doing this.)
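# Theoretical Pareto cdf with scale xm and shape a (valid for x >= xm)
paretocdf = function(x, xm, a) {
  return(1 - (xm / x)^a)
}
library(Pareto)
library(EnvStats)
nsim = 1000
pvalue1 = numeric(nsim)  # p-values for samples from Pareto::rPareto
pvalue2 = numeric(nsim)  # p-values for samples from EnvStats::rpareto
for (i in 1:nsim) {
  # unique() drops ties, which ks.test would otherwise warn about
  X = unique(rPareto(10000, 1000, 2))
  Y = unique(rpareto(10000, 1000, 2))
  pvalue1[i] = ks.test(X, paretocdf, 1000, 2)$p.value
  pvalue2[i] = ks.test(Y, paretocdf, 1000, 2)$p.value
}
# Proportion of runs in which the rPareto sample had the higher p-value
mean(pvalue1 > pvalue2)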
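If the two functions sample equally well from the same distribution, this proportion should be close to 0.5. According to my analysis, both functions are approximately the same.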
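As a cross-check that does not depend on either package, the cdf above can be inverted to draw Pareto samples in base R. The following is only a minimal sketch: rpareto_inv is a hypothetical helper written for this illustration (it is not part of Pareto or EnvStats), and the seed and sample size are arbitrary choices. It also shows the two-sample form of ks.test, which compares two samples with each other directly instead of comparing each one to the theoretical cdf.

```r
# Theoretical Pareto cdf, extended to return 0 below the scale xm
paretocdf <- function(x, xm, a) {
  ifelse(x < xm, 0, 1 - (xm / x)^a)
}

# Hypothetical helper (not from Pareto or EnvStats): inverse-transform sampler.
# If U ~ Uniform(0, 1), then xm / U^(1/a) has survival function (xm / x)^a.
rpareto_inv <- function(n, xm, a) {
  u <- runif(n)
  xm / u^(1 / a)
}

set.seed(1)  # arbitrary seed, only for reproducibility
x <- rpareto_inv(10000, 1000, 2)
y <- rpareto_inv(10000, 1000, 2)

# One-sample test: sample against the theoretical cdf
one_sample <- ks.test(x, paretocdf, 1000, 2)

# Two-sample test: two samples against each other
two_sample <- ks.test(x, y)

one_sample$p.value
two_sample$p.value
```

The same two-sample call could be applied to the package draws, for example ks.test(X, Y) with X and Y generated as in the loop above, to test directly whether the two functions produce samples from the same distribution.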
Upvotes: 1