Reputation: 85
I am looking for some guidance regarding histogram plots.
Let's assume I have this vector (called CF):
[,1]
[1,] 2275.351
[2,] 2269.562
[3,] 1925.700
[4,] 1904.195
[5,] 1974.039
I use the following call to plot this vector as a histogram:
hist(CF)
Let us now assume I have 10 000 simulated value estimates for a property. I want to plot those in a histogram (or a similar plot) where the x-axis shows the probabilities.
Such a plot would give me the opportunity to state something like: "with 55% probability, the value of the property exceeds $15 million."
Suggestions?
Upvotes: 2
Views: 515
Reputation: 11893
I agree with @Stibu that you want the CDF. When you are talking about a set of realized data, we refer to this as the empirical cumulative distribution function (ECDF). In R, the basic function call for this is ?ecdf:
CF <- read.table(text="[1,] 2275.351
[2,] 2269.562
[3,] 1925.700
[4,] 1904.195
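[5,] 1974.039", header=FALSE)
CF <- as.vector(CF[,-1])  # drop the row-label column, keep the values
CF # [1] 2275.351 2269.562 1925.700 1904.195 1974.039
dev.new()  # opens a new plot window on any platform; windows() exists only on Windows
plot(ecdf(CF))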
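Since ecdf() returns a function, you can also evaluate it directly instead of reading the value off the plot. Here is a minimal sketch using the five values above; the threshold of 2000 is an arbitrary number picked for illustration, not something from the question:

```r
CF <- c(2275.351, 2269.562, 1925.700, 1904.195, 1974.039)

Fn <- ecdf(CF)                  # step function giving P(value <= x)
threshold <- 2000               # hypothetical threshold, for illustration only
p_exceed <- 1 - Fn(threshold)   # P(value > threshold)
p_exceed                        # 0.4, since two of the five values exceed 2000
```

The same number comes out of mean(CF > threshold), which may be easier to read if you only need a single threshold.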
If you are willing to download the fitdistrplus package, there are a lot of fancy versions you can play with:
library(fitdistrplus)
dev.new()  # portable alternative to windows()
plotdist(CF)
fdn <- fitdist(CF, "norm")
fdw <- fitdist(CF, "weibull")
summary(fdw)
# Fitting of the distribution ' weibull ' by maximum likelihood
# Parameters :
# estimate Std. Error
# shape 13.59732 4.833605
# scale 2149.24253 74.958140
# Loglikelihood: -32.89089 AIC: 69.78178 BIC: 69.00065
# Correlation matrix:
# shape scale
# shape 1.0000000 0.3328979
# scale 0.3328979 1.0000000
dev.new()
plot(fdn)
dev.new()
cdfcomp(list(fdn,fdw), legendtext=c("Normal","Weibull"), lwd=2)
Upvotes: 2
Reputation: 15897
What you probably want is the cumulative distribution function (CDF). It has probability on the y-axis (not x, as you asked), but since this is the standard way to represent the information that you want, it is best to use this curve.
As an example, I produced 10,000 values from a standard normal distribution and then constructed the CDF:
set.seed(1)  # make the example reproducible
CF <- rnorm(10000)
breaks <- seq(-4,4,0.5)
# proportion of simulated values at or below each break
CDF <- sapply(breaks,function(b) mean(CF<=b))
plot(breaks,CDF,type="l")
From the plot you can read off, for instance, that about 50% of the drawn values lie below zero.
If you prefer a bar plot, you can plot with
barplot(CDF,names.arg=breaks)
I don't know your data in detail, so I cannot give you more precise code. But basically, you will have to pick a reasonable set of breaks and then apply the code above.
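To get a statement of the form "with 55% probability, the value exceeds X", you can go the other way and ask for the matching quantile. This is only a sketch: the data are made up (normal draws with mean 15 and standard deviation 3, read as millions of dollars), and the names values and v55 are just for illustration:

```r
set.seed(1)
# Hypothetical stand-in for the 10 000 simulated property values (in $ millions)
values <- rnorm(10000, mean = 15, sd = 3)

# The value exceeded with 55% probability is the 45% quantile
v55 <- quantile(values, probs = 1 - 0.55, names = FALSE)
share_above <- mean(values > v55)   # check: close to 0.55

# Exceedance curve: P(value > x) on the y-axis
xs <- sort(values)
plot(xs, 1 - ecdf(values)(xs), type = "s",
     xlab = "Property value ($ millions)", ylab = "P(value > x)")
abline(h = 0.55, v = v55, lty = 2)  # mark the 55% level and its value
```

Plotting 1 minus the ECDF instead of the ECDF itself means the y-axis directly shows the probability of exceeding a given value, which matches the wording in the question.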
Upvotes: 5