Fitting distributions in R and Matlab give very different results

Question

I am fitting a wave height dataset (m) to a generalized pareto distribution. I already selected a threshold at 0.15m. I'm transitioning from Matlab to R so I wanted to compare the results from both programs. I knew that both programs wouldn't provide the same exact answer but I was expecting to get results somewhat close to each other.

Of course, for both fittings, I used the same data and the same threshold. Both methods estimate the parameters by MLE. In Matlab, the fitdist function does not estimate the threshold. It is assumed to be known (which in this case is 0.15m), and subtracted from data before calling it (which I already did). While in R the threshold is part of the input.

The functions I used are: Matlab: fitdist(data_above_threshold,'Generalized Pareto') R: fevd(data,threshold =0.15,type = 'GP',method = "MLE"). (extRemes Package)

In Matlab I get: Shape:-1.0301 Scale:0.7534 while in R I get Shape:-0.5848505 Scale:0.3620191

Here I attach the data so you can see the results for yourselves:

c(0.194101337780203, 0.289791483274648, 0.313940773547535, 0.184577674010614, 
0.102266008573448, 0.045464156804826, 0.0486387113946889, 0.143761972140945, 
0.267342847246331, 0.242966803074167, 0.069386693178437, 0.0099771715681416, 
0.00736950172646855, 0.056121590070795, 0.0759625562574393, 0.145009118586962, 
0.258045937376017, 0.379926158236833, 0.236844447793717, 0.0861664817248564, 
0.0503393656392586, 0.156233436601121, 0.118138781522764, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0442170103588082, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.0143988726040223, 
0.148183673176825, 0.197729400168618, 0.0143988726040223, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.026416829265647, 
0.133558046673528, 0.209180472082052, 0.236050809146251, 0.0976175536382913, 
0.0292512530065965, 0.101812500774896, 0.174260371593558, 0.219271020599832, 
0.463144839271102, 0.579809720448572, 0.533211794147367, 0.354756475417204, 
0.360085192050189, 0.632756755929504, 0.651577329569406, 0.413372358380034, 
0.39353139219339, 0.343985665201597, 0.0630375839987112, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 0.0486387113946889, 0.0434233717113424, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0566884748189849, 0.147730165378273, 0.175734271938852, 0.0827651732357175, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0385481628769098, 0.0378679011790819, 0.0463711724019298, 0.0200677200859207, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.0956901454944461, 
0.128909591738371, 0.141154302299272, 0.0459176646033779, 4.44089209850063e-16, 
0.0285709913087686, 0.251696828196291, 0.642847304447283, 0.731394702114536, 
0.603732256822183, 0.427997984883332, 0.635251048821539)

So, my questions are: If both use MLE to estimate the parameters, then why are the results so different? Then which fit is more reliable or which one do you recommend to use?

Fitting distributions in R and Matlab give very different results

Answers (1)

Related Questions