DonMP
DonMP

Reputation: 317

Calculating the P-value for goodness of fits using Schillings equation

A really nice recent paper in Nature Methods gives a new method for calculating the goodness of fit when the experimental error is not well estimated: http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3358.html

I am trying to incorporate that into my fitting routines in MATLAB but my inferior intellect is really struggling with coding the equation right. the equations are given in the figures below Equations for Gof What I want to test is the probability of observing a continuous patch (Rn) given that the largest observed patch was C.

I have made some feeble attempts to code this equation in MATLAB but the results are not impressive. I am struggling with determining An in the above equations because i cannot get the switch between the equations when n>C and n<=C right. The code i have is shown below UPDATE: I have updated the code, so it no longer evaluates to negative probabilities. But it is still no working correctly If started by creating a simple function which evaluates the sum for n

function [ an ] = smalSchill( N, C)
% Small function used to evaluate correlation map statistics

an = 0;
for j = 0:C
    an = an + 2^(N-1-j);
end

end

It is then called in my script

C = 10; % observered continoues patch of either positive or negative correlation
N = 100; % length(Q); % Number of observations
g = 1; % dummy variable used to index the P-values
Pv = ones(N,1); % Vector holding the p-values. P = 1 for all n <= C

for n = C+1:N
    An2 = 0;
    for j = 0:C

        if n-1-j <= C
            An2 = An2 + smalSchill(n-1-j,C);
            break
        else
            iter_count = n-1-j-C;
            An2 = An2 + iter_count*smalSchill(C,C); %iter_count            
        end
    end
    Pv2(n) = 1-An2/2^n;
end

The P-values decay as expected to begin with but as n grows, the P-values converge to 1 which is not correct - they should converge to 0.

cheers!

Upvotes: 1

Views: 253

Answers (1)

DonMP
DonMP

Reputation: 317

I over-complicated everything - and misunderstood the meaning of two of the variables. No wonder my code did work. But i got it to work today

N = 1000 % Length of path investigated
patchL = 1:1:20; % length of continious patch which you are interested in testing.
An = zeros(N+1,length(patchL)); % preallocate memory

for zz = 1:length(patchL)
    x = patchL(zz); 

    for n = 0:N

        if n <= x;
            An(n+1,zz) = 2^n;
        else
            An(n+1,zz) = sum(An(n-x:n,zz));
        end
    end
end

AnDist = An(end,:); % supposing last entry (20) is what you are interested in
Dist = AnDist - [0, AnDist(1:end-1)]; % go from cummulative dist to mdf
Dist = Dist./sum(Dist); % normalizing  the mdf  

Upvotes: 1

Related Questions