Reputation: 317
A really nice recent paper in Nature Methods gives a new method for calculating the goodness of fit when the experimental error is not well estimated: http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3358.html
I am trying to incorporate that into my fitting routines in MATLAB but my inferior intellect is really struggling with coding the equation right. the equations are given in the figures below
What I want to test is the probability of observing a continuous patch (Rn) given that the largest observed patch was C.
I have made some feeble attempts to code this equation in MATLAB but the results are not impressive. I am struggling with determining An in the above equations because i cannot get the switch between the equations when n>C and n<=C right. The code i have is shown below UPDATE: I have updated the code, so it no longer evaluates to negative probabilities. But it is still no working correctly If started by creating a simple function which evaluates the sum for n
function [ an ] = smalSchill( N, C)
% Small function used to evaluate correlation map statistics
an = 0;
for j = 0:C
an = an + 2^(N-1-j);
end
end
It is then called in my script
C = 10; % observered continoues patch of either positive or negative correlation
N = 100; % length(Q); % Number of observations
g = 1; % dummy variable used to index the P-values
Pv = ones(N,1); % Vector holding the p-values. P = 1 for all n <= C
for n = C+1:N
An2 = 0;
for j = 0:C
if n-1-j <= C
An2 = An2 + smalSchill(n-1-j,C);
break
else
iter_count = n-1-j-C;
An2 = An2 + iter_count*smalSchill(C,C); %iter_count
end
end
Pv2(n) = 1-An2/2^n;
end
The P-values decay as expected to begin with but as n grows, the P-values converge to 1 which is not correct - they should converge to 0.
cheers!
Upvotes: 1
Views: 253
Reputation: 317
I over-complicated everything - and misunderstood the meaning of two of the variables. No wonder my code did work. But i got it to work today
N = 1000 % Length of path investigated
patchL = 1:1:20; % length of continious patch which you are interested in testing.
An = zeros(N+1,length(patchL)); % preallocate memory
for zz = 1:length(patchL)
x = patchL(zz);
for n = 0:N
if n <= x;
An(n+1,zz) = 2^n;
else
An(n+1,zz) = sum(An(n-x:n,zz));
end
end
end
AnDist = An(end,:); % supposing last entry (20) is what you are interested in
Dist = AnDist - [0, AnDist(1:end-1)]; % go from cummulative dist to mdf
Dist = Dist./sum(Dist); % normalizing the mdf
Upvotes: 1