Marius
Marius

Reputation: 661

How to do a median projection of a large image stack in Matlab

I have a large stack of 800 16bit gray scale images with 2048x2048px. They are read from a single BigTIFF file and the whole stack barely fits into my RAM (8GB). Now I need do a median projection. That means I want to compute the median of each pixel across all 800 frames. The Matlab median function fails because there is not enough memory left make a copy of the whole array for the function call. What would be an efficient way to compute the median?

I have tried using a for loop to compute the median one pixel at a time, but this is still terribly slow.

Upvotes: 5

Views: 607

Answers (4)

Olivier
Olivier

Reputation: 921

If you have access to the Image Processing Toolbox, Matlab has a set of tool to handle large images called Blockproc

From the docs :

To avoid these problems, you can process large images incrementally: reading, processing, and finally writing the results back to disk, one region at a time. The blockproc function helps you with this process.

Upvotes: 1

Francesco Callari
Francesco Callari

Reputation: 11795

I bet that the distributions of the individual pixel values over the stack (i.e. the histograms of the pixel jets) are sparse.

If that's the case, the amount of memory needed to keep all the pixel histograms is much less than 2K x 2K x 64k: you can use a compact hash map to represent each histogram, and update them loading the images one at a time. When all updates are done, you go through your histograms and compute the median of each.

Upvotes: 1

Yvon
Yvon

Reputation: 2993

I will try my best to provide help (if any), because I don't have an 800-stack TIFF image, nor an 8GB computer, but I want to see if my thinkings can form a solution.

  • First, 800*2048*2048*8bit = 3.2GB, not including the headers. With your 8GB RAM it should not be too difficult to store it at once; there might be too many programs running and chopping up the contiguous memories. Anyway, let's treat the problem as Matlab can't load it as a whole into the memory.

  • As Jonas suggests, imread supports loading a TIFF image by index. It also supports a PixelRegion parameter, so you can also consider accessing parts of the image by this parameter if you want to utilize Shai's idea.

  • I came up with a median algo that doesn't use all the data at the same time; it barely scans through a sequence of un-ordered data, one at each time; but it does keep a memory of 256 counters.

_

data = randi([0,255], 1, 800);
bins = num2cell(zeros(256,1,'uint16'));
for ii = 1:800
    bins{data(ii)+1} = bins{data(ii)+1} + 1;
end
% clearvars data
s = cumsum(cell2mat(bins));
if find(s==400)
    med = ( find(s==400, 1, 'first') + ...
        find(s>400, 1, 'first') ) /2 - 1;
else
    med = find(s>400, 1, 'first') - 1;
end

_

It's not very efficient, at least because it uses a for loop. But the benefit is instead of keeping 800 raw data in memory, only 256 counters are kept; but the counters need uint16, so actually they are roughly equivalent to 512 raw data. But if you are confident that for any pixel the same grayscale level won't count for more than 255 times among the 800 samples, you can choose uint8, and hence reduce the memory by half.

The above code is for one pixel. I'm still thinking how to expand it to a 2048x2048 version, such as

for ii = 1:800
    img_data = randi([0,255], 2048, 2048);
    (do stats stuff)
end

By doing so, for each iteration, you only need these kept in memory: One frame of image; A set of counters; A few supplemental variables, with size comparable to one frame of image.

  • I use a cell array to store the counters. According to this post, a cell array can be pre-allocated while its elements can still be stored in memory non-contigously. That means the 256 counters (512*2048*2048 bytes) can be stored separately, which is quite reasonable for your 8GB RAM. But obviously my sample code does not make use of it since bins = num2cell(zeros(....

Upvotes: 0

Jonas
Jonas

Reputation: 74940

Iterating over blocks, as @Shai suggests, may be the most straightforward solution. If you do have this problem frequently, you may want to consider converting the image to a mat-file, so that you can access the pixels as n-d array directly from disk.

%# convert to mat file
matObj = matfile('dest.mat','w');
matObj.data(2048,2048,numSlices) = 0; 
for t = 1:numSlices
    matObj.data(:,:,t) = imread(tiffFile,'index',t);
end

%# load a block of the matfile to take median (run as part of a loop)
medianOfBlock = median(matObj.data(1:128,1:128,:),3);

Upvotes: 4

Related Questions