zeeMonkeez

Reputation: 5157

Histogram along one dimension in MATLAB

Assume I have an N×M matrix A. I would like to compute the histogram for each column of A. The naïve way would be to do something like this:

edges = 0:5:100;
counts = zeros(numel(edges) - 1, M);
for i_c = 1:M
  counts(:, i_c) = histcounts(A(:, i_c), edges);
end

Is there a better (faster) way to do this?

Edit: Add some performance tests

OK, let's do some testing. The first test uses histcounts in a loop, the second uses arrayfun with an indexing vector, the third is btmcnellis/randomGuy's solution with cellfun, and the last is obchardon's solution using histc. It seems that for long columns, histcounts is slightly more efficient. But for many short columns, histc wins by a wide margin!

niter = 10;

M = 100;
N = 10000;

A = rand(M, N);
edges = 0:.05:1;


counts1 = zeros(numel(edges) - 1, N);
counts2 = zeros(numel(edges) - 1, N);
counts3 = zeros(numel(edges) - 1, N);
counts4 = zeros(numel(edges), N);

tic;
for i_r = 1:niter
    for i_c = 1:N
        counts1(:, i_c) = histcounts(A(:, i_c), edges);
    end
end
toc

tic;
for i_r = 1:niter
    counts2 = cell2mat(arrayfun(@(ind) histcounts(A(:, ind), edges), 1:size(A, 2), 'UniformOutput', 0)')';
end
toc

tic;
for i_r = 1:niter
    Acell = num2cell(A, 1);
    counts3 = cell2mat(cellfun(@(column) histcounts(column, edges), Acell, 'UniformOutput', 0)')';
end
toc

tic;
for i_r = 1:niter
    counts4 = histc(A, edges, 1);
end
toc

all(counts1(:) == counts2(:))
all(counts1(:) == counts3(:))
counts4 = counts4(1:numel(edges)-1, :); % histc has an extra bin
all(counts1(:) == counts4(:))
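
As a small illustration of the "extra bin" mentioned in the comment above, the following sketch shows how histc and histcounts treat a value that lies exactly on the last edge. The data in x and the variable names are made up for this example.

```matlab
edges = 0:0.25:1;
x = [0.1; 0.3; 0.3; 1.0];            % last value sits exactly on edges(end)

cHistcounts = histcounts(x, edges);  % 1x4 row; last bin [0.75, 1] includes the right edge
cHistc = histc(x, edges);            % 5x1 column; last element counts x == edges(end)

% Fold the extra histc bin into the previous one to match histcounts
cFolded = cHistc(1:end-1);
cFolded(end) = cFolded(end) + cHistc(end);

disp(isequal(cFolded(:), cHistcounts(:)))
```

Since rand draws from the open interval (0, 1), no value ever equals edges(end) in the benchmark, which is why simply dropping the last histc row gives matching results there.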

Actual results, in the order the timings are printed (histcounts + loop, arrayfun, cellfun, histc):

niter = 100; 
M = 10000;
N = 100;

Elapsed time is 2.423785 seconds.
Elapsed time is 2.730303 seconds.
Elapsed time is 3.774217 seconds.
Elapsed time is 2.721766 seconds.

niter = 10;
M = 100;
N = 10000;

Elapsed time is 5.438335 seconds.
Elapsed time is 7.387587 seconds.
Elapsed time is 7.647818 seconds.
Elapsed time is 0.276491 seconds.
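
Timings from tic/toc over a handful of iterations can be noisy. One possible alternative is timeit, which calls a function handle repeatedly and returns a typical run time. The sketch below is not the benchmark used above: the matrix size is reduced and the handle names fHistc and fArrayfun are illustrative.

```matlab
A = rand(100, 1000);
edges = 0:.05:1;

% Wrap each variant in a zero-argument function handle
fHistc = @() histc(A, edges, 1);
fArrayfun = @() cell2mat(arrayfun(@(ind) histcounts(A(:, ind), edges), ...
    1:size(A, 2), 'UniformOutput', false)')';

tHistc = timeit(fHistc);
tArrayfun = timeit(fArrayfun);
fprintf('histc: %.4f s, arrayfun + histcounts: %.4f s\n', tHistc, tArrayfun);
```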

Upvotes: 1

Views: 748

Answers (3)

obchardon

Reputation: 10792

You can use histc:

x = [0:5:100];
y = histc(A,x, dim); 

where dim is the dimension along which to count.
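
To make the role of dim concrete, here is a minimal sketch on a made-up 3x2 matrix. The names yCols and yRows are illustrative only.

```matlab
A = [1 6; 2 7; 11 12];
x = 0:5:15;

yCols = histc(A, x, 1);   % 4x2: one histogram per column of A
yRows = histc(A, x, 2);   % 3x4: one histogram per row of A

disp(size(yCols))
disp(size(yRows))
```

The output has numel(x) entries along the chosen dimension, because the last histc bin counts only values equal to x(end).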

And then, to plot the counts for each column:

bar(x, y(:,1));
bar(x, y(:,2));
...

Upvotes: 1

btmcnellis

Reputation: 712

You could use num2cell and cellfun, although I don't know how this compares with the naive method performance-wise.

num2cell by default takes a matrix and converts it into a cell array where each cell contains one element of the matrix, but passing a second argument allows you to do so along a particular dimension. So for 2x3 matrix A, num2cell(A, 1) will return a 1x3 cell array where each cell contains a 2-by-1 column of A.
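
A quick sketch of this behaviour on a made-up 2x3 matrix:

```matlab
A = [1 2 3; 4 5 6];
C = num2cell(A, 1);   % split along dimension 1: one cell per column

disp(size(C))         % size of the cell array
disp(C{2})            % second column of A as a 2-by-1 vector
```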

cellfun applies a function to each cell of a cell array. So in this case you could take a cell array C output from num2cell as above and apply histcounts to each column of A. Because histcounts returns a vector rather than a scalar, 'UniformOutput' must be set to false:

counts = cellfun(@(column) histcounts(column, edges), C, 'UniformOutput', false);

counts should then be a 1x3 cell array, where the i-th cell contains the histcounts result for the i-th column of A.

(Note that the @() syntax above is an anonymous function.)
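
Since the question asks for a bins-by-columns matrix, the per-column results can be stacked afterwards. This is a minimal sketch, assuming every column is binned with the same edges so that all result vectors have equal length. The example data and the name countsCell are illustrative. cellfun is called with 'UniformOutput', false because histcounts returns a vector, not a scalar.

```matlab
A = [1 6 11; 2 7 12; 3 8 13; 4 9 14];   % made-up 4x3 example
edges = 0:5:15;

C = num2cell(A, 1);                      % 1x3 cell, one column of A per cell
countsCell = cellfun(@(column) histcounts(column, edges), C, ...
    'UniformOutput', false);             % 1x3 cell of 1x3 count vectors

% Stack the row vectors and transpose: (numel(edges)-1)-by-size(A,2)
counts = vertcat(countsCell{:})';
disp(counts)
```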

Upvotes: 0

Some Guy

Reputation: 1797

Split the array A into a cell array where each cell is a single column from the matrix:

Acell = [mat2cell(A',ones(1,M))]';

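Apply the function to each cell of the cell array Acell using cellfun. Because histcounts returns a vector, 'UniformOutput' must be set to false:

counts = cellfun(@(x) histcounts(x, edges), Acell, 'UniformOutput', false);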

counts will be a cell array with each cell containing histcounts from the respective column of A.
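
To see what the mat2cell expression produces, here is a small sketch on a made-up 2x3 matrix, with M taken as the number of columns of A as in the question:

```matlab
A = [1 2 3; 4 5 6];
M = size(A, 2);

% Split the rows of A' (i.e. the columns of A) into M cells, then transpose
Acell = mat2cell(A', ones(1, M))';

disp(size(Acell))   % size of the cell array
disp(Acell{2})      % second column of A, stored as a row vector
```

Each cell holds a column of A as a row vector. histcounts bins all elements of its input regardless of orientation, so this does not affect the counts.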

Upvotes: 0
