Reputation: 5157
Assume I have an N×M matrix A. I would like to compute the histogram for each column of A. The naïve way would be to do something like this:
edges = 0:5:100;
counts = zeros(numel(edges) - 1, M);
for i_c = 1:M
    counts(:, i_c) = histcounts(A(:, i_c), edges);
end
Is there a better (faster) way to do this?
Edit: Add some performance tests
OK, let's do some testing. First histcounts + loop, then an alternative using arrayfun and an indexing vector, then btmcnellis/randomGuy's solution with cellfun, finally obchardon's solution using histc.
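It seems that for long columns, the histcounts loop is slightly more efficient (2.4 s versus 2.7 s for histc in the first test below). But for many short columns, histc wins by a great margin (0.28 s versus 5.4 s in the second test)!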
niter = 10;
M = 100;      % number of rows here (i.e. the column length)
N = 10000;    % number of columns here
A = rand(M, N);
edges = 0:.05:1;
counts1 = zeros(numel(edges) - 1, N);
counts2 = zeros(numel(edges) - 1, N);
counts3 = zeros(numel(edges) - 1, N);
counts4 = zeros(numel(edges), N);
% 1) histcounts in a loop over the columns
tic;
for i_r = 1:niter
    for i_c = 1:N
        counts1(:, i_c) = histcounts(A(:, i_c), edges);
    end
end
toc
% 2) arrayfun over a vector of column indices
tic;
for i_r = 1:niter
    counts2 = cell2mat(arrayfun(@(ind) histcounts(A(:, ind), edges), 1:size(A, 2), 'UniformOutput', 0)')';
end
toc
% 3) num2cell + cellfun
tic;
for i_r = 1:niter
    Acell = num2cell(A, 1);
    counts3 = cell2mat(cellfun(@(column) histcounts(column, edges), Acell, 'UniformOutput', 0)')';
end
toc
% 4) histc along dimension 1
tic;
for i_r = 1:niter
    counts4 = histc(A, edges, 1);
end
toc
% Check that all four methods give the same counts
all(counts1(:) == counts2(:))
all(counts1(:) == counts3(:))
counts4 = counts4(1:numel(edges)-1, :); % histc has an extra bin
all(counts1(:) == counts4(:))
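A note on the extra histc bin: its last row counts values exactly equal to edges(end), whereas histcounts puts those values into the last regular bin. Dropping the row is safe for rand data, which never returns exactly 1, but for data that can hit the last edge the row would need to be folded in instead. A minimal sketch, using a small made-up matrix (the values are purely illustrative):

```matlab
% Hypothetical data chosen so that one value per column sits exactly on the last edge.
A = [0 0.2; 0.5 0.7; 1 1];
edges = [0 0.5 1];

% histc returns one row per edge; the last row counts values equal to edges(end).
c = histc(A, edges, 1);

% Fold the extra row into the last regular bin, then drop it.
c(end-1, :) = c(end-1, :) + c(end, :);
c(end, :) = [];

% Reference result: histcounts column by column.
ref = zeros(numel(edges) - 1, size(A, 2));
for k = 1:size(A, 2)
    ref(:, k) = histcounts(A(:, k), edges);
end

disp(isequal(c, ref))
```

With the simple truncation used in the benchmark, the two values equal to 1 would have been lost from the histc result.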
Actual tests:
niter = 100;
M = 10000;
N = 100;
Elapsed time is 2.423785 seconds.
Elapsed time is 2.730303 seconds.
Elapsed time is 3.774217 seconds.
Elapsed time is 2.721766 seconds.
niter = 10;
M = 100;
N = 10000;
Elapsed time is 5.438335 seconds.
Elapsed time is 7.387587 seconds.
Elapsed time is 7.647818 seconds.
Elapsed time is 0.276491 seconds.
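The numbers above come from single tic/toc runs, so they include warm-up and other noise. As a possible cross-check, here is a sketch that times two of the variants with timeit, which calls the function several times and returns a typical run time. The helper name loopHistcounts is made up for this example, and putting a local function at the end of a script assumes R2016b or later.

```matlab
% Same shape as the second test: many short columns.
M = 100;      % rows
N = 10000;    % columns
A = rand(M, N);
edges = 0:.05:1;

% timeit takes a zero-argument function handle.
tLoop  = timeit(@() loopHistcounts(A, edges));
tHistc = timeit(@() histc(A, edges, 1));

fprintf('histcounts loop: %.4f s, histc: %.4f s\n', tLoop, tHistc);

% Hypothetical helper wrapping the naive loop so it can be passed to timeit.
function counts = loopHistcounts(A, edges)
    counts = zeros(numel(edges) - 1, size(A, 2));
    for i_c = 1:size(A, 2)
        counts(:, i_c) = histcounts(A(:, i_c), edges);
    end
end
```

The absolute times will differ by machine and MATLAB release, so only the ratio between the two is worth comparing with the figures above.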
Upvotes: 1
Views: 748
Reputation: 10792
You can use histc:
x = 0:5:100;
y = histc(A, x, dim);
where dim is the dimension along which to count (dim = 1 counts down each column, which is what the question asks for).
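And then plot the counts for each column (bar rather than hist, since y already holds counts):
bar(x, y(:,1), 'histc');
bar(x, y(:,2), 'histc');
...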
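Since the MATLAB documentation marks histc as not recommended, a loop-free alternative may be worth trying. The sketch below is a different technique from the one above: it uses discretize to get a bin index for every element and accumarray to count (bin, column) pairs. The example data and variable names are assumptions, and I have not benchmarked it against the other methods.

```matlab
% Hypothetical example data: 1000 rows, 4 columns, values in [0, 100].
A = 100 * rand(1000, 4);
edges = 0:5:100;
nbins = numel(edges) - 1;

% Bin index of every element; NaN marks values outside the edges.
bin = discretize(A, edges);

% Column index of every element, same shape as A.
col = repmat(1:size(A, 2), size(A, 1), 1);

% Drop out-of-range elements, then count each (bin, column) pair.
valid = ~isnan(bin);
counts = accumarray([bin(valid), col(valid)], 1, [nbins, size(A, 2)]);
```

Like histcounts, discretize includes the right edge in the last bin by default, so counts has numel(edges) - 1 rows and no extra row to trim.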
Upvotes: 1
Reputation: 712
You could use num2cell and cellfun, although I don't know how this compares with the naive method performance-wise.
num2cell by default takes a matrix and converts it into a cell array where each cell contains one element of the matrix, but passing a second argument allows you to do so along a particular dimension. So for a 2x3 matrix A, num2cell(A, 1) will return a 1x3 cell array where each cell contains a 2-by-1 column of A.
cellfun applies a function to each element of a cell array. So in this case you could take a cell array C output from num2cell as above and apply histcounts to each column of A as follows:
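counts = cellfun(@(column) histcounts(column, edges), C, 'UniformOutput', false);
counts should then be a 3-element cell array, where the i-th cell contains the histcounts result for the i-th column of A. ('UniformOutput', false is needed because histcounts returns a vector for each column rather than a scalar.)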
(Note that the @() syntax above creates an anonymous function.)
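To make this concrete, here is a small end-to-end sketch. The 2x3 matrix and the edges are made-up values; the last step, which turns the cell array back into a plain matrix with one column of counts per column of A, is one possible choice among several.

```matlab
% Hypothetical 2x3 example: each column falls entirely into one bin.
A = [1 12 27; 3 14 22];
edges = 0:10:30;

% 1x3 cell array, each cell holding one 2x1 column of A.
C = num2cell(A, 1);

% One histcounts result per cell; non-scalar outputs need 'UniformOutput', false.
countsCell = cellfun(@(column) histcounts(column, edges), C, 'UniformOutput', false);

% Force each result into a column and join them into a (bins x columns) matrix.
countsCols = cellfun(@(c) c(:), countsCell, 'UniformOutput', false);
counts = cell2mat(countsCols);
```

Here counts(:, i) matches what the loop in the question would store for column i.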
Upvotes: 0
Reputation: 1797
Split the array A into a cell array where each cell is a single column from the matrix:
Acell = [mat2cell(A',ones(1,M))]';
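Apply the function to each cell of the cell array Acell using cellfun. Because histcounts returns a vector rather than a scalar, 'UniformOutput' has to be set to false:
counts = cellfun(@(x)histcounts(x,edges),Acell,'UniformOutput',false);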
counts will be a cell array with each cell containing histcounts from the respective column of A.
Upvotes: 0