Reputation: 43
There is a four-dimensional matrix data, along with corresponding X coordinates data_X and Y coordinates data_Y for each element. I want to perform a two-dimensional interval segmentation, or grid partitioning, based on edge_X and edge_Y, resulting in a new array data_class. The current loop implementation is very slow. How can I improve the computational speed?
clc;clear
rng("default")
data = randi([0, 100], 40,30,20, 10);
data_X = randi([0, 100], 40,30,20, 10);
data_Y = randi([0, 100], 40,30,20, 10);
% Define the edges for histograms in X and Y directions
edge_X = [0:10:100];
edge_Y = [0:20:100];
sz = size(data);
for ix = 1:sz(1)
for iy = 1:sz(2)
for ii = 1:sz(3)
for ij = 1:sz(4)
% Extract the current data, X, and Y values
g_data = data(ix, iy, ii, ij);
g_X = data_X(ix, iy, ii, ij);
g_Y = data_Y(ix, iy, ii, ij);
% Compute the 2D histogram count
g_id = histcounts2(g_X, g_Y, edge_X, edge_Y);
g_new = g_id * g_data;
data_class (ix,iy,ii,ij,:,:) = g_new;
end
end
end
end
Since each element of data is processed individually, g_data will only be noted as 1 in a bin, and then
g_new = g_id * g_data;
g_new is the position of g_data after reclassification
The current loop implementation is very slow. How can I improve the computational speed?
Upvotes: 1
Views: 39
Reputation: 30101
The first thing you should be doing is looking at the orange underlined warning in your editor about data_class
growing with each loop iteration. Initialising it before all of your loops with this saves a large portion of the runtime immediately
Pre-allocate data_class
before the loops:
szout = [sz, [numel(edge_X),numel(edge_Y)]-1];
data_class = NaN(szout);
But the main bottleneck is calling histcounts2
on every value individually, you can make large time savings by bringing it outside of the ii/ij
loops, and instead of using the first input as a count of 1
for the bin it's in, use the 4th and 5th outputs which store the bin each value should go into. In your case this is the row and column indices.
Here are some timings:
data_class
: 7.75sec (-62% total runtime)histcounts2
: 0.53sec (-97% total runtime)De-nest histcounts2
sz = size(data);
szout = [sz, [numel(edge_X),numel(edge_Y)]-1];
data_class = NaN(szout);
for ix = 1:sz(1)
for iy = 1:sz(2)
g_X = data_X(ix, iy, :, :);
g_Y = data_Y(ix, iy, :, :);
[~,~,~,bx,by] = histcounts2(g_X, g_Y, edge_X, edge_Y);
bx = squeeze(bx);
by = squeeze(by);
g_data = squeeze( data(ix, iy, :, :) );
for ii = 1:sz(3)
for ij = 1:sz(4)
g_new = zeros(szout(end-1:end));
g_new(bx(ii,ij),by(ii,ij)) = g_data(ii,ij);
data_class(ix,iy,ii,ij,:,:) = g_new;
end
end
end
end
Further, you don't need to initialise the array of zeros g_new
for every slice just to set one value. Instead, initialise data_class
as a matrix of zeros and just override the specific values you're interested in:
Only assign non-zero values
sz = size(data);
szout = [sz, [numel(edge_X),numel(edge_Y)]-1];
data_class = zeros(szout);
for ix = 1:sz(1)
for iy = 1:sz(2)
g_X = data_X(ix, iy, :, :);
g_Y = data_Y(ix, iy, :, :);
[~,~,~,bx,by] = histcounts2(g_X, g_Y, edge_X, edge_Y);
bx = squeeze(bx);
by = squeeze(by);
g_data = squeeze( data(ix, iy, :, :) );
for ii = 1:sz(3)
for ij = 1:sz(4)
data_class(ix,iy,ii,ij,bx(ii,ij),by(ii,ij)) = g_data(ii,ij);
end
end
end
end
You could do something with sub2ind
to get rid of the ii/ij
loops entirely, but in a quick test that didn't show any performance over just using the loops.
Upvotes: 1