Reputation: 8310
Given a matrix of M rows and N columns, and allocated as a byte array of M*N
elements (these elements are initially set to zero), I would modify this matrix in according to the following rule: the elements that are found in the neighborhood of a certain element must be set to a given value. In other words, given a matrix, I should set a region of the matrix: for this purpose I should access not contiguous portion of the array.
In order to perform the above operation, I have access to the following information:
L*L
of the neighborhood (L is always an odd number).The code that implements this operation should be executed as fast as possible in C++: for this reason I thought of using the above pointer to access different pieces of the array. Instead, the position (row and column) of the central element of the neighborhood could allow me to check whether the specified region exceeds the dimensions of the matrix (for example, the center of the region may be located on the edge of the matrix): in this case I should set only that part of the region that is located in the matrix.
int M = ... // number of matrix rows
int N = ... // number of matrix columns
char* centerPtr = ... // pointer to the center of the region
int i = ... // position of the central element
int j = ... // of the region to be modified
char* tempPtr = centerPtr - (N+1)*L/2;
for(int k=0; k < L; k++)
{
memset(tempPtr,value,N);
tempPtr += N;
}
How can I improve the code? How to handle the fact that one region may exceeds the dimensions of a matrix? How to make the code more efficient with respect to the execution time?
Upvotes: 0
Views: 235
Reputation: 46960
Your code is probably optimal for the general case where the region does not overlap the outside of the matrix. The main efficiency problem you can cause with this kind of code is to make the outer loop over columns instead of rows. This destroys cache and paging performance. You haven't done that.
Using pointers has little or no speed advantage with most modern compilers. Optimizers will come up with very good pointer code from normal array indices. In some cases I've seen array index code run substantially faster than hand-tweaked pointer code for the same thing. So don't use pointer arithmetic if index arithmetic is clearer.
There are 8 boundary cases: north, northwest, west, ..., northeast. Each of these will need a custom version of your loop to touch the right elements. I'll show the northwest case and let you work out the rest.
The fastest possible way to handle the cases is a 3-level "if" tree:
if (j < L/2) { // northwest, west, or southwest
if (i < L/2) {
// northwest
char* tempPtr = centerPtr - (L/2 - i) * N - (L/2 - j);
for(int k = 0; k < L; k++) {
memset(tempPtr, value, L - j);
tempPtr += N;
}
} else if (i >= M - L/2) {
// southwest
} else {
// west
}
} else if (j >= N - L/2) { // symmetrical cases for east.
if (i < L/2) {
// northeast
} else if (i >= M - L/2) {
// southeast
} else {
// east
}
} else {
if (i < L/2) {
// north
} else if (i >= M - L/2) {
// south
} else {
// no overlap
}
}
It's tedious to do it like this, but you'll have no more than 3 comparisons per region.
Upvotes: 1