David Doria

Reputation: 10273

Explicit prefetching of non-contiguous data

I do a lot of operations on sub-regions of images. For example, if I have a 100x100 image, I might want to iterate over it and process blocks of 10x10 pixels:

for(each 10x10 block)
{
  for(each pixel in the block)
  {
    do something
  }
}

The problem with this is that the small blocks are not contiguous chunks of memory: the image pixels are stored in row-major order, so when I access a 10x10 block, the pixels within each row of the block are contiguous, but the rows of the block are not contiguous with each other. Is there anything that can be done to speed up access to the pixels in these blocks? Or is it just impossible to get fast access to a region of a data structure like this?

From a lot of reading I did, it sounded like it might help to first read the pixels as the only operation in a loop:

// First read the pixels
vector<float> vals(numPixels);
for(pixels in first row)
{
  vals[i] = pixels[i];
}

// Then do the operations on the pixels
for(elements of vals)
{
  doSomething(vals[i]);
}

versus what I'm doing now, which is reading and operating in the same loop:

// Read and operate on the pixels
for(pixels in first row)
{
  doSomething(pixels[i]);
}

but I was unable to find any actual code examples (as opposed to theoretical explanations) of how to do this. Is there any truth to this?

Upvotes: 1

Views: 425

Answers (1)

MvG

Reputation: 60958

gcc has a builtin function called __builtin_prefetch. You can pass an address to that function, and on targets that support it, gcc will emit a machine instruction that requests the data at that address be loaded into cache before it is actually used.
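
As a rough sketch of how that could look for the 10x10 block case from the question: while one row of the block is being processed, the start of the next block row is prefetched. The function name sumBlock and its parameters are made up for illustration, the image is assumed to be a row-major std::vector<float>, and this only compiles with GCC or a compatible compiler such as Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums one blockSize x blockSize block of a row-major
// image, hinting the next block row into cache while the current row is
// being processed. Requires GCC/Clang for __builtin_prefetch.
float sumBlock(const std::vector<float>& image, std::size_t width,
               std::size_t blockX, std::size_t blockY, std::size_t blockSize)
{
    // The block must lie completely inside the image.
    assert(blockX + blockSize <= width);
    assert((blockY + blockSize) * width <= image.size());

    float sum = 0.0f;
    for (std::size_t row = 0; row < blockSize; ++row)
    {
        const float* current = &image[(blockY + row) * width + blockX];
        if (row + 1 < blockSize)
        {
            // Hint only. Arguments: address, rw (0 = read),
            // locality (0..3, 3 = keep in all cache levels).
            __builtin_prefetch(current + width, 0, 3);
        }
        for (std::size_t col = 0; col < blockSize; ++col)
        {
            sum += current[col];
        }
    }
    return sum;
}
```

The prefetch is purely a hint and does not change the result. Whether it helps at all depends on the hardware and the access pattern (hardware prefetchers often already pick up regular strides), so it is worth measuring with and without it.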

Many modern image-processing applications store images in tiles, as opposed to the rows (a.k.a. scanlines) you describe. GIMP, for example, does that. So if you have control over the way the image is stored, then using a tiled approach will likely increase locality and therefore reduce cache misses and improve performance.
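
To make the tiled idea concrete, here is a minimal sketch of one possible tiled layout. It is not how GIMP implements it; the TiledImage type, its members, and the assumption that the image dimensions are multiples of the tile size are all invented for this example. Each tile's pixels sit next to each other in memory, so processing one 10x10 block touches a single contiguous run of 100 floats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tiled image: the pixels of each tileSize x tileSize tile are
// stored contiguously, and the tiles themselves are laid out in row-major
// order. Assumes width and height are multiples of tileSize.
struct TiledImage
{
    std::size_t width, height, tileSize;
    std::vector<float> data;

    TiledImage(std::size_t w, std::size_t h, std::size_t t)
        : width(w), height(h), tileSize(t), data(w * h, 0.0f)
    {
        assert(t > 0 && w % t == 0 && h % t == 0);
    }

    // Maps image coordinates (x, y) to an offset in the tiled buffer.
    std::size_t index(std::size_t x, std::size_t y) const
    {
        const std::size_t tilesPerRow = width / tileSize;
        const std::size_t tileIndex = (y / tileSize) * tilesPerRow + (x / tileSize);
        const std::size_t inTile = (y % tileSize) * tileSize + (x % tileSize);
        return tileIndex * tileSize * tileSize + inTile;
    }

    float& at(std::size_t x, std::size_t y) { return data[index(x, y)]; }

    // Pointer to the tileSize * tileSize contiguous pixels of tile (tx, ty).
    float* tile(std::size_t tx, std::size_t ty)
    {
        return data.data() + (ty * (width / tileSize) + tx) * tileSize * tileSize;
    }
};
```

With this layout, the inner loop over a block becomes a plain linear walk over tile(tx, ty)[0] through tile(tx, ty)[tileSize * tileSize - 1]. The trade-off is that access by whole scanline is no longer contiguous.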

Upvotes: 1
