Reputation: 30467
Which algorithm should I use to scale down a 32-bit RGB image to a custom resolution? The algorithm should average pixels.
For example, if I have a 100x100 image and I want a new image of size 20x50: the average of the first five pixels of the first source row gives the first destination pixel, and the average of the first two pixels of the first source column gives the first pixel of the destination column.
Currently I first scale down in the X direction and then scale down in the Y direction, which requires one temporary buffer.
Is there a more optimized method that you know of?
Upvotes: 2
Views: 5990
Reputation: 189
This is what you are looking for, in C. It is Egon's approach implemented in C and optimized for speed. The alpha channel is ignored and set to 0, but this can easily be changed. Wrapping the two inner loops in a Duff's device is only for performance; the Duff's loops can be replaced by normal for-loops if desired.
Parameters: dst and src are pointers to the 32-bit pixel data, dst_pitch and src_pitch are the lengths of one scanline in bytes, src_width and src_height are the width and height of the source image in pixels, and factor_x and factor_y are the scaling denominators in the x and y directions.
Returns 0 on success and -1 on failure.
#define DUFFS_LOOP(pixel_copy_increment, width) \
{ int n = (width + 7) / 8;                      \
    switch (width & 7) {                        \
    case 0: do { pixel_copy_increment;          \
    case 7:      pixel_copy_increment;          \
    case 6:      pixel_copy_increment;          \
    case 5:      pixel_copy_increment;          \
    case 4:      pixel_copy_increment;          \
    case 3:      pixel_copy_increment;          \
    case 2:      pixel_copy_increment;          \
    case 1:      pixel_copy_increment;          \
        } while (--n > 0);                      \
    }                                           \
}

int fastscale(unsigned char *dst, int dst_pitch, unsigned char *src,
              int src_width, int src_height, int src_pitch,
              int factor_x, int factor_y)
{
    if (factor_x < 1 || factor_y < 1) return -1;
    int temp_r, temp_g, temp_b;
    int i1, i2;
    int dst_width = src_width / factor_x;
    int dst_height = src_height / factor_y;
    if (!dst_height || !dst_width) return -1;
    int factors_mul = factor_x * factor_y;               /* pixels averaged per destination pixel */
    int factorx_mul4 = factor_x << 2;                    /* width of one source block in bytes */
    int src_skip1 = src_pitch - factorx_mul4;            /* advance to the next source row inside a block */
    int src_skip2 = factorx_mul4 - factor_y * src_pitch; /* back up to the block's first row, next block in x */
    int src_skip3 = src_pitch * factor_y - dst_width * factorx_mul4; /* advance to the next block row */
    int dst_skip = dst_pitch - (dst_width << 2);         /* destination scanline padding */
    for (i1 = 0; i1 < dst_height; ++i1)
    {
        for (i2 = 0; i2 < dst_width; ++i2)
        {
            temp_r = temp_g = temp_b = 0;
            DUFFS_LOOP ({
                DUFFS_LOOP ({
                    src++;              /* alpha */
                    temp_r += *(src++);
                    temp_g += *(src++);
                    temp_b += *(src++);
                }, factor_x);
                src += src_skip1;
            }, factor_y);
            *(dst++) = 0;               /* alpha */
            *(dst++) = temp_r / factors_mul;
            *(dst++) = temp_g / factors_mul;
            *(dst++) = temp_b / factors_mul;
            src += src_skip2;
        }
        dst += dst_skip;
        src += src_skip3;
    }
    return 0;
}
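For reference, a minimal calling sketch, assuming tightly packed ARGB buffers (so pitch = width * 4); the sizes match the 100x100 to 20x50 example from the question, and the buffer names are made up for illustration:

    /* Shrink a tightly packed 100x100 32-bit image to 20x50 (factor 5 in x, 2 in y). */
    unsigned char src[100 * 100 * 4];   /* fill with your image data first */
    unsigned char dst[20 * 50 * 4];
    /* pitch = width * 4 because these buffers have no row padding */
    int ok = fastscale(dst, 20 * 4, src, 100, 100, 100 * 4, 5, 2);
    /* ok == 0 on success, -1 if a factor is < 1 or the destination would be empty */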
Upvotes: 0
Reputation: 86196
It really is a speed/quality trade-off.
First of all, you're correct that doing one dimension then the other is slower than it has to be. Way too many memory reads and writes.
Your big choice is whether to support fractional pixels or not. Your example is 100x100 to 20x50. So 10 pixels map to 1. What if you're going from 100x100 to 21x49? Are you willing to operate at source pixel boundaries, or do you want to pull fractional pixels in? What would you do for 100x100 to 99x99?
You have to tell us what you're willing to accept before we can say what's fastest.
And also tell us the possible extremes of the shrinkage. How many orders of magnitude might the difference between the source and destination be? At some point, sampling representative pixels inside the source area won't be much worse than averaging all the pixels. But you'll have to be careful in choosing representative pixels or you'll get aliasing with many common patterns.
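To make the fractional-pixel question concrete, here is a small illustration (mine, not part of the answer) of which source columns feed each destination column when scaling a 100-pixel row down to 21:

    #include <stdio.h>

    int main(void)
    {
        int src_w = 100, dst_w = 21;
        for (int x = 0; x < dst_w; ++x) {
            int start = (x * src_w) / dst_w;      /* first source column */
            int end = ((x + 1) * src_w) / dst_w;  /* one past the last whole column */
            printf("dst %2d <- src [%2d, %2d)\n", x, start, end);
        }
        /* The true footprint is 100/21, roughly 4.76 columns, so a pure box filter
           must either snap to these integer boundaries or weight the edge columns. */
        return 0;
    }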
Upvotes: 1
Reputation: 1543
After you do the standard C optimizations (pointer arithmetic, fixed-point math, etc.), there are also some more clever optimizations to be had. A (very) long time ago, I saw a scaler implementation that scaled in the X direction first. In the process of writing out the horizontally scaled image, it rotated the image 90 degrees in memory, so that when it came time to do the reads for the Y-direction scale, the data in memory would be better cache-aligned.
This technique depends heavily on the processor that it will run on.
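A rough sketch of that idea (my own illustration, assuming a single 8-bit channel and an integer shrink factor; the function name is made up): the horizontal pass writes its output transposed, so the vertical pass then reads sequential memory.

    /* tmp must hold dst_w * src_h bytes and is written transposed. */
    static void shrink_x_transposed(const unsigned char *src, int src_w, int src_h,
                                    unsigned char *tmp, int factor)
    {
        int dst_w = src_w / factor;
        for (int y = 0; y < src_h; ++y) {
            const unsigned char *row = src + y * src_w;
            for (int x = 0; x < dst_w; ++x) {
                int sum = 0;
                for (int k = 0; k < factor; ++k)
                    sum += row[x * factor + k];
                /* transposed store: pixel lands in "row" x, "column" y of tmp */
                tmp[x * src_h + y] = (unsigned char)(sum / factor);
            }
        }
        /* The y-direction pass can now walk tmp row by row (sequential memory)
           instead of striding down columns of the original layout. */
    }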
Upvotes: 3
Reputation: 1705
This averages the appropriate pixels.
w_ratio = src.w / dest.w
h_ratio = src.h / dest.h

dest[x,y] = AVG( src[x * w_ratio + xi, y * h_ratio + yi] )
    where xi in range (0, w_ratio - 1), step 1
          yi in range (0, h_ratio - 1), step 1
For boundary conditions do a separate loop (no if's in loop).
Here's more C-like code.
src and dest are bitmaps with:
  * property src[x,y] for a pixel
  * property src.w for the width
  * property src.h for the height
pixel has been defined so that:
  addition     p1 = p1 + p2   means   p1.r = p1.r + p2.r, p1.g = p1.g + p2.g, ...
  division     p1 = p1 / c    means   p1.r = p1.r / c, p1.g = p1.g / c, ...
  assigning 0  p1 = 0         means   p1.r = 0, p1.g = 0, ...
For simplicity's sake I won't consider the problem of a pixel component overflowing its integer type...
float w_ratio = (float)src.w / dest.w;   // cast so the ratio keeps its fractional part
float h_ratio = (float)src.h / dest.h;
int w_ratio_i = floor(w_ratio);
int h_ratio_i = floor(h_ratio);
int wxh = w_ratio_i * h_ratio_i;         // number of pixels actually summed per destination pixel

for (int y = 0; y < dest.h; y++)
    for (int x = 0; x < dest.w; x++) {
        pixel temp = 0;
        int srcx, srcy;
        // we have to use the floating-point values w_ratio, h_ratio here,
        // otherwise towards the end the mapping can drift a little;
        // this multiplication can be optimized similarly to Bresenham's line
        srcx = floor(x * w_ratio);
        srcy = floor(y * h_ratio);
        // here we use the floored counts, otherwise we might overrun the src bitmap
        for (int yi = 0; yi < h_ratio_i; yi++)
            for (int xi = 0; xi < w_ratio_i; xi++)
                temp += src[srcx + xi, srcy + yi];
        dest[x,y] = temp / wxh;
    }
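As a side note, the "optimized similarly to Bresenham's line" remark refers to replacing the per-pixel floating-point multiply and floor() with an incremental remainder accumulator. A sketch of that stepping (mine, with made-up names), shown for the x index only:

    /* srcx tracks floor(x * src_w / dst_w) without any floating point. */
    void map_columns(int src_w, int dst_w)
    {
        int step = src_w / dst_w;    /* integer part of the ratio */
        int rem  = src_w % dst_w;    /* fractional part carried as a remainder */
        int srcx = 0, err = 0;
        for (int x = 0; x < dst_w; x++) {
            /* ... average the source columns starting at srcx here ... */
            srcx += step;
            err  += rem;
            if (err >= dst_w) { err -= dst_w; srcx++; }  /* carry the accumulated fraction */
        }
    }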
Upvotes: 2
Reputation:
The term you are looking for is "resampling"; in your case, image resampling. You seem to already be doing linear interpolation, which should be the fastest. Here are ~6 base algorithms. If you really want to delve into the subject, look into "resampling kernels."
Upvotes: 6
Reputation: 308138
What you're doing is the optimized method. The only faster one is called nearest neighbor, where you simply grab the middle pixel of the range without trying to average any of them. The quality is significantly worse if there is any detail in the original image, although it might be acceptable if the original is simple.
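For comparison, a minimal nearest-neighbor sketch for one 8-bit channel (my own illustration, assuming integer shrink factors; it picks the centre pixel of each source block instead of averaging):

    static void nearest_shrink(const unsigned char *src, int src_w,
                               unsigned char *dst, int dst_w, int dst_h,
                               int factor_x, int factor_y)
    {
        for (int y = 0; y < dst_h; ++y) {
            /* middle row of the source block that maps to destination row y */
            const unsigned char *row = src + (y * factor_y + factor_y / 2) * src_w;
            for (int x = 0; x < dst_w; ++x)
                dst[y * dst_w + x] = row[x * factor_x + factor_x / 2];
        }
    }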
Upvotes: 0
Reputation: 52341
You forgot to mention the most important aspect of the question: how much you care about quality. If you don't care exactly how the values of the source pixels are smashed together to create the destination pixel, then the fastest method is (at least in almost all cases) the one that produces the worst quality.
If you're tempted to respond with "the fastest algorithm that still yields very good quality", you have essentially covered the entire field of algorithms that deals with image sampling/resizing.
And you already outlined your initial idea of the algorithm:
The average of the first five pixels of the first source row gives the first destination pixel,
Calculating the average value for each channel of the source pixels could be seen as trivial; are you looking for example code that does that?
Or are you looking for someone to challenge your initial draft of the algorithm with something even faster?
Upvotes: 2
Reputation: 37494
If you're looking for a wordy explanation, I've found this article to be helpful. If, on the other hand, you deal more in mathematical formulae, there is a method of fast image downscaling explained here.
Upvotes: 1