eold

Reputation: 6052

Safe integer middle value formula

I am looking for an efficient formula working in Java which calculates the following expression:

(low + high) / 2

which is used for binary search. So far, I have been using "low + (high - low) / 2" and "high - (high - low) / 2" to avoid overflow and underflow in some cases, but neither handles both. Now I am looking for an efficient way to do this that would work for any integers (assuming integers range from -MAX_INT - 1 to MAX_INT).
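To make the failure cases concrete, here is a minimal sketch of the three formulas mentioned above, assuming 32-bit Java `int` values. The class and method names are made up for illustration.

```java
// Sketch: where each common midpoint formula wraps around for 32-bit ints.
// Class and method names are hypothetical, chosen only for this illustration.
class MidpointOverflowDemo {
    static int naive(int low, int high)    { return (low + high) / 2; }
    static int fromLow(int low, int high)  { return low + (high - low) / 2; }
    static int fromHigh(int low, int high) { return high - (high - low) / 2; }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        int min = Integer.MIN_VALUE;

        // low + high wraps around when both values are large and positive.
        System.out.println(naive(max - 1, max));    // prints -1, not 2147483646

        // The subtraction forms are fine while high - low fits in an int ...
        System.out.println(fromLow(max - 1, max));  // prints 2147483646

        // ... but high - low itself wraps when the range spans the whole int domain.
        System.out.println(fromLow(min, max));      // prints -2147483648 (just low)
        System.out.println(fromHigh(min, max));     // prints 2147483647 (just high)
    }
}
```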

UPDATE: Combining the answers from Jander and Peter G. and experimenting a while I got the following formulas for middle value element and its immediate neighbors:

Lowest-midpoint (equal to floor((low + high)/2), e.g. [2 3] -> 2, [2 4] -> 3, [-3 -2] -> -3)

mid = (low & high) + ((low ^ high) >> 1);

Highest-midpoint (equal to ceil((low + high)/2), e.g. [2 3] -> 3, [2 4] -> 3, [-3 -2] -> -2)

low++;
mid = (low & high) + ((low ^ high) >> 1);

Before-midpoint (equal to floor((low + high - 1)/2), e.g. [2 3] -> 2, [2 4] -> 2, [-7 -3] -> -6)

high--;
mid = (low & high) + ((low ^ high) >> 1);

After-midpoint (equal to ceil((low + high + 1)/2), e.g. [2 3] -> 3, [2 4] -> 4, [-7 -3] -> -4)

mid = (low & high) + ((low ^ high) >> 1) + 1;

Or, without bitwise and (&) and xor (^), slightly slower code (x >> 1 can be replaced with floor(x / 2) to obtain formulas free of bitwise operators):

Lowest-midpoint

halfLow = (low >> 1), halfHigh = (high >> 1);
mid = halfLow + halfHigh + ((low-2*halfLow + high-2*halfHigh) >> 1);

Highest-midpoint

low++;
halfLow = (low >> 1), halfHigh = (high >> 1);
mid = halfLow + halfHigh + ((low-2*halfLow + high-2*halfHigh) >> 1);

Before-midpoint

high--;
halfLow = (low >> 1), halfHigh = (high >> 1);
mid = halfLow + halfHigh + ((low-2*halfLow + high-2*halfHigh) >> 1);

After-midpoint

halfLow = (low >> 1), halfHigh = (high >> 1);
mid = halfLow + halfHigh + ((low-2*halfLow + high-2*halfHigh) >> 1) + 1;

Note: the >> operator above is a signed (arithmetic) shift.
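The four bitwise variants above can be collected into one small Java sketch. The class and method names are hypothetical. The sketch also assumes low < Integer.MAX_VALUE for the highest-midpoint and high > Integer.MIN_VALUE for the before-midpoint, since low + 1 and high - 1 would otherwise wrap.

```java
// Sketch of the four midpoint variants; names are illustrative, not from the thread.
class Midpoints {
    // floor((low + high) / 2) without ever forming low + high.
    static int lowestMid(int low, int high) {
        return (low & high) + ((low ^ high) >> 1);
    }

    // ceil((low + high) / 2). Assumes low < Integer.MAX_VALUE.
    static int highestMid(int low, int high) {
        return lowestMid(low + 1, high);
    }

    // floor((low + high - 1) / 2). Assumes high > Integer.MIN_VALUE.
    static int beforeMid(int low, int high) {
        return lowestMid(low, high - 1);
    }

    // ceil((low + high + 1) / 2), i.e. the lowest midpoint plus one.
    static int afterMid(int low, int high) {
        return lowestMid(low, high) + 1;
    }

    public static void main(String[] args) {
        // The examples listed in the question.
        System.out.println(lowestMid(-3, -2));  // prints -3
        System.out.println(highestMid(-3, -2)); // prints -2
        System.out.println(beforeMid(-7, -3));  // prints -6
        System.out.println(afterMid(-7, -3));   // prints -4
    }
}
```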

Upvotes: 9

Views: 4473

Answers (4)

Jander

Reputation: 5667

From http://aggregate.org/MAGIC/#Average%20of%20Integers:

(low & high) + ((low ^ high) / 2)

is an overflow-proof average of two unsigned integers.

Now, this trick only works on unsigned integers. But because ((a+x) + (b+x))/2 = (a+b)/2 + x, you can fudge it as follows, if you have unsigned integers with the same bit size as your signed integers:

unsigned int u_low  = low + MAX_INT + 1;
unsigned int u_high = high + MAX_INT + 1;
unsigned int u_avg  = (u_low & u_high) + (u_low ^ u_high)/2;
int avg = u_avg - MAX_INT - 1;

UPDATE: On further thought, this will work even if you don't have unsigned integers. Signed and unsigned integers are equivalent over addition, subtraction, and bitwise operations. So all we need to worry about is making sure that the divide acts like an unsigned divide, which we can do by using a shift and masking out the uppermost bit.

low += MAX_INT + 1;
high += MAX_INT + 1;
avg = (low & high) + (((low ^ high) >> 1) & MAX_INT);
avg -= MAX_INT + 1;

(Note that if you're using Java, you can use an unsigned shift, ... >>> 1, instead of (... >> 1) & MAX_INT.)
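As a rough Java sketch of the biased approach: Integer.MIN_VALUE has the same bit pattern as MAX_INT + 1, so adding it shifts the signed range onto the unsigned one, and >>> supplies the unsigned divide by two. The class and method names here are assumptions made for the example.

```java
// Sketch of the bias-then-unsigned-average idea; names are hypothetical.
class BiasedAverage {
    static int average(int low, int high) {
        int bias = Integer.MIN_VALUE;   // same bits as MAX_INT + 1 after wraparound
        int uLow = low + bias;          // wraps; read as unsigned, order is preserved
        int uHigh = high + bias;
        // Unsigned average: >>> shifts in a zero bit, so the top bit is not sign-extended.
        int uAvg = (uLow & uHigh) + ((uLow ^ uHigh) >>> 1);
        return uAvg - bias;             // undo the bias
    }

    public static void main(String[] args) {
        System.out.println(average(2, 4));                                 // prints 3
        System.out.println(average(-3, -2));                               // prints -3
        System.out.println(average(Integer.MIN_VALUE, Integer.MAX_VALUE)); // prints -1
    }
}
```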

HOWEVER, there's an alternative I stumbled upon that's even simpler, and I haven't yet figured out how it works. There's no need to adjust the numbers by MAX_INT or use unsigned variables or anything. It's simply:

avg = (low & high) + ((low ^ high) >> 1);

Tested with all combinations of 16-bit signed integers low and high in the range -32768..32767, but not yet proven outright (by me anyway).
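One way to see why the simpler form might hold: in two's complement, low + high = 2*(low & high) + (low ^ high), because the AND collects the carry bits and the XOR collects the carry-free sum bits. The first term is even, so floor((low + high)/2) = (low & high) + floor((low ^ high)/2), and a signed >> 1 is exactly a floor division by 2. The sketch below is not a proof. It only compares the formula against a 64-bit reference on a handful of edge values, and the names are made up for illustration.

```java
// Sketch: spot-check the signed formula against 64-bit arithmetic.
// Class and method names are hypothetical.
class AverageCheck {
    static int floorAverage(int low, int high) {
        return (low & high) + ((low ^ high) >> 1);
    }

    // Reference computed in long arithmetic, where the sum of two ints cannot overflow.
    static int reference(int low, int high) {
        return (int) Math.floorDiv((long) low + (long) high, 2L);
    }

    // True if the formula matches the reference for every ordered pair from 'values'.
    static boolean agreesOn(int[] values) {
        for (int a : values) {
            for (int b : values) {
                if (floorAverage(a, b) != reference(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] edge = {
            Integer.MIN_VALUE, Integer.MIN_VALUE + 1, -3, -2, -1,
            0, 1, 2, 3, Integer.MAX_VALUE - 1, Integer.MAX_VALUE
        };
        System.out.println(agreesOn(edge)); // prints true
    }
}
```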

Upvotes: 9

Lorenzo Castelli

Reputation: 21

Assuming high >= low, a variant of your initial approach should also work, that is:

low + ((high - low) >>> 1)

where >>> is an unsigned shift (as in Java).

The idea is that high - low never overflows if the result is interpreted as an unsigned integer, so the unsigned shift correctly performs division by 2 and the formula computes the middle value.
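A small sketch of this formula in use, assuming low <= high on every call. The class name and the binary search wrapper are illustrative additions, not part of the answer above.

```java
// Sketch: midpoint via unsigned shift, plus a binary search using it.
// Names are hypothetical; mid() assumes low <= high.
class UnsignedShiftMid {
    static int mid(int low, int high) {
        // high - low may wrap as a signed int, but its bits are the correct
        // unsigned difference, so >>> 1 halves it correctly.
        return low + ((high - low) >>> 1);
    }

    // Returns an index of 'key' in the ascending array 'sorted', or -1 if absent.
    static int binarySearch(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int m = mid(low, high);
            if (sorted[m] < key) {
                low = m + 1;
            } else if (sorted[m] > key) {
                high = m - 1;
            } else {
                return m;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(mid(Integer.MIN_VALUE, Integer.MAX_VALUE)); // prints -1
        System.out.println(binarySearch(new int[] {1, 3, 5, 7}, 5));   // prints 2
    }
}
```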

Upvotes: 2

Peter G.

Reputation: 15144

int half_low = low/2;
int lsb_low = low - 2*half_low;
int half_high = high/2;
int lsb_high = high - 2*half_high;
int mean = half_low + half_high + (lsb_low + lsb_high)/2;

Upvotes: 1

Mormegil

Reputation: 8071

Note that none of your ideas work for low=-MAX_INT-1, high=MAX_INT. The best I could come up with is something like low/2 + high/2 + ((low & 1) + (high & 1))/2.

Upvotes: 0
