Dhiral Pandya
Dhiral Pandya

Reputation: 10619

Best way finding the middle point

In many algorithm i have seen people are using two different way to get middle point.

  1. (LOW+HIGH)/2
  2. LOW+(HIGH-LOW)/2

Mostly i have seen 2nd method for example in QuickSort. What is the best way to find middle between two number and why ?

Upvotes: 2

Views: 1757

Answers (4)

Lasse V. Karlsen
Lasse V. Karlsen

Reputation: 391366

It entirely depends on the context, but I'll make a case for case number 2 and explain why.

Let's first assume you go for case number 1, which is this:

(LOW + HIGH) / 2

This looks entirely reasonable, and mathematically, it is. Let's plug in two numbers and look at the results:

(12345 + 56789) / 2

The result is 34567. Looks OK doesn't it?

Now, the problem is that in computers it's not as simple as that. You also got something called data types to contend with. Usually this is denoted with such things as number of bits. In other words, you might have a 32-bit number, or a 16-bit number, or a 64-bit number, and so on.

All of these have what is known as a legal range of values, ie. "what kind of values will these types hold". An 8-bit number, unsigned (which means it cannot be negative) can hold 2 to the power of 8 different values, or 256. A 16-bit unsigned value can hold 65536 values, or a range from 0 to 65535. If the values are signed they go from -half to +half-1, meaning it will go from -128 to +127 for a 8-bit signed value, and from -32768 to +32767 for a signed 16-bit value.

So now we go back to the original formula. What if the data type we're using for the calculation isn't enough to hold LOW + HIGH ?

For instance, let's say we used 16-bit signed values for this, and we still got this expression:

(12345 + 56789) / 2

12345 can be held in a 16-bit value (it is less than 65536), same with 56789, but what about the result? The result of adding 12345 and 56789 is 69134 which is more than 65535 (the highest unsigned 16-bit value).

So what will happen to it? There's two outcomes:

  1. It will overflow, which means it will start back at 0 and count upwards, which means it will actually end up with the result of 3598 (123456 + 56789) - 65536
  2. It will throw an exception or similar, crashing the program with an overflow problem.

If we get the first result, then (12345 + 56789)/2 becomes 3598/2 or 1799. Clearly incorrect.

So, then what if we used the other approach:

12345 + (56789-12345)/2

First, let's do the parenthesis: 56789-12345 is equal to 44444, a number that can be held in a 16-bit data type.

Adding 12345 + 44444 gives us 56789, a number that can also be held in a 16-bit data type.

Dividing 56789 by 2 gives us 28934.5. Since we're probably dealing with "integers" here we get 28934 (typically, unless your particular world rounds up).

So the reason the second expression is chosen above the first is that it doesn't have to handle overflow the same way and is more resilient to this kind of problem.

In fact, if you think about it, the maximum second value you can have is the maximum legal value you can have for your data type, so this kind of expression:

X + (Y-X)

... assuming both X and Y are the same data type, can at most be the maximum of that data type. Basically, it will not have to contend with an overflow at all.

Upvotes: 6

Matt Messersmith
Matt Messersmith

Reputation: 13747

The best way is dependent on what you're trying to accomplish. The first is clearly faster (best for performance), while the second one is used to avoid overflow (best for correctness). So the answer to your question is dependent on your definition of "best".

Upvotes: 1

user555045
user555045

Reputation: 64904

There isn't a best one, but they are clearly different.

  1. (LOW+HIGH)/2 causes trouble if the addition overflows, both in the signed and unsigned case. Doesn't have to be an issue, if you can assume that it never happens it's a perfectly reasonable thing to do.
  2. LOW+(HIGH-LOW)/2 can still cause trouble with overflows if LOW is allowed to be arbitrarily negative. For non-negative input it's fine though. Costs an extra operation.
  3. (LOW+HIGH) >>> 1 as seen in Java. For non-negative but signed input, overflow is non-destructive. The sign bit is used as an extra bit of space that the addition can carry into, the shift then divides the unsigned result by two. Does not work at all if the result is supposed to negative, because by construction the result is non-negative. For array indexes that isn't really a consideration. Doesn't help if you were working with unsigned indexes though.
  4. MOV MID, LOW \ ADD MID, HIGH \ RCR MID in pseudo x86 asm, explicitly uses an extra bit of space and therefore works for all unsigned input no matter what, but cannot be used in most languages.

They all have their upsides and downsides. There is no winner.

Upvotes: 2

olegarch
olegarch

Reputation: 3891

2nd method used, for avoid int overflow during computation. Imagine, you used just 1-byte unsigned integer, so overflow happening if value reached 256. Imagine, we have low=100 and high=200. See computing:

1. (lo + hi) / 2 = (100 + 200) / 2 = 300 / 2; // 300 > 256, int overflow
2. lo + (hi - lo) / 2 = 100 + (200 - 100) / 2 = 150; // no overflow

Upvotes: 2

Related Questions