bacar

Reputation: 10091

Why is BigDecimal.equals specified to compare both value and scale individually?

This is not a question about how to compare two BigDecimal objects - I know that you can use compareTo instead of equals to do that, since equals is documented as:

Unlike compareTo, this method considers two BigDecimal objects equal only if they are equal in value and scale (thus 2.0 is not equal to 2.00 when compared by this method).
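
To make the quoted behaviour concrete, here is a quick jshell-style check (jshell imports java.math.* by default):

var a = new BigDecimal("2.0");
var b = new BigDecimal("2.00");

a.compareTo(b)
==> 0

a.equals(b)
==> false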

The question is: why has the equals method been specified in this seemingly counter-intuitive manner? That is, why is it important to be able to distinguish between 2.0 and 2.00?

It seems likely that there must be a reason for this, since the Comparable documentation, which specifies the compareTo method, states:

It is strongly recommended (though not required) that natural orderings be consistent with equals

I imagine there must be a good reason for ignoring this recommendation.

Upvotes: 63

Views: 23468

Answers (7)

Stuart Marks

Reputation: 132390

The general rule for equals is that two equal values should be substitutable for one another. That is, if performing a computation using one value gives some result, substituting an equals value into the same computation should give a result that equals the first result. This applies to objects that are values, such as String, Integer, BigDecimal, etc.

Now consider BigDecimal values 2.0 and 2.00. We know they are numerically equal, and that compareTo on them returns 0. But equals returns false. Why?

Here's an example where they are not substitutable:

var a = new BigDecimal("2.0");
var b = new BigDecimal("2.00");
var three = new BigDecimal(3);

a.divide(three, RoundingMode.HALF_UP)
==> 0.7

b.divide(three, RoundingMode.HALF_UP)
==> 0.67

The results are clearly unequal, so the value of a is not substitutable for b. Therefore, a.equals(b) should be false. (The difference arises because divide with just a rounding mode returns a quotient at the dividend's scale: a has scale 1, so its quotient is rounded to one decimal place, while b has scale 2.)

Upvotes: 46

Aleksander Blomskøld

Reputation: 18552

In math, 10.0 equals 10.00. In physics, 10.0 m and 10.00 m are arguably different (different precision), and when talking about objects in OOP, I would definitely say that they are not equal.

It's also easy to think of unexpected behaviour if equals ignored the scale (for instance: if a.equals(b), wouldn't you expect a.add(0.1).equals(b.add(0.1))?).
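
As a rough illustration of how the scale difference carries through a computation (a minimal jshell-style sketch; BigDecimal.add takes a BigDecimal, so 0.1 is written as a string literal here):

var a = new BigDecimal("2.0");
var b = new BigDecimal("2.00");
var tenth = new BigDecimal("0.1");

a.add(tenth)
==> 2.1

b.add(tenth)
==> 2.10

a.add(tenth).equals(b.add(tenth))
==> false

The scale difference propagates: add keeps the larger of the two scales, so the results stay distinguishable under the actual scale-sensitive equals.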

Upvotes: 6

supercat

Reputation: 81169

The compareTo method knows that trailing zeros do not affect the numeric value represented by a BigDecimal, which is the only aspect compareTo cares about. By contrast, the equals method generally has no way of knowing what aspects of an object someone cares about, and should thus only return true if two objects are equivalent in every way that a programmer might be interested in. If x.equals(y) is true, it would be rather surprising for x.toString().equals(y.toString()) to yield false.
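
To make the toString point concrete, a small jshell-style sketch:

var x = new BigDecimal("2.0");
var y = new BigDecimal("2.00");

x.compareTo(y)
==> 0

x.toString().equals(y.toString())
==> false    // "2.0" vs "2.00"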

Another issue which is perhaps even more significant is that BigDecimal essentially combines a BigInteger and a scaling factor, such that if two numbers represent the same value but have different numbers of trailing zeroes, one will hold a BigInteger whose value is some power of ten times the other's. If equality requires that the mantissa (unscaled value) and scale both match, then the hashCode() for BigDecimal can use the hash code of the BigInteger. If it's possible for two values to be considered "equal" even though they contain different BigInteger values, however, that will complicate things significantly.

A BigDecimal type which used its own backing storage, rather than a BigInteger, could be implemented in a variety of ways to allow numbers to be quickly hashed in such a way that values representing the same number would compare equal (as a simple example, a version which packed nine decimal digits into each long value and always required that the decimal point sit between groups of nine could compute the hash code in a way that ignores trailing groups whose value is zero), but a BigDecimal that encapsulates a BigInteger can't do that.
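
A jshell-style look at the internals supports this: unscaledValue() exposes the backing BigInteger, and the two representations really do differ by a factor of ten:

var p = new BigDecimal("2.0");
var q = new BigDecimal("2.00");

p.unscaledValue()
==> 20

q.unscaledValue()
==> 200

p.scale()
==> 1

q.scale()
==> 2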

Upvotes: 2

supercat

Reputation: 81169

A point which has not yet been considered in any of the other answers is that equals is required to be consistent with hashCode, and the cost of a hashCode implementation which was required to yield the same value for 123.0 as for 123.00 (but still do a reasonable job of distinguishing different values) would be much greater than that of a hashCode implementation which was not required to do so. Under the present semantics, hashCode requires a multiply-by-31 and add for each 32 bits of stored value. If hashCode were required to be consistent among values with different precision, it would either have to compute the normalized form of any value (expensive) or else, at minimum, do something like compute the base-999999999 digital root of the value and multiply that, mod 999999999, based upon the precision. The inner loop of such a method would be:

temp = (temp + (mag[i] & LONG_MASK) * scale_factor[i]) % 999999999;

replacing a multiply-by-31 with a 64-bit modulus operation--much more expensive. If one wants a hash table which regards numerically-equivalent BigDecimal values as equivalent, and most keys which are sought in the table will be found, the efficient way to achieve the desired result would be to use a hash table which stores value wrappers, rather than storing values directly. To find a value in the table, start by looking for the value itself. If none is found, normalize the value and look for that. If nothing is found, create an empty wrapper and store an entry under the original and normalized forms of the number.

Looking for something which isn't in the table and hasn't been searched for previously would require an expensive normalization step, but looking for something that has been searched for would be much faster. By contrast, if hashCode needed to return equivalent values for numbers which, because of differing precision, were stored totally differently, that would make all hash table operations much slower.
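
A minimal sketch of that wrapper-table idea, using stripTrailingZeros() as the normalization step (the NumericKeyMap and ValueWrapper names here are hypothetical):

import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class ValueWrapper<V> {
    V value;   // the payload shared by every spelling of the same number
}

final class NumericKeyMap<V> {
    private final Map<BigDecimal, ValueWrapper<V>> table = new HashMap<>();

    ValueWrapper<V> wrapperFor(BigDecimal key) {
        ValueWrapper<V> w = table.get(key);                   // fast path: this exact form was seen before
        if (w == null) {
            BigDecimal normalized = key.stripTrailingZeros(); // expensive step, paid once per distinct form
            w = table.get(normalized);
            if (w == null) {
                w = new ValueWrapper<>();
                table.put(normalized, w);                     // entry under the canonical form
            }
            table.put(key, w);                                // and under the form we were actually given
        }
        return w;
    }
}

After the first lookup of a given spelling, both 2.0 and 2.00 map straight to the same shared wrapper without being normalized again.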

Upvotes: 12

Matt R

Reputation: 10503

I imagine there must be a good reason for ignoring this recommendation.

Maybe not. I propose the simple explanation that the designers of BigDecimal just made a bad design choice.

  1. A good design optimises for the common use case. The majority of the time (>95%), people want to compare two quantities based on mathematical equality. For the minority of the time where you really do care about the two numbers being equal in both scale and value, there could have been an additional method for that purpose.
  2. It goes against people's expectations, and creates a trap that's very easy to fall into (one such trap with hash-based collections is sketched after this list). A good API obeys the "principle of least surprise".
  3. It breaks the usual Java convention that Comparable is consistent with equality.
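
For instance, the trap mentioned in point 2 shows up as soon as BigDecimal values are used as keys in hash-based collections (jshell-style):

var prices = new HashSet<BigDecimal>();
prices.add(new BigDecimal("2.0"));

prices.contains(new BigDecimal("2.0"))
==> true

prices.contains(new BigDecimal("2.00"))
==> false    // numerically equal, but a different scale means different equals/hashCode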

Interestingly, Scala's BigDecimal class (which is implemented using Java's BigDecimal under the hood) has made the opposite choice:

BigDecimal("2.0") == BigDecimal("2.00")     // true

Upvotes: 3

assylias

Reputation: 328618

If numbers get rounded, the scale shows the precision of the calculation - in other words:

  • 10.0 could mean that the exact number was between 9.95 and 10.05
  • 10.00 could mean that the exact number was between 9.995 and 10.005

That is, the scale is tied to the arithmetic precision.
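
A short jshell-style illustration of that reading of scale, assuming HALF_UP rounding:

new BigDecimal("9.97").setScale(1, RoundingMode.HALF_UP)
==> 10.0     // any exact value in [9.95, 10.05) rounds to 10.0 at scale 1

new BigDecimal("9.997").setScale(2, RoundingMode.HALF_UP)
==> 10.00    // any exact value in [9.995, 10.005) rounds to 10.00 at scale 2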

Upvotes: 5

Oliver Charlesworth

Reputation: 272517

Because in some situations, an indication of precision (i.e. the margin of error) may be important.

For example, if you're storing measurements made by two physical sensors, perhaps one is 10x more precise than the other. It may be important to represent this fact.

Upvotes: 41
