Scanningcrew

Reputation: 790

Using the right numeric data type

After becoming more engaged with training new engineers, and after reading Jon Skeet's DevDays presentation, I have begun to recognize that many engineers aren't clear on when to use which numeric data type. I appreciate the role a formal computer science degree plays in helping with this, but I see a lot of new engineers showing uncertainty because they have never worked with large data sets, financial software, physics or statistics problems, or complex datastore issues.

My experience is that people really grok concepts when they are explained within context. I am looking for good examples of real programming problems where certain data is best represented using a particular data type. Try to stay away from the textbook examples if possible. I am tagging this with Java, but feel free to give examples in other languages and retag:

Integer, Long, Double, Float, BigInteger, etc...

Upvotes: 10

Views: 7592

Answers (5)

Tiny Time

Reputation: 1

VInts in Lucene are the devil. The small benefit in size is hugely outweighed by the performance penalty of reading them byte by byte.

A good thing to talk about is the space versus time trade-off. Saving 200 MB was great in 1996, but in 2010, thrashing I/O buffers by reading a byte at a time is terrible.
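
To make the trade-off concrete, here is a minimal sketch of a variable-length integer encoding in the style of a VInt: seven payload bits per byte, with the high bit flagging that another byte follows. The class and method names (VarIntSketch, encode, decode) are hypothetical and this is not Lucene's actual API. It only illustrates why small values take fewer bytes, and why decoding becomes a loop with a branch per byte rather than a single fixed-width read.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of a VInt-style encoding; not Lucene's API.
class VarIntSketch {

    // Encodes a non-negative int using 7 payload bits per byte.
    // The high bit of each byte is set when more bytes follow.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;
        }
        out.write(value); // final byte, high bit clear
        return out.toByteArray();
    }

    // Decodes one value: one loop iteration (and one branch) per byte read.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(100).length);        // 1 byte instead of 4
        System.out.println(encode(300).length);        // 2 bytes instead of 4
        System.out.println(decode(encode(1_000_000))); // 1000000
    }
}
```

Whether the saved bytes are worth the extra decoding work depends on the data and the hardware, which is exactly the trade-off worth discussing with new engineers.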

Upvotes: 1

Kevin Bourrillion

Reputation: 40851

I really don't think you need examples or anything complex. This is simple:

  • Is it a whole number?
    • Can it be > 2^63? BigInteger
    • Can it be > 2^31? long
    • Otherwise int
  • Is it a decimal number?
    • Is an approximate value ok?
      • double
    • Does it need to be exact? (example: monetary amounts!)
      • BigDecimal

(When I say ">", I mean "greater in absolute value", of course.)

I've never used a byte or char to represent a number, and I've never used a short, period. That's in 12 years of Java programming. Float? Meh. If you have a huge array and you are having memory problems, I guess.

Note that BigDecimal is somewhat misnamed; your values do not have to be large at all to need it.
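
For anyone who wants to see each branch of that checklist in action, here is a small sketch. The class and method names are hypothetical and chosen only for illustration; the behavior shown (int and long wrap-around, the inexact double sum, the exact BigDecimal sum) is standard Java.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo class; the names are illustrative only.
class NumericChoiceSketch {

    // int silently wraps around once a value passes 2^31 - 1.
    static int wrappedIntSum() {
        return Integer.MAX_VALUE + 1;
    }

    // long has room for the same sum.
    static long longSum() {
        return (long) Integer.MAX_VALUE + 1;
    }

    // Past 2^63 - 1 even long would wrap, so BigInteger is needed.
    static BigInteger bigSum() {
        return BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
    }

    // double stores binary approximations of 0.10 and 0.20.
    static double approximateTotal() {
        return 0.10 + 0.20;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static BigDecimal exactTotal() {
        return new BigDecimal("0.10").add(new BigDecimal("0.20"));
    }

    public static void main(String[] args) {
        System.out.println(wrappedIntSum());    // -2147483648
        System.out.println(longSum());          // 2147483648
        System.out.println(bigSum());           // 9223372036854775808
        System.out.println(approximateTotal()); // 0.30000000000000004
        System.out.println(exactTotal());       // 0.30
    }
}
```

Note that the monetary values here are tiny, yet only the BigDecimal version gives the exact answer.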

Upvotes: 28

Bugmaster

Reputation: 1080

One important point you might want to articulate is that it's almost always an error to compare floating-point numbers for equality. For example, the following code is very likely to fail:

double euros = convertToEuros(item.getCostInDollars());
if (euros == 10.0) {
  // this line will most likely never be reached
}

This is one of many reasons why you want to use discrete numbers to represent currency.

When you absolutely must compare floating-point numbers, you can only do so approximately; something to the effect of:

double euros = convertToEuros(item.getCostInDollars());
if (Math.abs(euros - 10.0) < EPSILON) {
  // this might work
}
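
The snippets above leave EPSILON and convertToEuros undefined, so here is a self-contained sketch of the same idea. The tolerance of 1e-9 is an assumption picked for illustration; a suitable value depends on the magnitude of the numbers involved, and the class and method names are hypothetical.

```java
// Hypothetical helper; EPSILON's value is an assumption chosen for illustration.
class ApproxEqualsSketch {

    static final double EPSILON = 1e-9;

    // True when a and b differ by less than the absolute tolerance.
    static boolean approxEquals(double a, double b) {
        return Math.abs(a - b) < EPSILON;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);             // false
        System.out.println(approxEquals(sum, 0.3)); // true
    }
}
```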

As for practical examples, my usual rule of thumb is something like this:

  • double: think long and hard before using it; is the pain worth it?
  • float: don't use it
  • byte: most often used as byte[] to represent some raw binary data
  • int: this is your best friend; use it to represent most stuff
  • long: use this for timestamps and database IDs
  • BigDecimal and BigInteger: if you know about these, chances are you know what you're doing already, so you don't need my advice

I realize that these aren't terribly scientific rules of thumb, but if your target audience are not computer scientists, it might be best to stick to basics.
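
Two of those rules of thumb can be sketched in a few lines: money held as a whole number of cents, and a duration in milliseconds that needs a long. The class and method names are hypothetical, and storing currency as cents in a long is just one common way to use "discrete numbers"; BigDecimal is another.

```java
// Hypothetical examples for the rules of thumb; names are illustrative.
class RuleOfThumbSketch {

    // Money as a whole number of cents: addition and equality are exact.
    static long totalCents(long... itemCents) {
        long total = 0;
        for (long cents : itemCents) {
            total = Math.addExact(total, cents); // throws on overflow instead of wrapping
        }
        return total;
    }

    // Milliseconds in the given number of days, computed in long arithmetic.
    static long millisInDays(int days) {
        return days * 24L * 60 * 60 * 1000;
    }

    public static void main(String[] args) {
        System.out.println(totalCents(1999, 501) == 2500);        // true
        System.out.println(millisInDays(30));                     // 2592000000
        System.out.println(millisInDays(30) > Integer.MAX_VALUE); // true
        System.out.println((int) millisInDays(30) < 0);           // true: the value wrapped
    }
}
```

Thirty days of milliseconds already overflows an int, which is why timestamps are a natural fit for long.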

Upvotes: 4

kar

Reputation: 974

Assuming machine-independent sizes (the same on 32-bit and 64-bit platforms, as in Java), the usual numeric data type sizes are:

integer: 4 bytes

long: 8 bytes

float: 4 bytes

double: 8 bytes

For signed values the size stays the same, but the maximum positive value is halved (e.g. for 4 bytes: unsigned ≈ 4 billion, signed ≈ 2 billion).

BigInt: size depends on the language implementation (Java's BigInteger is arbitrary-precision and grows as needed).

For high-volume data archiving (such as a search engine), I would highly recommend byte and short to save space.

byte: 1 byte (0 to 255 unsigned, -128 to 127 signed)

short: 2 bytes (0 to 65,535 unsigned, -32,768 to 32,767 signed)

Let's say you want to store a record of a person's AGE. Since nobody lives past 150, one BYTE (0 to 255 unsigned) is enough. If you use an INTEGER instead, you waste 3 extra bytes per record, and seriously, nobody is going to live for billions of years.
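
In Java specifically, these sizes and ranges can be checked directly. The sketch below is illustrative only, with hypothetical class and method names. One caveat it shows: Java's byte is signed, so an age of 150 only fits if the stored byte is read back as unsigned. The storage estimate ignores array header overhead.

```java
// Hypothetical demo of primitive sizes and ranges in Java.
class PrimitiveSizeSketch {

    // Payload bytes needed to store one value per record in a primitive array
    // (array header overhead ignored).
    static long payloadBytes(long records, int bytesPerValue) {
        return records * bytesPerValue;
    }

    // Reads a stored byte back as a value in the range 0..255.
    static int unsignedValue(byte stored) {
        return Byte.toUnsignedInt(stored);
    }

    public static void main(String[] args) {
        System.out.println(Byte.BYTES + ": " + Byte.MIN_VALUE + ".." + Byte.MAX_VALUE);    // 1: -128..127
        System.out.println(Short.BYTES + ": " + Short.MIN_VALUE + ".." + Short.MAX_VALUE); // 2: -32768..32767
        System.out.println(Integer.BYTES + " " + Long.BYTES);  // 4 8
        System.out.println(Float.BYTES + " " + Double.BYTES);  // 4 8

        byte age = (byte) 150;                   // stored as -106 in a signed byte
        System.out.println(age);                 // -106
        System.out.println(unsignedValue(age));  // 150

        System.out.println(payloadBytes(1_000_000, Byte.BYTES));    // 1000000
        System.out.println(payloadBytes(1_000_000, Integer.BYTES)); // 4000000
    }
}
```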

Upvotes: 1

Kaleb Brasee

Reputation: 51945

BigDecimal is the best choice when you need exact decimal calculations and want to specify the desired precision. I believe float (and to some extent double) offer performance benefits over BigDecimal, but at the cost of accuracy and usability.
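
As a rough illustration of specifying the accuracy, BigDecimal lets you pass either a scale (digits after the decimal point) or a MathContext (significant digits) together with a rounding mode. The class and helper names below are hypothetical; the divide overloads, MathContext, and RoundingMode are part of java.math.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// Hypothetical helpers showing two ways to control BigDecimal precision.
class BigDecimalPrecisionSketch {

    // Divides and rounds to a fixed number of decimal places.
    static BigDecimal divideToScale(String a, String b, int scale) {
        return new BigDecimal(a).divide(new BigDecimal(b), scale, RoundingMode.HALF_UP);
    }

    // Divides keeping a fixed number of significant digits.
    static BigDecimal divideToPrecision(String a, String b, int digits) {
        MathContext context = new MathContext(digits, RoundingMode.HALF_UP);
        return new BigDecimal(a).divide(new BigDecimal(b), context);
    }

    public static void main(String[] args) {
        System.out.println(divideToScale("10", "3", 4));     // 3.3333
        System.out.println(divideToPrecision("10", "3", 3)); // 3.33
        try {
            // Without a scale or MathContext, a non-terminating result is rejected.
            new BigDecimal("10").divide(new BigDecimal("3"));
        } catch (ArithmeticException e) {
            System.out.println("exact division failed: " + e.getMessage());
        }
    }
}
```

The usability cost mentioned above shows up here too: every operation is a method call, and division forces you to decide how to round.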

Upvotes: 4
