Alex Gaynor

Reputation: 15019

How to handle arbitrarily large integers

I'm working on a programming language, and today I got to the point where I could compile the (recursive) factorial function. However, due to the maximum size of an integer, the largest I can compute is factorial(12). What are some techniques for handling arbitrarily large integers? The language currently works by translating code to C++.

Upvotes: 14

Views: 22701

Answers (7)

Bill K

Reputation: 62769

If you're building unlimited-size decimal math into a language (for learning purposes, I'd guess), then with today's gigantic memory space you can just use a byte array where each byte holds a single digit (0-9). You then write your own routines to add, subtract, multiply, and divide your byte arrays.

If you're implementing it yourself, the algorithms you use could simply echo the way you'd do the math as a human. For addition just start at the right side and add each position to make a new digit and deal with the carry, etc.

I can give you some Java-like pseudocode but can't really do C++ from scratch at this point:

import java.util.Arrays;

class BigAssNumber {
    // Decimal digits, least significant first: value[0] is the ones digit.
    // The bracketed examples in these comments are written most significant digit first.
    private final byte[] value;

    // This constructor can handle numbers where
    // overflows have occurred. For instance:
    // add [5][7] and [8] to come up with [5][15],
    // then let this constructor change that to [6][5].
    public BigAssNumber(byte[] value) {
        this.value = normalize(value);
    }

    // Adds two numbers and returns the sum. Originals are not changed.
    public BigAssNumber add(BigAssNumber other) {
        byte[] longer = value.length >= other.value.length ? value : other.value;
        byte[] shorter = (longer == value) ? other.value : value;

        // Byte-by-byte copy into newly allocated space, not a reference copy.
        byte[] dest = Arrays.copyOf(longer, longer.length);

        // Just add each pair of digits, like in a pencil-and-paper addition problem.
        for (int i = 0; i < shorter.length; i++)
            dest[i] += shorter[i];

        // The constructor will fix overflows.
        return new BigAssNumber(dest);
    }

    // Fix things that might have overflowed: 0,17,22 will turn into 1,9,2.
    private static byte[] normalize(byte[] value) {
        // If the most significant digit is not zero, extend the array by one
        // zero digit so a final carry has somewhere to go. One spare digit is
        // enough for the sum of two normalized numbers.
        boolean needsRoom = value.length == 0 || value[value.length - 1] != 0;
        byte[] digits = Arrays.copyOf(value, value.length + (needsRoom ? 1 : 0));

        // Simple cheap adjust. Could lose the inner loop easily if it mattered.
        for (int i = 0; i < digits.length - 1; i++) {
            while (digits[i] > 9) {
                digits[i] -= 10;
                digits[i + 1] += 1;
            }
        }
        return digits;
    }
}

I use the fact that we have a lot of extra room in a byte to help deal with addition overflows in a generic way. This can work for subtraction too: your bytes can be signed, so that [4][3] - [7] = [4][-4], which normalizes to [3][6].
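
Since the asker's compiler targets C++, here is a minimal C++ sketch of that subtraction idea. It is only an illustration under stated assumptions: the `Digits` alias and the `subtract` helper are made-up names, digits are stored least significant first, and the first operand is assumed to be at least as large as the second (no sign handling).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decimal digits, least significant first: {3, 4} means 43.
// (Hypothetical alias, chosen for this sketch.)
using Digits = std::vector<std::int8_t>;

// Computes a - b, assuming a >= b >= 0 and both inputs hold digits 0-9.
Digits subtract(const Digits& a, const Digits& b) {
    Digits result = a;  // copy, so the operands are left untouched
    if (result.size() < b.size()) result.resize(b.size(), 0);

    // Digit-wise subtraction; individual digits may go negative here.
    for (std::size_t i = 0; i < b.size(); ++i) {
        result[i] = static_cast<std::int8_t>(result[i] - b[i]);
    }

    // Normalize: borrow 10 from the next digit wherever a digit is negative.
    for (std::size_t i = 0; i + 1 < result.size(); ++i) {
        if (result[i] < 0) {
            result[i] = static_cast<std::int8_t>(result[i] + 10);
            result[i + 1] = static_cast<std::int8_t>(result[i + 1] - 1);
        }
    }

    // Drop leading zeros, keeping at least one digit.
    while (result.size() > 1 && result.back() == 0) result.pop_back();
    return result;
}
```

Calling `subtract(Digits{3, 4}, Digits{7})` walks through the [4][3] - [7] case above: the intermediate state is [4][-4], and the borrow pass turns it into [3][6].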

I don't deal with negative BigAssNumbers here, but you could store a sign flag in the class. You could also store a decimal point location, but at that point you're totally replicating BCD-style libraries that would be much more performant.

Upvotes: 4

artificialidiot

Reputation: 5369

If I were to implement my own language and wanted to support arbitrary-length numbers, I would use a target language with the carry/borrow concept. But since no HLL implements this without severe performance implications (like exceptions), I would certainly implement it in assembly. It would probably take a single instruction (such as JC on x86) to check for overflow and another (such as ADC on x86) to handle it, which is an acceptable compromise for a language implementing arbitrary precision. I would then use a few functions written in assembly instead of the regular operators; if you can use operator overloading for more elegant output, even better. But I don't expect generated C++ to be maintainable (or meant to be maintained) as a target language.

Or, just use a library which has more bells and whistles than you need and use it for all your numbers.

As a hybrid approach, detect overflow in assembly and call the library function on overflow instead of rolling your own mini library.
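
If hand-written assembly is more than you want, one possible way to get at the overflow flag from generated C++ is a compiler builtin. The sketch below assumes GCC or Clang, since `__builtin_mul_overflow` is an extension of those compilers and not standard C++. The function name `checked_factorial` is made up for illustration, and the bignum fallback is only marked with a comment.

```cpp
#include <cstdint>

// Computes n! in 32-bit signed arithmetic.
// Returns true and stores the result in `out` if it fits;
// returns false as soon as a multiplication overflows.
bool checked_factorial(std::int32_t n, std::int32_t& out) {
    std::int32_t result = 1;
    for (std::int32_t i = 2; i <= n; ++i) {
        // GCC/Clang extension: performs the multiply and reports overflow.
        if (__builtin_mul_overflow(result, i, &result)) {
            return false;  // a real runtime would call its bignum routine here
        }
    }
    out = result;
    return true;
}
```

With 32-bit integers this succeeds up to `checked_factorial(12, out)` and reports overflow at 13, which matches the limit described in the question.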

Upvotes: 0

John D. Cook

Reputation: 30089

If you want to roll your own arbitrary precision library, see Knuth's Seminumerical Algorithms, volume 2 of his magnum opus, The Art of Computer Programming.

Upvotes: 5

Alex Gaynor

Reputation: 15019

My preferred approach would be to use my current int type for 32-bit ints (or maybe change it internally to be a long long or some such, so long as it can continue to use the same algorithms), then, when it overflows, have it switch to storing a bignum, whether of my own creation or from an external library. However, I feel like I'd need to check for overflow on every single arithmetic operation, which is roughly 2x overhead on arithmetic ops. How could I solve that?
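
One portable way to express that per-operation check is to do the 32-bit arithmetic in a 64-bit intermediate and test the range afterwards. This is a rough sketch, not a measured answer to the overhead question: the `SmallResult` struct and the `add_small` / `mul_small` names are hypothetical, and the caller is assumed to redo the operation with a bignum when `fits` is false.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical result type: `fits` is false when the caller should
// repeat the operation using its bignum representation.
struct SmallResult {
    bool fits;
    std::int32_t value;
};

// Narrows a 64-bit intermediate back to 32 bits if it is in range.
SmallResult narrow(std::int64_t wide) {
    bool fits = wide >= std::numeric_limits<std::int32_t>::min() &&
                wide <= std::numeric_limits<std::int32_t>::max();
    return {fits, fits ? static_cast<std::int32_t>(wide) : 0};
}

// The sum of two 32-bit values always fits in 64 bits.
SmallResult add_small(std::int32_t a, std::int32_t b) {
    return narrow(static_cast<std::int64_t>(a) + b);
}

// The product of two 32-bit values always fits in 64 bits.
SmallResult mul_small(std::int32_t a, std::int32_t b) {
    return narrow(static_cast<std::int64_t>(a) * b);
}
```

Here `mul_small(39916800, 12)` fits and yields 12!, while `mul_small(479001600, 13)` reports that it does not fit, which is the point where the runtime would promote to a bignum.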

Upvotes: 0

Clayton

Reputation: 918

Other posters have given links to libraries that will do this for you, but it seems like you're trying to build this into your language. My first thought is: are you sure you need to do that? Most languages would use an add-on library, as others have suggested.

Assuming you're writing a compiler and you do need this feature, you could implement integer arithmetic functions for arbitrarily large values in assembly.

For example, a simple (but non-optimal) implementation would represent the numbers as Binary Coded Decimal. The arithmetic functions could use the same algorithms as you'd use if you were doing the math with pencil and paper.
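
As a rough illustration of the pencil-and-paper approach, here is a C++ sketch that multiplies a decimal digit array by a small factor and uses it to compute factorials. It stores one digit per byte (an unpacked form of BCD) with the least significant digit first; the names `Digits`, `multiply_by`, and `factorial_digits` are made up for this example, and it is written in C++ instead of assembly purely for readability.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One decimal digit per element, least significant first.
using Digits = std::vector<std::uint8_t>;

// Multiplies the number in place by a small factor, digit by digit with carry.
void multiply_by(Digits& digits, std::uint32_t factor) {
    std::uint64_t carry = 0;
    for (std::uint8_t& d : digits) {
        std::uint64_t product = static_cast<std::uint64_t>(d) * factor + carry;
        d = static_cast<std::uint8_t>(product % 10);
        carry = product / 10;
    }
    // Append whatever carry is left as new high-order digits.
    while (carry > 0) {
        digits.push_back(static_cast<std::uint8_t>(carry % 10));
        carry /= 10;
    }
}

// Returns n! as a decimal string, most significant digit first.
std::string factorial_digits(std::uint32_t n) {
    Digits digits{1};
    for (std::uint32_t i = 2; i <= n; ++i) multiply_by(digits, i);

    std::string text;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        text.push_back(static_cast<char>('0' + *it));
    }
    return text;
}
```

For instance, `factorial_digits(25)` returns "15511210043330985984000000", well past what a 32-bit or 64-bit integer can hold.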

Also, consider using a specialized data type for these large integers. That way, "normal" integers can use the standard 32-bit arithmetic.

Upvotes: 0

Adam Rosenfield

Reputation: 400324

There's no easy way to do it in C++. You'll have to use an external library such as the GNU Multiple Precision Arithmetic Library (GMP), or use a different language that natively supports arbitrarily large integers, such as Python.

Upvotes: 1

Barry Kelly

Reputation: 42152

If you need more than 32 bits, you could consider using 64-bit integers (long long), or use or write an arbitrary precision math library, e.g. GNU MP.
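
To get a feel for how far 64 bits takes the factorial example, here is a small C++ sketch that finds the largest argument whose factorial still fits in an unsigned 64-bit integer. The function name is made up for illustration, and the overflow test uses a plain division check, so it does not rely on any compiler extension.

```cpp
#include <cstdint>
#include <limits>

// Returns the largest n for which n! fits in an unsigned 64-bit integer.
std::uint32_t largest_factorial_argument() {
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t result = 1;  // holds n! for the current n
    std::uint32_t n = 1;
    // result * (n + 1) would overflow exactly when result > max / (n + 1).
    while (result <= max / (n + 1)) {
        ++n;
        result *= n;
    }
    return n;
}
```

This returns 20, so a 64-bit integer moves the limit from factorial(12) to factorial(20) but does not remove it.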

Upvotes: 19
