Reality

Reputation: 613

How to split big numbers?

I have a big number that I need to split into smaller numbers in Python. I wrote the following code to convert between the two forms:


def split_number(num, part_size):
    string = str(num)
    string_size = len(string)

    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

def join_number(arr):
    num = ""
    for x in arr:
        num += str(x)
    return int(num)

But the number comes back different. The number is so large that this is hard to debug, so before digging in I thought I would post it here to see whether there is a better way to do it, or whether I'm missing something obvious.

Thanks a lot.

Upvotes: 0

Views: 5203

Answers (4)

user97370

Reputation:

Here's some code for Alex Martelli's answer.

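def digits(n, base):
    # Yield the base-`base` digits of n, least significant first.
    # Assumes n >= 0; yields nothing when n == 0.
    while n:
        yield n % base
        n //= base

def split_number(n, part_size):
    # Parts come out least significant first, e.g. 1000005 -> [5, 0, 1].
    base = 10 ** part_size
    return list(digits(n, base))

def join_number(parts, part_size):
    # Expects parts in the same least-significant-first order.
    base = 10 ** part_size
    return sum(d * (base ** i) for i, d in enumerate(parts))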
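
As a quick check, here is a round trip through the arithmetic approach above. The functions are repeated so the snippet runs on its own; note that the parts are stored least significant first, which is the opposite order from the string-based versions in this thread.

```python
def digits(n, base):
    # Yield the base-`base` digits of a non-negative n, least significant first.
    while n:
        yield n % base
        n //= base

def split_number(n, part_size):
    return list(digits(n, 10 ** part_size))

def join_number(parts, part_size):
    base = 10 ** part_size
    return sum(d * (base ** i) for i, d in enumerate(parts))

parts = split_number(1000005, 3)
print(parts)                  # [5, 0, 1]
print(join_number(parts, 3))  # 1000005
```

Zero splits into an empty list here, and joining an empty list gives 0 back, so that edge case also round-trips.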

Upvotes: 0

John La Rooy

Reputation: 304215

There is no need to convert to and from strings, which can be very time-consuming for really large numbers.

>>> def split_number(n, part_size):
...     base = 10**part_size
...     L = []
...     while n:
...         n,part = divmod(n,base)
...         L.append(part)
...     return L[::-1]
... 
>>> def join_number(L, part_size):
...     base = 10**part_size
...     n = 0
...     L = L[::-1]
...     while L:
...         n = n*base+L.pop()
...     return n
... 
>>> print(split_number(1000005,3))
[1, 0, 5]
>>> print(join_number([1,0,5],3))
1000005
>>> 

Here you can see that just converting the number to a str takes longer than my entire function!

>>> from time import time
>>> t=time();b = split_number(2**100000,3000);print(time()-t)
0.204252004623
>>> t=time();b = split_number(2**100000,30);print(time()-t)
0.486856222153
>>> t=time();b = str(2**100000);print(time()-t)
0.730905056
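
Single time() measurements like these are noisy, so a repeated measurement may be more reliable. The following is only a sketch using the standard timeit module; the smaller test value 2**10000 and the part size 300 are my own choices (newer Python 3 releases cap the length of int-to-str conversions by default, so the original 2**100000 may raise an error there), and the timings will differ by machine and Python version.

```python
import timeit

def split_number(n, part_size):
    # Arithmetic split, most significant part first (same logic as above).
    base = 10 ** part_size
    parts = []
    while n:
        n, part = divmod(n, base)
        parts.append(part)
    return parts[::-1]

N = 2 ** 10000  # about 3011 decimal digits; illustrative size only

# Run each operation 100 times and report the total elapsed seconds.
split_time = timeit.timeit(lambda: split_number(N, 300), number=100)
str_time = timeit.timeit(lambda: str(N), number=100)

print("split_number: %.6f s" % split_time)
print("str:          %.6f s" % str_time)
```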

Upvotes: 2

paxdiablo

Reputation: 881563

Consider how your code splits the following number into chunks of three digits:

1000005 -> 100 000 5

You have two problems. The first is that if you put those integers back together, you'll get:

100 0 5 -> 10005

(i.e., the middle one is 0, not 000) which is not what you started with. The second problem is that you're not sure what size the last part should be.
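
The first problem can be reproduced with the functions from the question, repeated here unchanged in behavior so the snippet runs on its own:

```python
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)

    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        # int("000") is 0, so the zero padding of a chunk is lost here.
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

def join_number(arr):
    num = ""
    for x in arr:
        num += str(x)  # str(0) is "0", not "000"
    return int(num)

parts = split_number(1000005, 3)
print(parts)               # [100, 0, 5]
print(join_number(parts))  # 10005, not 1000005
```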

I would ensure that you're first using a string whose length is an exact multiple of the part size so you know exactly how big each part should be:

def split_number (num, part_size):
    string = str(num)
    string_size = len(string)
    while string_size % part_size != 0:
        string = "0%s"%(string)
        string_size = string_size + 1

    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

Secondly, make sure that you put the parts back together with the right length for each part (ensuring you don't put leading zeros on the first part of course):

def join_number(arr, part_size):
    fmt_str = "%%s%%0%dd"%(part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str%(num,int(x))
    return int(num)

Tying it all together, the following complete program:

#!/usr/bin/python

def split_number (num, part_size):
    string = str(num)
    string_size = len(string)
    while string_size % part_size != 0:
        string = "0%s"%(string)
        string_size = string_size + 1

    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

def join_number(arr, part_size):
    fmt_str = "%%s%%0%dd"%(part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str%(num,int(x))
    return int(num)

x = 1000005
print(x)
y = split_number(x,3)
print(y)
z = join_number(y,3)
print(z)

produces the output:

1000005
[1, 0, 5]
1000005

which shows that it goes back together.

Just keep in mind I haven't done Python for a few years. There's almost certainly a more "Pythonic" way to do it with those new-fangled lambdas and things (or whatever Python calls them) but, since your code was of the basic form, I just answered with the minimal changes required to get it working. Oh yeah, and be wary of negative numbers :-)
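
For what it's worth, a more compact variant of the same string-based idea might look like the sketch below. It is not the code above, just one possible rewrite using str.zfill and a list comprehension, and rejecting negative numbers with a ValueError is my own assumption about how to handle them.

```python
def split_number(num, part_size):
    if num < 0:
        # Assumption: negative input is treated as an error in this sketch.
        raise ValueError("negative numbers are not supported")
    string = str(num)
    # Round the length up to the next multiple of part_size, then left-pad.
    padded_len = -(-len(string) // part_size) * part_size
    string = string.zfill(padded_len)
    return [int(string[i:i + part_size])
            for i in range(0, len(string), part_size)]

def join_number(arr, part_size):
    # Pad every part back to part_size digits; int() drops the leading
    # zeros that this adds to the first part.
    return int("".join(str(x).zfill(part_size) for x in arr))

y = split_number(1000005, 3)
print(y)                  # [1, 0, 5]
print(join_number(y, 3))  # 1000005
```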

Upvotes: 1

Alex Martelli

Reputation: 881735

Clearly, any leading 0s in the "parts" can't be preserved by this operation. Can't join_number also receive the part_size argument, so that it can reconstruct the string formats with all the leading zeros?

Without some information such as part_size that's known to both the sender and receiver, or the equivalent (such as the base number to use for a similar split and join based on arithmetic, roughly equivalent to 10**part_size given the way you're using part_size), the task becomes quite a bit harder. If the receiver is initially clueless about this, why not just place the part_size (or base, etc) as the very first int in the arr list that's being sent and received? That way, the encoding trivially becomes "self-sufficient", i.e., doesn't need any supplementary parameter known to both sender and receiver.
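
A minimal sketch of that self-describing idea, assuming an arithmetic split with base 10**part_size and non-negative input. The names encode and decode are hypothetical, chosen only for this illustration:

```python
def encode(num, part_size):
    # Split num into base-10**part_size parts, most significant first,
    # and store part_size itself as the first element of the list.
    base = 10 ** part_size
    parts = []
    while num:
        num, part = divmod(num, base)
        parts.append(part)
    return [part_size] + parts[::-1]

def decode(arr):
    # The first element says how the remaining parts were produced.
    part_size, parts = arr[0], arr[1:]
    base = 10 ** part_size
    num = 0
    for part in parts:
        num = num * base + part
    return num

arr = encode(1000005, 3)
print(arr)          # [3, 1, 0, 5]
print(decode(arr))  # 1000005
```

The receiver only needs the list itself; no separate part_size argument has to be agreed on in advance.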

Upvotes: 2
