Reputation: 613
I have a big number that I need to split into smaller numbers in Python. I wrote the following code to convert between the two:
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

def join_number(arr):
    num = ""
    for x in arr:
        num += str(x)
    return int(num)
But the number comes back different. It's hard to debug because the number is so large, so before digging into that I thought I would post it here to see whether there is a better way to do it or whether I'm missing something obvious.
Thanks a lot.
Upvotes: 0
Views: 5203
Reputation:
Here's some code for Alex Martelli's answer.
def digits(n, base):
    # Yield the digits of n in the given base, least significant first.
    while n:
        yield n % base
        n //= base

def split_number(n, part_size):
    base = 10 ** part_size
    return list(digits(n, base))

def join_number(digits, part_size):
    base = 10 ** part_size
    return sum(d * (base ** i) for i, d in enumerate(digits))
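A quick usage check of the functions above, repeated here so the snippet runs on its own (the join_number parameter is renamed to parts purely to avoid shadowing the digits generator). Note that digits() yields the least significant chunk first, so the list comes back in the reverse order from the other answers. The values in the comments assume Python 3.

```python
def digits(n, base):
    # Yield the digits of n in the given base, least significant first.
    while n:
        yield n % base
        n //= base

def split_number(n, part_size):
    return list(digits(n, 10 ** part_size))

def join_number(parts, part_size):
    base = 10 ** part_size
    return sum(d * base ** i for i, d in enumerate(parts))

parts = split_number(1000005, 3)
print(parts)                  # [5, 0, 1]
print(join_number(parts, 3))  # 1000005
```

Because both functions agree on the order, the round trip still works; just reverse the list if you need the most significant chunk first.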
Upvotes: 0
Reputation: 304215
There is no need to convert to and from strings, which can be very time-consuming for really large numbers:
>>> def split_number(n, part_size):
...     base = 10**part_size
...     L = []
...     while n:
...         n, part = divmod(n, base)
...         L.append(part)
...     return L[::-1]
...
>>> def join_number(L, part_size):
...     base = 10**part_size
...     n = 0
...     L = L[::-1]
...     while L:
...         n = n*base + L.pop()
...     return n
...
>>> print(split_number(1000005, 3))
[1, 0, 5]
>>> print(join_number([1, 0, 5], 3))
1000005
Here you can see that just converting the number to a str takes longer than my entire function!
>>> from time import time
>>> t = time(); b = split_number(2**100000, 3000); print(time() - t)
0.204252004623
>>> t = time(); b = split_number(2**100000, 30); print(time() - t)
0.486856222153
>>> t = time(); b = str(2**100000); print(time() - t)
0.730905056
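For anyone repeating this measurement today, here is a rough sketch using timeit, assuming Python 3. The helper name time_once is made up for illustration, and the numbers depend on the machine and interpreter, so none are shown; the ranking may well differ on a current Python. One caveat: recent Python 3 releases refuse to convert ints longer than 4300 digits to str by default, so the sketch lifts that limit where the setting exists.

```python
import sys
import timeit

# Recent Python 3 releases cap int <-> str conversion at 4300 digits by
# default; passing 0 removes the cap. Older versions lack this function.
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)

def split_number(n, part_size):
    # Arithmetic split, same idea as the answer above.
    base = 10 ** part_size
    parts = []
    while n:
        n, part = divmod(n, base)
        parts.append(part)
    return parts[::-1]

def time_once(func):
    # Hypothetical helper: wall-clock seconds for a single call.
    return timeit.timeit(func, number=1)

big = 2 ** 100000
print("split, part_size=3000:", time_once(lambda: split_number(big, 3000)))
print("split, part_size=30:  ", time_once(lambda: split_number(big, 30)))
print("str():                ", time_once(lambda: str(big)))
```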
Upvotes: 2
Reputation: 881563
Consider the following number split into chunks of size 3:
1000005 -> 100 000 5
You have two problems. The first is that if you put those integers back together, you'll get:
100 0 5 -> 10005
(i.e., the middle one is 0, not 000), which is not what you started with. The second problem is that you're not sure what size the last part should be.
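To see the first problem concretely, here is the question's pair of functions run on that number. The _original suffix on the names is added only for this illustration; the logic is as posted.

```python
def split_number_original(num, part_size):
    string = str(num)
    string_size = len(string)
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))  # int("000") collapses to 0
        pointer += part_size
    return arr

def join_number_original(arr):
    num = ""
    for x in arr:
        num += str(x)  # str(0) is "0", so two zeros are lost here
    return int(num)

parts = split_number_original(1000005, 3)
print(parts)                        # [100, 0, 5]
print(join_number_original(parts))  # 10005
```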
I would ensure that you're first using a string whose length is an exact multiple of the part size so you know exactly how big each part should be:
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    # Left-pad with zeros until the length is a multiple of part_size.
    while string_size % part_size != 0:
        string = "0%s" % (string)
        string_size = string_size + 1
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr
Secondly, make sure that you put the parts back together with the right length for each part (ensuring you don't put leading zeros on the first part of course):
def join_number(arr, part_size):
    # For part_size 3 this builds "%s%03d": append each part zero-padded.
    fmt_str = "%%s%%0%dd" % (part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str % (num, int(x))
    return int(num)
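The doubled percent signs in fmt_str can be hard to read, so here is a small walk-through of what the format expands to for a part size of 3 and how the loop builds up the result. It only illustrates the function above, using the parts [1, 0, 5].

```python
part_size = 3
fmt_str = "%%s%%0%dd" % part_size
print(fmt_str)              # %s%03d

num = 1                     # first part, kept without padding
num = fmt_str % (num, 0)    # append 0 padded to 3 digits
print(num)                  # 1000
num = fmt_str % (num, 5)    # append 5 padded to 3 digits
print(num)                  # 1000005
print(int(num))             # 1000005
```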
Tying it all together, the following complete program:
#!/usr/bin/python

def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    # Left-pad with zeros until the length is a multiple of part_size.
    while string_size % part_size != 0:
        string = "0%s" % (string)
        string_size = string_size + 1
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr

def join_number(arr, part_size):
    # For part_size 3 this builds "%s%03d": append each part zero-padded.
    fmt_str = "%%s%%0%dd" % (part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str % (num, int(x))
    return int(num)

x = 1000005
print(x)
y = split_number(x, 3)
print(y)
z = join_number(y, 3)
print(z)
produces the output:
1000005
[1, 0, 5]
1000005
which shows that it goes back together.
Just keep in mind I haven't done Python for a few years. There's almost certainly a more "Pythonic" way to do it with those new-fangled lambdas and things (or whatever Python calls them) but, since your code was of the basic form, I just answered with the minimal changes required to get it working. Oh yeah, and be wary of negative numbers :-)
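As one possible take on the "more Pythonic" version hinted at here, the sketch below uses str.zfill and a list comprehension. The names split_number_str and join_number_str are made up for this example, and it assumes non-negative integers only; negative inputs are not handled reliably. Also note that recent Python 3 releases limit int/str conversion to 4300 digits by default, which matters for very large inputs.

```python
def split_number_str(num, part_size):
    # Assumes num >= 0. Pad to a multiple of part_size, then slice.
    s = str(num)
    padded_len = -(-len(s) // part_size) * part_size  # ceiling division
    s = s.zfill(padded_len)
    return [int(s[i:i + part_size]) for i in range(0, len(s), part_size)]

def join_number_str(arr, part_size):
    # Restore each chunk's leading zeros before concatenating.
    return int("".join(str(x).zfill(part_size) for x in arr))

parts = split_number_str(1000005, 3)
print(parts)                      # [1, 0, 5]
print(join_number_str(parts, 3))  # 1000005
```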
Upvotes: 1
Reputation: 881735
Clearly, any leading 0s in the "parts" can't be preserved by this operation. Can't join_number also receive the part_size argument, so that it can reconstruct the string formats with all the leading zeros?
Without some information such as part_size that's known to both the sender and receiver, or the equivalent (such as the base number to use for a similar split and join based on arithmetic, roughly equivalent to 10**part_size given the way you're using part_size), the task becomes quite a bit harder. If the receiver is initially clueless about this, why not just place the part_size (or base, etc) as the very first int in the arr list that's being sent and received? That way, the encoding trivially becomes "self-sufficient", i.e., doesn't need any supplementary parameter known to both sender and receiver.
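A minimal sketch of that self-sufficient encoding might look like the following. The names encode_number and decode_number are hypothetical, the split is done arithmetically with base 10**part_size, and non-negative integers are assumed.

```python
def encode_number(num, part_size):
    # Assumes num >= 0. The first element of the result is part_size itself.
    base = 10 ** part_size
    parts = []
    while num:
        num, part = divmod(num, base)
        parts.append(part)
    parts.reverse()  # most significant chunk first
    return [part_size] + parts

def decode_number(arr):
    # Read part_size from the front, then rebuild the number from the rest.
    part_size, parts = arr[0], arr[1:]
    base = 10 ** part_size
    num = 0
    for part in parts:
        num = num * base + part
    return num

encoded = encode_number(1000005, 3)
print(encoded)                 # [3, 1, 0, 5]
print(decode_number(encoded))  # 1000005
```

The receiver needs nothing beyond the list itself, at the cost of one extra element per message.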
Upvotes: 2