L85376
L85376

Reputation: 21

Sorting List of Integers into List of Lists by Digit Sums

I am trying to write a Python function to sort a list of numbers into a list of lists of numbers with each sublist only containing numbers that have the digit sum of the index of the sub list in the larger list.

So, for example, for all of the numbers from 1 to 25, it should yield a list of lists like this:

[[], [1, 10], [2, 11, 20], [3, 12, 21], [4, 13, 22], [5, 14, 23], [6, 15, 24], [7, 16], [8, 17], [9, 18], [19]]

I have the following code so far:

def digit_sum(integer_data_type):
    int_string = str(integer_data_type)
    sum = 0
    for digits in int_string:
        sum += int(digits)
    return sum


def organize_by_digit_sum(integer_list):
    integer_list.sort()
    max_ds = 9*len(str(max(integer_list)))+1
    list_of_lists = []
    current_ds = 0
    while current_ds <= max_ds:
            current_list = []
            for n in integer_list:
                    if digit_sum(n) == current_ds:
                            current_list.append(n)
            list_of_lists.append(current_list)
            current_ds += 1
    return list_of_lists

Obviously, this is inefficient because it has to loop through the entire integer list over and over for each digit sum from 0 through the maximum digit sum.

Also, it initially assumes the maximum digit sum is 9 times the length of the maximum integer. To be clear, I do want to always have a sublist for the possible digit_sum of zero so that I can refer to a particular digit sum's sublist by the index of the list of lists.

I want the function only have to loop through each integer in the list exactly once and append it to the correct sublist.

I would appreciate any help or insights about this.

Upvotes: 0

Views: 991

Answers (4)

akuiper
akuiper

Reputation: 215057

If you don't mind using itertools, here is a way that should be more efficient.

from itertools import groupby
digit_sum = lambda x: sum(int(i) for i in str(x))
[list(g) for _, g in groupby(sorted(range(1,26), key = digit_sum), key = digit_sum)]
                                  # ^^^^^^^^^^ replace this with your actual data
# [[1, 10],
#  [2, 11, 20],
#  [3, 12, 21],
#  [4, 13, 22],
#  [5, 14, 23],
#  [6, 15, 24],
#  [7, 16, 25],
#  [8, 17],
#  [9, 18],
#  [19]]

The way it works here: use sorted() to sort your original list by the digit sum of the integers so that you can use groupby() method to group your list by the digit sum and then loop through the groups and convert the integers in each group to a list.

Update: To get list where the digit sum of the sub list is equal to the index, you can firstly create a dictionary:

dict_ = dict((k,list(g)) for k, g in groupby(sorted(range(1,26), key = digit_sum), key = digit_sum))

dict_
# {1: [1, 10],
#  2: [2, 11, 20],
#  3: [3, 12, 21],
#  4: [4, 13, 22],
#  5: [5, 14, 23],
#  6: [6, 15, 24],
#  7: [7, 16, 25],
#  8: [8, 17],
#  9: [9, 18],
#  10: [19]}

[dict_.get(key, []) for key in range(max(dict_.keys()))]
# [[],
#  [1, 10],
#  [2, 11, 20],
#  [3, 12, 21],
#  [4, 13, 22],
#  [5, 14, 23],
#  [6, 15, 24],
#  [7, 16, 25],
#  [8, 17],
#  [9, 18]]

Upvotes: 2

Moses Koledoye
Moses Koledoye

Reputation: 78554

The following loops exactly once on the data and returns a dictionary whose keys are the sums, and values are the items that correspond to that sum:

from collections import defaultdict
from pprint import pprint

def group_by_sum(lst):
    d = defaultdict(list)
    for i in lst:
        d[sum(int(j) for j in str(i))].append(i)
    return d

pprint(group_by_sum(range(1, 25)))
# {1: [1, 10],
#  2: [2, 11, 20],
#  3: [3, 12, 21],
#  4: [4, 13, 22],
#  5: [5, 14, 23],
#  6: [6, 15, 24],
#  7: [7, 16],
#  8: [8, 17],
#  9: [9, 18],
#  10: [19]}

You can sort the dictionary values based on the sums to have a list, but I think keeping your data as a dictionary might serve you better.

Upvotes: 2

Eric
Eric

Reputation: 75

Very easy:

list_of_lists = [[] for i in range(11)]

for i in range(25):
    digit_sum = sum(int(i) for i in str(i))
    list_of_lists[digit_sum].append(i)

print (list_of_lists)

Upvotes: 0

juanpa.arrivillaga
juanpa.arrivillaga

Reputation: 96171

If you want a solution that leaves empty lists, and space efficiency isn't your main concern, I would use a list of tuples:

>>> def digit_sum(digits):
...   total = 0
...   while digits != 0:
...     total += digits % 10
...     digits = digits // 10
...   return total
... 
>>> numbers = list(range(1,26))
>>> pairs = sorted((digit_sum(n),n) for n in numbers)
>>> pairs
[(1, 1), (1, 10), (2, 2), (2, 11), (2, 20), (3, 3), (3, 12), (3, 21), (4, 4), (4, 13), (4, 22), (5, 5), (5, 14), (5, 23), (6, 6), (6, 15), (6, 24), (7, 7), (7, 16), (7, 25), (8, 8), (8, 17), (9, 9), (9, 18), (10, 19)]
>>> maximum_sum = pairs[-1][0]
>>> list_of_lists = [[] for _ in range(maximum_sum+1)]
>>> for pair in pairs:
...   list_of_lists[pair[0]].append(pair[1])
... 
>>> list_of_lists
[[], [1, 10], [2, 11, 20], [3, 12, 21], [4, 13, 22], [5, 14, 23], [6, 15, 24], [7, 16, 25], [8, 17], [9, 18], [19]]
>>> 

So, suppose your data is much more sparse:

>>> numbers = [4,25,47,89]
>>> pairs = sorted((digit_sum(n),n) for n in numbers)
>>> pairs
[(4, 4), (7, 25), (11, 47), (17, 89)]
>>> maximum_sum = pairs[-1][0]
>>> list_of_lists = [[] for _ in range(maximum_sum+1)]
>>> for pair in pairs:
...   list_of_lists[pair[0]].append(pair[1])
... 
>>> from pprint import pprint
>>> pprint(list_of_lists,width=2)
[[],
 [],
 [],
 [],
 [4],
 [],
 [],
 [25],
 [],
 [],
 [],
 [47],
 [],
 [],
 [],
 [],
 [],
 [89]]
>>> 

And you can access your data as such:

>>> list_of_lists[17]
[89]
>>> list_of_lists[8]
[]
>>> 

Upvotes: 0

Related Questions