Qubix
Qubix

Reputation: 4353

Partition a list into groups by specifying group ID

I have a list, say:

my_list = [1,2,3,4,5,6,7]

and I want to create another list with group IDs, for example, if I split my_list into 2 groups, the group IDs list would be:

group_id = [1,1,1,2,2,2,2]

where the last group contains 1 more element due to the fact that my_list contains a prime number of elements.

Here are all possible splits, from 7 groups to 1 group:

group_id = [1,2,3,4,5,6,7]  # 7 groups, 7 elements
group_id = [1,1,2,3,4,5,6]  # 6 groups, 7 elements
group_id = [1,1,2,2,3,4,5]  # 5 groups, 7 elements
group_id = [1,1,2,2,3,3,4]  # 4 groups, 7 elements
group_id = [1,1,1,2,2,2,3]  # 3 groups, 7 elements
group_id = [1,1,1,2,2,2,2]  # 2 groups, 7 elements
group_id = [1,1,1,1,1,1,1]  # 1 group, 7 elements.

The problem is I don't know how to enforce this condition that the groups need to be more or less equal in size. This has to work for whatever list length (not only 7) and provide me group_id lists of a given number of groups (from 1 to len(my_list)).

EDIT:

This seems to "somewhat" work, though not fully:

import math

my_list = [1,2,3,4,5,6,7]
n_groups = 3


k = math.ceil(len(my_list)/n_groups)
chunks = [my_list[x:x+k] for x in range(0, len(my_list), k)]
group_id_nested = [[chunks.index(i)+1]*len(i) for i in chunks]
group_id = [item for sublist in group_id_nested for item in sublist]

Upvotes: 1

Views: 627

Answers (2)

whackamadoodle3000
whackamadoodle3000

Reputation: 6748

Without numpy:

import math
listy = [1,2,3,4,5,6,7]
groupnum = 6

smallnum = math.floor(len(listy)/groupnum)
bignum = math.ceil(len(listy)/groupnum)

grouped = []
for c,e in enumerate(range(1,groupnum+1)):
    if c<(len(listy)/groupnum - smallnum)*groupnum:
        grouped+=[e]*bignum
    else:
        grouped+=[e]*smallnum
grouped=grouped[0:len(listy)]
print(grouped)

Also, it distributes them evenly. For example, it would do [1, 1, 2, 2, 3, 3, 4] instead of [1, 2, 2, 2, 3, 4, 4] to divide into 4 groups.

Upvotes: 0

Daweo
Daweo

Reputation: 36570

If you are allowed to use numpy you might harness numpy.linspace for that task following way:

import numpy as np
list_size = 7
for n in range(1,list_size+1):
    group_ids = list(map(int,map(round,np.linspace(1,n,num=list_size))))
    print(group_ids)

Output:

[1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 2, 2, 2, 2]
[1, 1, 2, 2, 2, 3, 3]
[1, 2, 2, 2, 3, 4, 4]
[1, 2, 2, 3, 4, 4, 5]
[1, 2, 3, 4, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7]

Idea is simple: I get evenly spaced (float) numbers (list_size of them) from 1 (inclusive) to n (inclusive) and then get nearest integer for each (note usage of round). As you can see in each line of output difference between any two group sizes (number of occurence of given number) is always equal to or less 2.

Upvotes: 1

Related Questions