Reputation: 4353
I have a list, say:
my_list = [1,2,3,4,5,6,7]
and I want to create another list with group IDs, for example, if I split my_list
into 2 groups, the group IDs list would be:
group_id = [1,1,1,2,2,2,2]
where the last group contains 1 more element due to the fact that my_list
contains a prime number of elements.
Here are all possible splits, from 7 groups to 1 group:
group_id = [1,2,3,4,5,6,7] # 7 groups, 7 elements
group_id = [1,1,2,3,4,5,6] # 6 groups, 7 elements
group_id = [1,1,2,2,3,4,5] # 5 groups, 7 elements
group_id = [1,1,2,2,3,3,4] # 4 groups, 7 elements
group_id = [1,1,1,2,2,2,3] # 3 groups, 7 elements
group_id = [1,1,1,2,2,2,2] # 2 groups, 7 elements
group_id = [1,1,1,1,1,1,1] # 1 group, 7 elements.
The problem is I don't know how to enforce this condition that the groups need to be more or less equal in size. This has to work for whatever list length (not only 7) and provide me group_id
lists of a given number of groups (from 1 to len(my_list)
).
EDIT:
This seems to "somewhat" work, though not fully:
import math
my_list = [1,2,3,4,5,6,7]
n_groups = 3
k = math.ceil(len(my_list)/n_groups)
chunks = [my_list[x:x+k] for x in range(0, len(my_list), k)]
group_id_nested = [[chunks.index(i)+1]*len(i) for i in chunks]
group_id = [item for sublist in group_id_nested for item in sublist]
Upvotes: 1
Views: 627
Reputation: 6748
Without numpy:
import math
listy = [1,2,3,4,5,6,7]
groupnum = 6
smallnum = math.floor(len(listy)/groupnum)
bignum = math.ceil(len(listy)/groupnum)
grouped = []
for c,e in enumerate(range(1,groupnum+1)):
if c<(len(listy)/groupnum - smallnum)*groupnum:
grouped+=[e]*bignum
else:
grouped+=[e]*smallnum
grouped=grouped[0:len(listy)]
print(grouped)
Also, it distributes them evenly. For example, it would do [1, 1, 2, 2, 3, 3, 4]
instead of [1, 2, 2, 2, 3, 4, 4]
to divide into 4 groups.
Upvotes: 0
Reputation: 36570
If you are allowed to use numpy
you might harness numpy.linspace for that task following way:
import numpy as np
list_size = 7
for n in range(1,list_size+1):
group_ids = list(map(int,map(round,np.linspace(1,n,num=list_size))))
print(group_ids)
Output:
[1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 2, 2, 2, 2]
[1, 1, 2, 2, 2, 3, 3]
[1, 2, 2, 2, 3, 4, 4]
[1, 2, 2, 3, 4, 4, 5]
[1, 2, 3, 4, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7]
Idea is simple: I get evenly spaced (float) numbers (list_size
of them) from 1
(inclusive) to n
(inclusive) and then get nearest integer for each (note usage of round
). As you can see in each line of output difference between any two group sizes (number of occurence of given number) is always equal to or less 2.
Upvotes: 1