I need to split the data into four groups, then label each group with a group letter and a run of sub-group numbers, as shown below. How can I write this as a single for loop?
What I have tried so far:
import pandas as pd
df = pd.DataFrame({'Data':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]})
n = round(len(df)/4)
groups = [df[i:i+n] for i in range(0,df.shape[0],n)]
first_group = groups[0]
second_group = groups[1]
third_group = groups[2]
fourth_group = groups[3]
split1 = first_group.copy().reset_index(drop=True)
split1['Group'] = 'A'
split1['Sub group'] = pd.Series(range(101, 105))
split2 = second_group.copy().reset_index(drop=True)
split2['Group'] = 'B'
split2['Sub group'] = pd.Series(range(201, 205))
split3 = third_group.copy().reset_index(drop=True)
split3['Group'] = 'C'
split3['Sub group'] = pd.Series(range(301, 305))
split4 = fourth_group.copy().reset_index(drop=True)
split4['Group'] = 'D'
split4['Sub group'] = pd.Series(range(401, 405))
n_split = pd.concat([split1, split2, split3, split4])
Output should look something like the following table:
   Data Group  Sub group
0     1     A        101
1     2     A        102
2     3     A        103
3     4     A        104
0     5     B        201
1     6     B        202
2     7     B        203
3     8     B        204
0     9     C        301
1    10     C        302
2    11     C        303
3    12     C        304
0    13     D        401
1    14     D        402
2    15     D        403
3    16     D        404
Upvotes: 0
Views: 80
Reputation: 23099
No need for loops here. We can use map and cumcount.
Personally, I would set 4 as a constant so you can play around with the divmod of your index and make it totally dynamic.
import pandas as pd
from string import ascii_uppercase
df = pd.DataFrame({'Data':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]})
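# Sort the distinct block numbers so 0 -> 'A', 1 -> 'B', ... (a bare set has no guaranteed order)
idx = sorted(set(df.index // 4))
df['group'] = (df.index // 4).map(dict(zip(idx, ascii_uppercase)))
# The hundreds come from the block number; cumcount numbers the rows within each group
df['subgroups'] = (((df.index // 4) + 1) * 100) + df.groupby('group').cumcount() + 1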
print(df)
    Data group  subgroups
0      1     A        101
1      2     A        102
2      3     A        103
3      4     A        104
4      5     B        201
5      6     B        202
6      7     B        203
7      8     B        204
8      9     C        301
9     10     C        302
10    11     C        303
11    12     C        304
12    13     D        401
13    14     D        402
14    15     D        403
15    16     D        404
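To illustrate the "set 4 as a constant" idea, here is a minimal sketch that uses divmod on the row position. The name GROUP_SIZE is my own placeholder, and the sketch assumes at most 26 groups, since the letters come from ascii_uppercase.

```python
import numpy as np
import pandas as pd
from string import ascii_uppercase

GROUP_SIZE = 4  # hypothetical constant: number of rows per group

df = pd.DataFrame({'Data': list(range(1, 17))})

# divmod on the row position yields (group number, position inside the group)
group_no, pos = divmod(np.arange(len(df)), GROUP_SIZE)

# Group number 0 -> 'A', 1 -> 'B', ... (assumes no more than 26 groups)
df['group'] = [ascii_uppercase[g] for g in group_no]
# Group 0 starts at 101, group 1 at 201, and so on
df['subgroups'] = (group_no + 1) * 100 + pos + 1

print(df)
```

Because this works on row positions rather than index labels, it should not depend on the frame having a default RangeIndex.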
Upvotes: 2
Reputation: 1049
Here you go: the built-in ord function lets you loop through the letters (be aware of the limit, of course!)
import pandas as pd
df = pd.DataFrame({'Data':[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]})
group_ord = ord('A')
sub_num = 101
groups_list = []
sub_group_list = []
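for i, row in df.iterrows():
    groups_list.append(chr(group_ord))
    sub_group_list.append(sub_num)
    # Every fourth value closes a group (this relies on Data being 1, 2, 3, ...)
    if row['Data'] % 4 == 0:
        group_ord += 1       # move on to the next letter
        sub_num += 100 - 3   # e.g. 104 -> 201
    else:
        sub_num += 1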
df['Groups'] = groups_list
df['Sub-groups'] = sub_group_list
df
    Data Groups  Sub-groups
0      1      A         101
1      2      A         102
2      3      A         103
3      4      A         104
4      5      B         201
5      6      B         202
6      7      B         203
7      8      B         204
8      9      C         301
9     10      C         302
10    11      C         303
11    12      C         304
12    13      D         401
13    14      D         402
14    15      D         403
15    16      D         404
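If you want to stay closer to the code in the question (one chunk per group, concatenated at the end), a single loop over the chunks could look roughly like the sketch below. GROUP_SIZE and parts are names I made up for illustration, and the letters again run out after 26 groups.

```python
import pandas as pd
from string import ascii_uppercase

GROUP_SIZE = 4  # hypothetical constant: number of rows per group

df = pd.DataFrame({'Data': list(range(1, 17))})

parts = []
for k, start in enumerate(range(0, len(df), GROUP_SIZE)):
    # Take one chunk by position and restart its index at 0, as in the question
    part = df.iloc[start:start + GROUP_SIZE].reset_index(drop=True)
    part['Group'] = ascii_uppercase[k]  # assumes no more than 26 groups
    part['Sub group'] = [(k + 1) * 100 + j + 1 for j in range(len(part))]
    parts.append(part)

n_split = pd.concat(parts)
print(n_split)
```

This keeps the repeated 0 to 3 index from the expected output and decides group boundaries by row position, not by the values in Data.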
Upvotes: 0