Group entries in Pandas data frame where rows have identical values

Question

I have a Pandas data frame and want to group all rows that have identical values, then collect the index values of each group.

Example:

import pandas as pd

data = {'Number':[5, 10, 15, 20, 25, 28],
        'Letter':['a','a','b','b','c','c'],
        'Type':['X','X','Y','Y','Z','Z']}
df = pd.DataFrame(data)
df = df.set_index('Number')

Output:

           Letter Type
Number            
5           a    X
10          a    X
15          b    Y
20          b    Y
25          c    Z
28          c    Z

My desired output is:

[[5,10],[15,20],[25,28]]

jezrael · Accepted Answer

The first idea is to convert the index to a column and aggregate it into lists:

print(df.reset_index().groupby(['Letter', 'Type'])['Number'].agg(list).tolist())
[[5, 10], [15, 20], [25, 28]]

Or you can use a lambda function that returns the index of each group:

print(df.groupby(['Letter', 'Type']).apply(lambda x: x.index.tolist()).tolist())
[[5, 10], [15, 20], [25, 28]]
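
If the frame has more columns than the two shown here, the first approach can be generalized by grouping on every column instead of naming them. The following is a minimal sketch, not part of the original answer: the names cols and groups are only illustrative, and sort=False is an added assumption that you want groups in order of first appearance rather than in sorted key order (groupby sorts the group keys by default).

```python
import pandas as pd

# Same data as in the question.
data = {'Number': [5, 10, 15, 20, 25, 28],
        'Letter': ['a', 'a', 'b', 'b', 'c', 'c'],
        'Type': ['X', 'X', 'Y', 'Y', 'Z', 'Z']}
df = pd.DataFrame(data).set_index('Number')

# Group on every column, so only rows that are identical
# across all columns end up in the same group.
cols = df.columns.tolist()

# Move the index into a regular column, then collect it per group.
# sort=False keeps the groups in order of first appearance.
groups = (df.reset_index()
            .groupby(cols, sort=False)['Number']
            .agg(list)
            .tolist())
print(groups)
```

For the example data this gives the same nested list as the two one-liners above. Note that rows containing NaN in a grouping column are dropped by default; pass dropna=False to groupby if they should be kept.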
