Reputation: 1190
I have this dataframe:
data = pd.DataFrame({'UserName':['LoveLearn','JakeSanz','LoveLearn'],'Alias':['LL','JS','LL'],'ClassRoom1':['A2','3B','C2'],'ClassRoom2':['B5','E6','D2'],'Points':[1,6,2]})
I want to group by UserName, Alias and sum the points (done) and get a list of all the classrooms a user has attended.
First I filter the classroom columns by name:
classroom_columns = list(data.filter(regex='ClassRoom*').columns)
I group the data:
grouped_data = data.groupby(['UserName','Alias'])
Define this function:
def group_metrics(g_df,class_cols):
    return pd.DataFrame({'TotalPoints':g_df['Points'].sum(),'TotalClassRooms':g_df.apply(lambda x: x[class_cols].values.tolist())})
But after calling the function
group_metrics(grouped_data,classroom_columns)
I get a list of lists in the TotalClassRooms column:
    UserName Alias  TotalPoints       TotalClassRooms
0   JakeSanz    JS            6            [[3B, E6]]
1  LoveLearn    LL            3  [[A2, B5], [C2, D2]]
I want a single flat list per user instead.
Upvotes: 2
Views: 98
Reputation: 35686
You can use np.ravel before tolist to flatten the selected values into a 1D array:
import numpy as np
def group_metrics(g_df, class_cols):
    return pd.DataFrame({
        'TotalPoints': g_df['Points'].sum(),
        'TotalClassRooms': g_df.apply(
            lambda x: np.ravel(x[class_cols]).tolist())
    })
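To see where the nested lists came from: within each group, x[class_cols].values is a 2D array with one row per record, so calling tolist on it directly yields one inner list per row. A minimal sketch on a standalone array (the name rooms is only an illustrative stand-in for one group's values):

```python
import numpy as np

# Hypothetical stand-in for x[class_cols].values of the LoveLearn group:
# two records, two classroom columns.
rooms = np.array([['A2', 'B5'], ['C2', 'D2']])

nested = rooms.tolist()          # one inner list per row
flat = np.ravel(rooms).tolist()  # row-major 1D array, then a plain list

print(nested)
print(flat)
```

np.ravel returns a view where possible, whereas ndarray.flatten always returns a copy; for building a list here, either works.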
Or use flatten:
def group_metrics(g_df, class_cols):
    return pd.DataFrame({
        'TotalPoints': g_df['Points'].sum(),
        'TotalClassRooms': g_df.apply(
            lambda x: x[class_cols].values.flatten().tolist())
    })
group_metrics(grouped_data, classroom_columns)
                 TotalPoints   TotalClassRooms
UserName  Alias
JakeSanz  JS               6          [3B, E6]
LoveLearn LL               3  [A2, B5, C2, D2]
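If you would rather avoid apply altogether, one possible alternative (not part of the approach above, just a sketch under the same sample data) is to reshape the classroom columns to long format with melt and aggregate with list. The names keys and long_rooms are introduced here only for illustration. Note that melt stacks the values column by column, so the order within each list can differ from the row-by-row order shown above.

```python
import pandas as pd

data = pd.DataFrame({
    'UserName': ['LoveLearn', 'JakeSanz', 'LoveLearn'],
    'Alias': ['LL', 'JS', 'LL'],
    'ClassRoom1': ['A2', '3B', 'C2'],
    'ClassRoom2': ['B5', 'E6', 'D2'],
    'Points': [1, 6, 2],
})

keys = ['UserName', 'Alias']  # illustrative name for the grouping columns
classroom_columns = [c for c in data.columns if c.startswith('ClassRoom')]

# Long format: one row per (user, classroom) pair.
long_rooms = data.melt(id_vars=keys, value_vars=classroom_columns,
                       value_name='ClassRoom')

# Both Series share the same (UserName, Alias) index, so they align.
result = pd.DataFrame({
    'TotalPoints': data.groupby(keys)['Points'].sum(),
    'TotalClassRooms': long_rooms.groupby(keys)['ClassRoom'].agg(list),
})
print(result)
```

This keeps the point totals and the classroom lists as two plain groupby aggregations, at the cost of one extra reshaped frame.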
Upvotes: 1