user3447653

Reputation: 4158

Group by and join columns in Pandas Dataframe

For every column in the list cat_column, I need to loop over the list numerical_cols and get the mean and standard deviation. I have the code below that does it, but at the end of the second loop I need a final table with each cat_column and the mean and standard deviation of all the numerical columns, like below.

Code1 Code2 Mean_Code1_CarAge  Std_Code1_CarAge  Mean_Code1_CarPrice   Std_Code1_CarPrice Mean_Code2_CarAge  Std_Code2_CarAge  Mean_Code2_CarPrice   Std_Code2_CarPrice

Code:

import numpy as np

cat_column = ["Code1", "Code2"]
numerical_cols = ['CarAge', 'CarPrice']

for base_col in cat_column:
    for col in numerical_cols:
        df = df.groupby(base_col)[col].agg([np.mean, np.std]).reset_index().rename(
            columns={'mean': 'mean_'+base_col+"_"+col, 'std': 'std_'+base_col+"_"+col})

Input:

     Code1 Code2 CarAge CarPrice
      AAA   AA1      12    5000 
      BBB   BB1      30   10000 
      CCC   CC1      64   22000 
      AAA   AA1      19    4000 
      BBB   BB1      49   10000 

Output:

   Code1 Code2  Mean_Code1_CarAge  Std_Code1_CarAge  Mean_Code1_CarPrice  Std_Code1_CarPrice  Mean_Code2_CarAge  Std_Code2_CarAge  Mean_Code2_CarPrice  Std_Code2_CarPrice
   AAA   AA1                 15.5              4.95                 4500              707.10               15.5              4.95                 4500              707.10
   BBB   BB1                 39.5             13.43                10000                0.00               39.5             13.43                10000                0.00
   CCC   CC1                 64.0               NaN                22000                 NaN               64.0               NaN                22000                 NaN

I'm not sure how to do that dynamically with the above code. Any leads/suggestions would be appreciated.

Upvotes: 2

Views: 58

Answers (1)

Henry Ecker

Reputation: 35646

Try groupby aggregate with a dictionary built from the values in numerical_cols, then flatten the resulting MultiIndex columns with map, and finally concat the per-key results on axis=1:

import pandas as pd

df = pd.DataFrame({
    'Code1': {0: 'AAA', 1: 'BBB', 2: 'CCC', 3: 'AAA', 4: 'BBB'},
    'Code2': {0: 'AA1', 1: 'BB1', 2: 'CC1', 3: 'AA1', 4: 'BB1'},
    'CarAge': {0: 12, 1: 30, 2: 64, 3: 19, 4: 49},
    'CarPrice': {0: 5000, 1: 10000, 2: 22000, 3: 4000, 4: 10000}}
)

cat_columns = ["Code1", "Code2"]
numerical_cols = ['CarAge', 'CarPrice']

# Create a dictionary to map keys to aggregation types
agg_d = {k: ['mean', 'std'] for k in numerical_cols}

dfs = []

for cat_column in cat_columns:
    # Groupby Agg to get aggs for each key in agg_d per group
    g = df.groupby(cat_column).aggregate(agg_d)
    # Reduce Multi Index
    g.columns = g.columns.map(lambda x: f'{x[1]}_{cat_column}_{x[0]}')
    # Reset Index
    g = g.reset_index()
    dfs.append(g)

# Concat on Axis 1
new_df = pd.concat(dfs, axis=1)

# Re Order Columns
new_df = new_df[[*cat_columns, *new_df.columns.difference(cat_columns)]]

print(new_df.to_string())

new_df:

  Code1 Code2  mean_Code1_CarAge  mean_Code1_CarPrice  mean_Code2_CarAge  mean_Code2_CarPrice  std_Code1_CarAge  std_Code1_CarPrice  std_Code2_CarAge  std_Code2_CarPrice
0   AAA   AA1               15.5                 4500               15.5                 4500          4.949747          707.106781          4.949747          707.106781
1   BBB   BB1               39.5                10000               39.5                10000         13.435029            0.000000         13.435029            0.000000
2   CCC   CC1               64.0                22000               64.0                22000               NaN                 NaN               NaN                 NaN
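
Note that pd.concat(dfs, axis=1) lines the two summaries up purely by row position, which works here because each Code1 value pairs with exactly one Code2 value. If the two groupings could get out of step, merging each summary back onto the distinct code pairs is safer. A minimal sketch of that variant (my own, not part of the answer above; it reuses df, cat_columns and numerical_cols from the snippet):

# Sketch: merge each per-key summary onto the distinct (Code1, Code2) pairs,
# so rows stay aligned even if the codes are not in one-to-one correspondence.
pairs = df[cat_columns].drop_duplicates().reset_index(drop=True)

out = pairs
for cat_column in cat_columns:
    g = df.groupby(cat_column)[numerical_cols].agg(['mean', 'std'])
    # Flatten the (column, stat) MultiIndex into stat_key_column names
    g.columns = [f'{stat}_{cat_column}_{col}' for col, stat in g.columns]
    out = out.merge(g.reset_index(), on=cat_column, how='left')

print(out.to_string())

This gives the same frame as new_df for this data, at the cost of one merge per key instead of a single concat.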

Upvotes: 2
