Grouping Multiple Columns into a Few Columns-Pandas Dataframes

Question

I have a dataframe that has dummy variables; however, I want to group multiple columns(100+) into a few amount of columns. Any suggestion(s) would be much appreciated. Thanks! For example:

   A_1 | A_2 | A_3| B_1|  B_2|  B_3|  C_1|  C_2|  C_2|
0 |   0|    1|   0|   0|    0|    0|    0|    0|    0|
1 |   0|    0|   0|   0|    0|    1|    0|    1|    0|
2 |   0|    0|   0|   0|    0|    0|    1|    0|    0|
3 |   0|    0|   0|   0|    1|    0|    0|    0|    0|
4 |   1|    0|   0|   0|    0|    0|    0|    0|    0|
5 |   0|    0|   1|   0|    0|    0|    0|    1|    0|
6 |   0|    0|   0|   1|    0|    0|    0|    0|    0|

Desired Output:

   A|  B|  C|  
0| 1|  0|  0|      
1| 0|  1|  1|       
2| 0|  0|  1|       
3| 0|  1|  0|       
4| 1|  0|  0|       
5| 1|  0|  1|      
6| 0|  1|  0|

I've tried using this code; however, I keep running into error messages saying that the column name is missing.

categories = {'A':'A','B': 'B','C': 'C'}
    def correct_categories(cols1):
        return [categories[cat] for col1 in cols1 for cat in categories.keys() if col1.startswith(cat)]
        
rslt = df3.groupby(correct_categories(df3.columns),axis=1).sum()
print(rslt)

Error Message: KeyError: 'A'

Grouping Multiple Columns into a Few Columns-Pandas Dataframes

Answers (1)

Related Questions