Reputation: 312
I am trying to loop through multiple Excel files in pandas. The structure of the files is very similar: the first 10 columns form a key and the rest of the columns hold the values. I want to group by the first 10 columns and sum the rest.
I have searched and found solutions online for similar cases, but my problem is that:
- I have a large number of value columns (to be aggregated as a sum), and
- the number and names of the value columns differ from file to file (DataFrame to DataFrame).
The key columns are the same across all the files.
I can't share the actual data, but here is a sample of the file structure,
and here is the desired output from the above data.
It is like a groupby operation, but the large and uncertain number of value columns, with headers that change from file to file, makes it difficult to use groupby or pivot directly. Can anyone suggest the best possible way to do this in Python?
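To make the setup concrete, here is a minimal sketch of the loop, assuming the files sit in a data/ folder (the folder and glob pattern are placeholders):

import glob
import pandas as pd

for path in glob.glob("data/*.xlsx"):   # placeholder folder/pattern
    df = pd.read_excel(path)
    key_cols = list(df.columns[:10])    # same 10 key columns in every file
    value_cols = list(df.columns[10:])  # differs from file to file
    # ... aggregate value_cols by key_cols here ...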
Edited:
df.groupby(list(df.columns[:10])).agg(sum)
is working, but for some reason it takes 25-30 minutes; MS Access does the same thing in 1-2 minutes. Am I doing something wrong here, or is there another way to do it in Python itself?
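For anyone timing this, one variant worth trying (only a sketch, not a confirmed fix for the 25-30 minute run) is to use the groupby's own .sum() and pass sort=False so pandas skips sorting the group keys:

# same grouping as above, but without sorting the group keys
out = df.groupby(list(df.columns[:10]), sort=False).sum()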
Upvotes: 3
Views: 1926
Reputation: 8898
Just use df.columns, which holds the column labels; you can slice it to get the 10 leftmost columns.
This should work:
df.groupby(df.columns[:10].to_list()).sum()
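A small usage note: the grouped keys come back as the result's (Multi)Index; if you want them as ordinary columns again, chain reset_index() onto the result, e.g.:

result = df.groupby(df.columns[:10].to_list()).sum().reset_index()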
Upvotes: 1