LOKE2707

Reputation: 312

Groupby for large number columns in pandas

I am trying to loop through multiple Excel files in pandas. The structure of the files is very similar: the first 10 columns form a key and the rest of the columns hold the values. I want to group by the first 10 columns and sum the rest.
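
A minimal sketch of the per-file step I have in mind (the file name is a placeholder, and it assumes the data is read with pd.read_excel):

import pandas as pd

df = pd.read_excel("report_1.xlsx")    # placeholder file name
key_cols = list(df.columns[:10])       # the first 10 columns form the key
summed = df.groupby(key_cols).sum()    # sum every remaining value column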

I have searched and found solutions online for similar cases, but my problem is that the number of value columns is large and the column headers vary from file to file.

I can't share the actual data, but here is a sample of the file structure:

[screenshot: sample input layout]

And here is the desired output from the above data:

[screenshot: desired grouped output]

It is like a groupby operation, but the large and varying number of columns and the changing header names make it difficult to use groupby or pivot directly. Can anyone suggest the best possible solution for this in Python?

Edited:

df.groupby(list(df.columns[:10])).agg('sum')

This works, but for some reason it takes 25-30 minutes, while MS Access does the same thing in 1-2 minutes. Am I doing something wrong here, or is there another way to do it in Python itself?
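
One variation that is sometimes faster (a sketch only, with placeholder paths): read all the files, concatenate them into a single frame, and run one groupby with sort=False so pandas does not spend time sorting the group keys.

import glob
import pandas as pd

# read every workbook once (placeholder folder/pattern)
frames = [pd.read_excel(path) for path in glob.glob("data/*.xlsx")]
combined = pd.concat(frames, ignore_index=True)

key_cols = list(combined.columns[:10])
result = combined.groupby(key_cols, sort=False).sum()   # single pass over all files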

Upvotes: 3

Views: 1926

Answers (1)

filbranden

Reputation: 8898

Just use df.columns, which holds the column labels; you can then take a slice of it to get the 10 leftmost columns.

This should work:

df.groupby(df.columns[:10].to_list()).sum()
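
The same slicing can also name the value columns explicitly, which keeps any stray non-value columns out of the sum (a sketch using the same df):

key_cols = df.columns[:10].to_list()
value_cols = df.columns[10:].to_list()
result = df.groupby(key_cols)[value_cols].sum()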

Upvotes: 1
