Reputation: 673
I applied groupby to a dataframe:
df.groupby('Category').sum()
after which the resulting dataframe looks like this:
              height      weight
General    42.849980  157.500553
GENERAL    49.607315  177.340407
Genera     56.293531  171.524640
CategoryA  48.421077  144.251986
CategoryB  48.421077  144.251986
CategoryC  48.421077  144.251986
I need to combine General, GENERAL and Genera into a single row, so that the result looks like this:
General    123.849980  300.500553
CategoryA   48.421077  144.251986
CategoryB   48.421077  144.251986
CategoryC   48.421077  144.251986
How can I accomplish this?
Edit: The regex solution works. Is there a way to also put General, GENERAL, Genera and CategoryA into a single group?
Upvotes: 2
Views: 3586
Reputation: 2117
Assuming that the category you are grouping by is in the index, you can do:
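import re
result = (
    df
    # regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default;
    # "^" anchors the match so only labels that start with "genera" are renamed
    .groupby(df.index.str.replace("^genera.*", "General", flags=re.IGNORECASE, regex=True))
    .sum()
)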
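For reference, here is a minimal self-contained sketch of the regex approach, run against the numbers from the question. The frame construction and the variable name `keys` are my own, not from the original post, and I am assuming the category labels sit in the index (as they do after the first groupby). Note that the sums this produces are the actual totals of the three rows, which differ from the illustrative figures in the question.

```python
import re

import pandas as pd

# Rebuild the already-grouped frame from the question, with categories in the index.
df = pd.DataFrame(
    {
        "height": [42.849980, 49.607315, 56.293531, 48.421077, 48.421077, 48.421077],
        "weight": [157.500553, 177.340407, 171.524640, 144.251986, 144.251986, 144.251986],
    },
    index=pd.Index(
        ["General", "GENERAL", "Genera", "CategoryA", "CategoryB", "CategoryC"],
        name="Category",
    ),
)

# Normalise every label starting with "genera" (any case) to "General".
# regex=True is passed explicitly because pandas 2.0+ defaults to literal matching.
keys = df.index.str.replace(r"^genera.*$", "General", flags=re.IGNORECASE, regex=True)

# Group by the normalised labels; the three "General" variants collapse into one row.
result = df.groupby(keys).sum()
print(result)
```

Passing the normalised labels straight to groupby leaves the original index untouched, so you can still inspect the unmerged rows afterwards.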
Edit: If you don't want to use regex, you can use a different approach with .map. In the example below I assume that your categories are in a column named Category:
mapping = {
    "General": "CategoryA",
    "GENERAL": "CategoryA",
    "Genera": "CategoryA",
}
result = (
    df
    .groupby(df.Category.map(mapping).fillna(df.Category))
    .sum()
)
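As a rough end-to-end sketch of the mapping approach, again using the numbers from the question: here I assume the labels are in a column named Category, and I select the numeric columns explicitly (my own addition) so that the text column is not summed along with them. The name `keys` is just for illustration.

```python
import pandas as pd

# Same numbers as in the question, but with the labels in a regular column.
df = pd.DataFrame(
    {
        "Category": ["General", "GENERAL", "Genera", "CategoryA", "CategoryB", "CategoryC"],
        "height": [42.849980, 49.607315, 56.293531, 48.421077, 48.421077, 48.421077],
        "weight": [157.500553, 177.340407, 171.524640, 144.251986, 144.251986, 144.251986],
    }
)

# Labels listed here are folded into "CategoryA"; everything else is left alone.
mapping = {
    "General": "CategoryA",
    "GENERAL": "CategoryA",
    "Genera": "CategoryA",
}

# map() returns NaN for labels missing from the dict, so fillna() restores the originals.
keys = df["Category"].map(mapping).fillna(df["Category"])

# Sum only the numeric columns within each merged group.
result = df.groupby(keys)[["height", "weight"]].sum()
print(result)
```

The dict makes the merge explicit, which is handy when the labels to combine do not share a pattern a regex could catch.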
Upvotes: 4