Pandas Sum up even if column values were not unique

I have the following dataframe:

df = pd.DataFrame({'ISIN': ['A1kT23', 'A1kT23', 'B333', '49O33'],
                   'Name': ['Example A', 'Name Xy', 'Example B', 'Test123'],
                   'Sector': ['Energy', 'Energy', 'Utilities', 'Real Estate'],
                   'Value': [100, 50, 7, 23]})
     ISIN       Name       Sector  Value
0  A1kT23  Example A       Energy    100
1  A1kT23    Name Xy       Energy     50
2    B333  Example B    Utilities      7
3   49O33    Test123  Real Estate     23

I would love to sum up by ISIN. The problem is the name is not unique. I would love to use the first name (Example A) instead of the second (Name Xy), when I sum up, so that I get the following result:

df = pd.DataFrame({'ISIN': ['A1kT23', 'B333', '49O33'],
                   'Name': ['Example A', 'Example B', 'Test123'],
                   'Sector': ['Energy', 'Utilities', 'Real Estate'],
                   'Value': [150, 7, 23]})
     ISIN       Name       Sector  Value
0  A1kT23  Example A       Energy    150
1    B333  Example B    Utilities      7
2   49O33    Test123  Real Estate     23

How can I solve this?

Upvotes: 0

Views: 37

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35636

Let's try with groupby aggregate and keep the 'first' value for Name and Sector and 'sum' Value:

df = df.groupby('ISIN', as_index=False, sort=False).agg({
    'Name': 'first', 'Sector': 'first', 'Value': 'sum'
})

df:

     ISIN       Name       Sector  Value
0  A1kT23  Example A       Energy    150
1    B333  Example B    Utilities      7
2   49O33    Test123  Real Estate     23

Upvotes: 2

Related Questions