lucretiuss
lucretiuss

Reputation: 105

Addition with nested columns in python

I have a pandas groupby object that I made from a larger dataframe, in which amounts are grouped under a person ID variable as well as whether it was an ingoing or outgoing transaction. Heres an example:

ID In_Out Amount
1 In 5
1 Out 8
2 In 4
2 Out 2
3 In 3
3 Out 9
4 Out 8

(sorry I don't know how to put actual sample data in). Note that some folks can have one or the other (e.g., maybe they have some going out but nothing coming in).

All I want to go is get the difference in the amounts, collapsed under the person. So the ideal output would be, perhaps a dictionary or other dataframe, containing the difference in amounts under each person, like this:

ID Difference
1 -3
2 2
3 -6
4 -8

I have tried a handful of different ways to do this but am not sure how to work with these nested lists in python.

Thanks!

Upvotes: 1

Views: 59

Answers (1)

Anton vBR
Anton vBR

Reputation: 18916

We couold select the rows that are Out and convert them to negative integers and then use sum().

import pandas as pd

s = '''\
ID In_Out Amount
1  In     5
1  Out    8
2  In     4
2  Out    2
3  In     3
3  Out    9
4  Out    8'''

# Recreate dataframe
df = pd.read_csv(pd.compat.StringIO(s), sep='\s+')

# Select rows where In_Out == 'Out' and multiple by -1
df.loc[df['In_Out'] == 'Out', 'Amount'] *= -1

# Convert to dict
d = df.groupby('ID')['Amount'].sum().to_dict()
print(d)

Returns:

{1: -3, 2: 2, 3: -6, 4: -8}

Upvotes: 1

Related Questions