Reputation: 1647
Sample of data, actual data has many years. The type "Lien" or "Lien Endorsement" can only appear once per year. Other types can repeat within a year.
tax_allyears =
tax_year type amount
2013 Lien Interest 4
2014 Lien Interest 10
2014 Lien 100
2014 Lien Interest 15
2013 Lien Endorsement 200
This line almost works, it sums the "Lien Interest" values by year.
by_year_interest = tax_allyears_1[tax_allyears_1['type'] == 'Lien Interest'].groupby(by=['tax_year'])['amount'].sum()
What I want is to distinguish between years that have "Lien" vs "Lien Interest"
by_year_Lien_interest = some function
tax_year amount
2014 25
by_year_Lien_Endorsement_interest = some function
tax_year amount
2013 4
Upvotes: 3
Views: 4841
Reputation: 6589
If tax_year
, type
and amount
are the names of the columns
in a DataFrame
, then you can do this:
# Create a groupby object
name = df.groupby(['tax_year', 'type'])
# Apply the sum function to the groupby object
df = name.sum()
Upvotes: 0
Reputation: 90869
You can first create two different list of unique years, one where Lien
comes, and the other where Lien Endorsement
comes. And then use those unique lists in your condition for filtering the tax_allyears
DataFrame using Series.isin
. Example -
lienyears = tax_allyears.loc[tax_allyears['type'] == 'Lien','tax_year'].unique().tolist()
lienendorsementyears = tax_allyears.loc[tax_allyears['type'] == 'Lien Endorsement','tax_year'].unique().tolist()
by_year_lien_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienyears)].groupby('tax_year')['amount'].sum()
by_year_lien_endorsement_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienendorsementyears)].groupby('tax_year')['amount'].sum()
Demo -
In [7]: tax_allyears
Out[7]:
tax_year type amount
0 2013 Lien Interest 4
1 2014 Lien Interest 10
2 2014 Lien 100
3 2014 Lien Interest 15
4 2013 Lien Endorsement 200
In [9]: lienyears = tax_allyears.loc[tax_allyears['type'] == 'Lien','tax_year'].unique().tolist()
In [10]: lienendorsementyears = tax_allyears.loc[tax_allyears['type'] == 'Lien Endorsement','tax_year'].unique().tolist()
In [13]: by_year_lien_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienyears)].groupby('tax_year')['amount'].sum()
In [15]: by_year_lien_endorsement_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienendorsementyears)].groupby('tax_year')['amount'].sum()
In [16]: by_year_lien_interest
Out[16]:
tax_year
2014 25
Name: amount, dtype: int64
In [17]: by_year_lien_endorsement_interest
Out[17]:
tax_year
2013 4
Name: amount, dtype: int64
Upvotes: 1