How to sum up values in a dataframe and add them to another one?

Question

I have two dataframes, one for individual transactions and another for the chart of accounts.

I'm trying to sum up all transactions for the last month (in this case, March) for each CompanyKey. I then want to add this result as a new column to the chart of accounts dataframe with the CompanyKey as the column header.

Here is a small sample of the transaction data (In reality there are thousands of transactions):

import pandas as pd

df = pd.DataFrame({
        'CompanyKey': ["1","1","1","1","1","1","1","2","2","2"],
        'DateOccurred': ["31/12/2021","25/02/2022","15/03/2022","31/03/2022","31/12/2021","22/02/2022","16/03/2022","31/12/2021","25/02/2022","31/03/2022"],
        'Account.Name': ["Cash at Bank","Cash at Bank","Cash at Bank","Cash at Bank","GST Paid","GST Paid","GST Paid","Cash at Bank","Cash at Bank","Cash at Bank"],
        'Amount': [150,112200,234065,19167.08,-39080.03,-10200,-27.5,15000,-234567,340697]})

Here are the corresponding chart of accounts:

df1 = pd.DataFrame({
            'ConsolidatedAccountName': ["Cash at Bank","GST Paid", "Cash at Bank", "GST Paid"],
            'Level 1': ["Fund Statement","Fund Statement", "Cash Flow Statement", "Cash Flow Statement"],
            'Level 2': ["Cash at Bank","GST Paid", "Cash at Bank", "GST Paid"]})

This is my desired result. I only want the Sums to be applied to rows which have a df['Level 1'] == "Fund Statement".

+──────────────────────────+──────────────────────+───────────────+────────────────+────────────────+
| ConsolidatedAccountName  | Level 1              | Level 2       | Company 1 Sum  | Company 2 Sum  |
+──────────────────────────+──────────────────────+───────────────+────────────────+────────────────+
| Cash at Bank             | Fund Statement       | Cash at Bank  | 253,232.08     | 340,697        |
| GST Paid                 | Fund Statement       | GST Paid      | -27.50         | 0              |
| Cash at Bank             | Cash Flow Statement  | Cash at Bank  | NaN            | NaN            |
| GST Paid                 | Cash Flow Statement  | GST Paid      | NaN            | NaN            |
+──────────────────────────+──────────────────────+───────────────+────────────────+────────────────+

This is about as far as I got before running into issues.

company_keys = [1, 2]
    
for company in company_keys:
    d1['Company 1 Sum'] = np.where((d3['CompanyKey'] == company) &
                                       (d3['DateOccurred'] >= '01/03/2022') & 
                                       (d3['DateOccurred'] <= '31/03/2022') &
                                       (d1['Level 1'] == 'Fund Statement'),
                                        d3['Amount'].sum(),
                                        0)

This is the error I get.

    ValueError: Length of values (10) does not match length of index (4)

How to sum up values in a dataframe and add them to another one?

Answers (1)

Related Questions