Goncalves
Goncalves

Reputation: 179

Is it possible to fill the empty cells without using a for loop?

I have a sample a of dataframe with banking data. I would like to know if it is possible to fill the empty cells without using a for loop.

In this example, let's say that at the row number 2 (pythonic way), it should take the value of the balance at the previous row 52867,36 and add the amount of the row number 2 : 847.00.

This happens when there are several transactions on the same date.

It is easy with a for loop but I would like to know if there is a way to do it by a vectorisation.

The dataframe
import pandas as pd

l1 = ['26.10.2022', '27.10.2022', '28.10.2022', '28.10.2022', '28.10.2022','28.10.2022', '31.10.2022', '31.10.2022', '01.11.2022', '01.11.2022', '03.11.2022',  '04.11.2022', '07.11.2022', '07.11.2022', '07.11.2022', '08.11.2022', '09.11.2022', '09.11.2022']
l2 = [54267.36,52867.36, '','' , '',52744.21,'' ,52646.91,'',34898.36,34871.46,51026.46,'','',50612.36,61468.52,'',69563.27]
l3 = [-390,-1400,847,-900.15,-45,-25,-57.3,-40,-12528.55,-5220,-26.9,16155,-275,-105,-34.1,10856.16,7663.95,430.8]


df = pd.DataFrame(list(zip(l1,l2,l3)), columns = ['Date','Balance','Amount'])
print(df)


          Date   Balance    Amount
0   26.10.2022  54267.36   -390.00
1   27.10.2022  52867.36  -1400.00
2   28.10.2022              847.00
3   28.10.2022             -900.15
4   28.10.2022              -45.00
5   28.10.2022  52744.21    -25.00
6   31.10.2022              -57.30
7   31.10.2022  52646.91    -40.00
8   01.11.2022           -12528.55
9   01.11.2022  34898.36  -5220.00
10  03.11.2022  34871.46    -26.90
11  04.11.2022  51026.46  16155.00
12  07.11.2022             -275.00
13  07.11.2022             -105.00
14  07.11.2022  50612.36    -34.10
15  08.11.2022  61468.52  10856.16
16  09.11.2022             7663.95
17  09.11.2022  69563.27    430.80

Upvotes: 1

Views: 87

Answers (2)

Eric Mendes
Eric Mendes

Reputation: 55

I think you should go with the pandas solution @Ynjxsjmh posted above, but I went for the stdlib's itertools.

import pandas as pd
from itertools import accumulate

l1 = ['26.10.2022', '27.10.2022', '28.10.2022', '28.10.2022', '28.10.2022','28.10.2022', '31.10.2022', '31.10.2022', '01.11.2022', '01.11.2022', '03.11.2022',  '04.11.2022', '07.11.2022', '07.11.2022', '07.11.2022', '08.11.2022', '09.11.2022', '09.11.2022']
l2 = [54267.36,52867.36, '','' , '',52744.21,'' ,52646.91,'',34898.36,34871.46,51026.46,'','',50612.36,61468.52,'',69563.27]
l3 = [-390,-1400,847,-900.15,-45,-25,-57.3,-40,-12528.55,-5220,-26.9,16155,-275,-105,-34.1,10856.16,7663.95,430.8]


df = pd.DataFrame(list(zip(l1,l2,l3)), columns = ['Date','Balance','Amount'])
df["Balance"] = df["Balance"].apply(lambda x: None if x == '' else x).astype(float)
df["Balance"] = [df.loc[0, "Balance"]] + list(accumulate(df.loc[2:, "Amount"], initial=df.loc[1, 'Balance']))

print(df)

This gives:

          Date   Balance    Amount
0   26.10.2022  54267.36   -390.00
1   27.10.2022  52867.36  -1400.00
2   28.10.2022  53714.36    847.00
3   28.10.2022  52814.21   -900.15
4   28.10.2022  52769.21    -45.00
5   28.10.2022  52744.21    -25.00
6   31.10.2022  52686.91    -57.30
7   31.10.2022  52646.91    -40.00
8   01.11.2022  40118.36 -12528.55
9   01.11.2022  34898.36  -5220.00
10  03.11.2022  34871.46    -26.90
11  04.11.2022  51026.46  16155.00
12  07.11.2022  50751.46   -275.00
13  07.11.2022  50646.46   -105.00
14  07.11.2022  50612.36    -34.10
15  08.11.2022  61468.52  10856.16
16  09.11.2022  69132.47   7663.95
17  09.11.2022  69563.27    430.80

Upvotes: 0

Ynjxsjmh
Ynjxsjmh

Reputation: 30050

You can cumsum on Amount column to get difference to the first value of Balance then fillna value in Balance column

df['Balance'] = (pd.to_numeric(df['Balance'])
                 .fillna(df['Amount'].shift(-1).cumsum().add(df.iloc[0]['Balance']).shift(1)))
print(df)

          Date   Balance    Amount
0   26.10.2022  54267.36   -390.00
1   27.10.2022  52867.36  -1400.00
2   28.10.2022  53714.36    847.00
3   28.10.2022  52814.21   -900.15
4   28.10.2022  52769.21    -45.00
5   28.10.2022  52744.21    -25.00
6   31.10.2022  52686.91    -57.30
7   31.10.2022  52646.91    -40.00
8   01.11.2022  40118.36 -12528.55
9   01.11.2022  34898.36  -5220.00
10  03.11.2022  34871.46    -26.90
11  04.11.2022  51026.46  16155.00
12  07.11.2022  50751.46   -275.00
13  07.11.2022  50646.46   -105.00
14  07.11.2022  50612.36    -34.10
15  08.11.2022  61468.52  10856.16
16  09.11.2022  69132.47   7663.95
17  09.11.2022  69563.27    430.80

Upvotes: 2

Related Questions