Snedecor
Snedecor

Reputation: 739

Excluding all Nulls values before given value (Pandas)

Being more explicit,

I have a pandas DataFrame likes this one:

╔══════╦═════╦════════╦══════╗
║ User ║ Age ║ Salary ║ Rate ║
╠══════╬═════╬════════╬══════╣
║    1 ║  22 ║ NaN    ║ 1.10 ║
║    1 ║  23 ║ 13     ║ 1.25 ║
║    1 ║  24 ║ NaN    ║ 1.23 ║
║    2 ║  21 ║ NaN    ║ 1.18 ║
║    2 ║  22 ║ 27     ║ 1.20 ║
║    2 ║  23 ║ NaN    ║ 1.30 ║
║    2 ║  24 ║ NaN    ║ 1.12 ║
║    3 ║  25 ║ 10     ║ 1.15 ║
║    3 ║  26 ║ NaN    ║ 1.10 ║
╚══════╩═════╩════════╩══════╝

And I need my final output to be like this one:

╔══════╦═════╦════════╦══════╗
║ User ║ Age ║ Salary ║ Rate ║
╠══════╬═════╬════════╬══════╣
║    1 ║  23 ║ 13     ║ 1.25 ║
║    1 ║  24 ║ 16.25  ║ 1.23 ║
║    2 ║  22 ║ 27     ║ 1.20 ║
║    2 ║  23 ║ 32.4   ║ 1.30 ║
║    2 ║  24 ║ 42.12  ║ 1.12 ║
║    3 ║  25 ║ 10     ║ 1.15 ║
║    3 ║  26 ║ 11.5   ║ 1.10 ║
╚══════╩═════╩════════╩══════╝

I have tried many things, but none of them work as I intended to :-(

The closest one was:

(df.groupby('User')['Rate']
.apply(lambda x: x.cumprod().shift())
.mul(df['Salary'].ffill())
.fillna(df['Salary']))

but didn' work.

Is there any way to acchieve this?

What I want is to calculate the cumulative product between salary and Rate for each User!

Upvotes: 0

Views: 37

Answers (1)

BENY
BENY

Reputation: 323316

You need first filter the dataframe , then fillna with Salary

df=df[df.groupby('User')['Salary'].ffill().notna()]
df.Salary.fillna(df.groupby('User')['Rate']
.apply(lambda x: x.cumprod().shift())
.mul(df['Salary'].ffill()))
Out[13]: 
1    13.00
2    16.25
4    27.00
5    32.40
6    42.12
7    10.00
8    11.50
Name: Salary, dtype: float64

Upvotes: 3

Related Questions