Alex DiS

Reputation: 49

Why does fillna have no effect after several chained operations on a DataFrame Series?

I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

Everything works fine when I do:

df['prev'] = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1)
df['prev'].fillna(0, inplace=True)

But if I do it in one line, fillna() does not work:

df['prev'] = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1).fillna(0)

My question is: why is that? And is there a way to do it in one line?

Upvotes: 0

Views: 56

Answers (3)

BeRT2me

Reputation: 13242

Look at the output at this step:

print(df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1))

# Output:
0    1
2    1
3    2
dtype: int64

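Do you see any NaN values to fill? Is adding .fillna(0) here going to do anything? The NaN values only appear afterwards, when this three-row result is assigned to df['prev'] and aligned against the full index of df.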
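To make the point concrete, here is a minimal sketch using the DataFrame from the question. The names mask and counts are only introduced for illustration; they are not in the original code. It checks that the filtered result contains no NaN, and that the NaN values show up only once the result is assigned back as a column.

```python
import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

# Illustrative names: `mask` selects the rows of interest, `counts` is the filtered result.
mask = (df.Event == 'A') & (df.Direction == 'UP')
counts = df[mask].groupby('group').cumcount().add(1)

# The filtered result only holds the matching rows (index 0, 2, 3) and has no NaN,
# so fillna(0) has nothing to replace.
print(counts.isna().sum())                # 0
print(counts.fillna(0).equals(counts))    # True

# Assignment aligns on the index: rows of df that are absent from `counts` become NaN.
df['prev'] = counts.fillna(0)
print(df['prev'].isna().sum())            # 7
```

In other words, the fillna(0) in the one-line version runs before the NaN values exist.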


A one-liner that would work:

df['prev'] = df.assign(prev = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1))['prev'].fillna(0)
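Another option, offered as a sketch rather than the definitive way to do it, is to expand the filtered result to the full index before assigning, using Series.reindex with fill_value. Because no NaN is ever introduced, the column should stay an integer dtype instead of being converted to float.

```python
import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

# Reindex the three-row result to df's full index; rows that were filtered out get 0.
df['prev'] = (
    df[(df.Event == 'A') & (df.Direction == 'UP')]
    .groupby('group').cumcount().add(1)
    .reindex(df.index, fill_value=0)
)

print(df['prev'].tolist())
```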

Upvotes: 2

99_m4n

Reputation: 1265

Because this part, df[(df.Event == 'A') & (df.Direction == 'UP')], keeps only the rows with Event A and Direction UP. When you put fillna(0) at the end, you are only replacing NaN within that filtered subset of rows. The remaining rows are filled with NaN on assignment, because the column prev didn't exist previously.

Also, because the column prev didn't exist previously, I think you cannot do this in a single line with this approach. What you are doing is creating a whole column while computing values for only a subset of its rows, so you would have to break it into 2 steps.
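That said, a single expression may be possible if the count is computed over all rows rather than over a filtered subset. The sketch below is a different technique from the one in the question: it assumes a running sum of a boolean mask per group is an acceptable substitute for cumcount().add(1), and the name mask is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

# Illustrative name: True for the rows that should be counted.
mask = (df.Event == 'A') & (df.Direction == 'UP')

# Running count of matching rows within each group, computed on the full index;
# rows that do not match are then set to 0, so no NaN is ever created.
df['prev'] = mask.astype(int).groupby(df['group']).cumsum().where(mask, 0)

print(df['prev'].tolist())
```

Since the result already covers every row of df, there is nothing left to align or fill on assignment.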

Upvotes: 1

DialFrost

Reputation: 1770

I'm not exactly sure why it's not working, but I have a rough idea. In your first approach, this is what is happening:

df['prev'] = df[...]...
df['prev'] = df['prev'].fillna(0)

In your second approach:

df['prev'] = df[...]....fillna(0)

This probably has to do with fillna(0) being applied to the filtered result rather than to the whole column: when that result is assigned to the new column prev, the rows it does not cover come out as NaN.

Upvotes: 0
