Reputation: 101
I am trying to fill the missing values in the data frame, but all of the values were replaced with None.
Here is the example I have tried:
# Basic libraries
import os
import pandas as pd
import numpy as np
# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
import folium
#import folium.plugins as plugins
from wordcloud import WordCloud
import plotly.express as px
data_dict = {'First':[100, 90, np.nan, 95],
'Second': [30, 45, 56, np.nan],
'Third':[np.nan, 40, 80, 98]}
# Creating a DataFrame from a dict
df1 = pd.DataFrame(data_dict)
#first_try_with_column_name
df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)
#Second_try_Using_List_of_Columns
list_columns = ['First','Second','Third']
df1.loc[:,list_columns] = df1.loc[:,list_columns].fillna(value, inplace=True)
df1
As shown, I used multiple ways to understand the reason behind this issue, so I tried to use the column name, and then I used a list of column names, but unfortunately, the issue is the same.
Is there any recommendation, please?
Upvotes: 2
Views: 108
Reputation: 379
Change
df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)
to
df1.loc[:,'First'].fillna(method='ffill', inplace=True)
This is because inplace=True makes fillna modify the object it is called on instead of returning a new one.
As for the None values: an in-place call has nothing to return, so it returns None. Assigning that return value back to the column is what overwrites every value with None.
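To make the return-value point concrete, here is a minimal sketch. It uses a one-column frame built from the question's data, and it calls the in-place form on an independent copy (the names col and returned are only for illustration) so the return value can be inspected without touching df1. I use .ffill() rather than fillna(method='ffill') because, as far as I know, the method= argument is deprecated in recent pandas.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"First": [100, 90, np.nan, 95]})

# Call the in-place form on an independent copy and keep what it returns.
col = df1["First"].copy()
returned = col.fillna(0, inplace=True)
print(returned)  # prints None on pandas 1.x and 2.x

# The non-inplace form returns a new, filled Series that is safe to assign back.
df1["First"] = df1["First"].ffill()
print(df1)
```

Assigning `returned` to the column is exactly what the original line did, which is why the column ended up full of None.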
For each column:
for col in df1.columns:
    df1[col].fillna(10, inplace=True)
df1
PS: For future readers: avoid inplace where you can. See the question "In pandas, is inplace = True considered harmful, or not?"
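If you follow that advice, a constant fill does not need a loop or inplace at all. The sketch below is one way to do it, assuming the same data as the question; the fill values 10, 0 and -1 and the names filled_const and filled_per_col are made up for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

# Same constant for every column; fillna returns a new DataFrame.
filled_const = df1.fillna(10)

# A different constant per column, passed as a dict keyed by column name.
filled_per_col = df1.fillna({"First": 0, "Second": -1, "Third": 10})

print(filled_const)
print(filled_per_col)
```

Both calls leave df1 untouched, so you decide explicitly whether to rebind it (df1 = df1.fillna(10)) or keep the original around.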
Upvotes: 2
Reputation: 12808
If you want to forward fill you can just do:
df1 = df1.ffill()
This results in:
   First  Second  Third
0  100.0    30.0    NaN
1   90.0    45.0   40.0
2   90.0    56.0   80.0
3   95.0    56.0   98.0
One NaN value remains, because the first row of Third has no earlier value to carry forward, so we can backfill as well:
df1 = df1.bfill()
Final result:
   First  Second  Third
0  100.0    30.0   40.0
1   90.0    45.0   40.0
2   90.0    56.0   80.0
3   95.0    56.0   98.0
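The two steps can also be chained into one expression. This is just a compact sketch of the same idea on the question's data; the name filled is mine.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

# Forward fill first, then backfill whatever is still missing at the top.
filled = df1.ffill().bfill()
print(filled)
```

The order matters: swapping it to bfill().ffill() would fill the gap in First with 95 instead of 90.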
If you only want to forward fill NaNs in specific columns, then use the following. Please note I am NOT using inplace=True. Assigning the None returned by the in-place call was the reason why your code wasn't working before.
columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna] = df1.loc[:, columns_to_fillna].ffill()
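A quick way to convince yourself that only the listed columns are touched is to count the NaNs per column afterwards. A small self-contained sketch on the question's data (the variable remaining is only for illustration):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

columns_to_fillna = ["Second", "Third"]

# Forward fill only the selected columns and assign the result back.
df1.loc[:, columns_to_fillna] = df1.loc[:, columns_to_fillna].ffill()

# NaN count per column after the fill.
remaining = df1.isna().sum()
print(remaining)
```

First keeps its NaN because it was not in the list, and Third keeps the NaN in its first row because a forward fill has nothing earlier to copy from.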
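If you really want to use inplace=True, which is not advised, the call looks like this. Be aware that .loc with a list of columns returns a copy, so on the pandas versions I know of this fills the copy and may leave df1 itself unchanged; the assignment form above is the dependable one: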
columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna].ffill(inplace=True)
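One caveat worth checking on your own install: selecting a list of columns with .loc hands back a new object, so an in-place fill on that selection may never reach df1. The sketch below is a way to test this, not a statement about every pandas version; probe and remaining are hypothetical names.

```python
import numpy as np
import pandas as pd

probe = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

columns_to_fillna = ["Second", "Third"]

# The selection is a temporary copy; the in-place fill is applied to that copy.
probe.loc[:, columns_to_fillna].ffill(inplace=True)

# Count NaNs left in the original frame's 'Second' column.
remaining = int(probe["Second"].isna().sum())
print(remaining)
```

If this prints 1, the original frame was not modified, which is one more practical reason to prefer assigning the result of .ffill() back.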
The reasons why inplace is not advised are discussed here:
https://stackoverflow.com/a/60020384/6366770
Upvotes: 2