Reputation: 123
In the notebook below, after imputing the missing values with SimpleImputer, the DataFrame was converted to a NumPy array. How do I make sure that its type remains a DataFrame?
import pandas as pd
from sklearn.impute import SimpleImputer
df1 = pd.read_excel("dummy.xlsx")
imp = SimpleImputer(strategy='median')
# fit_transform returns a NumPy array, not a DataFrame
df2 = imp.fit_transform(df1)
df2
Upvotes: 1
Views: 2226
Reputation: 3711
The documentation of sklearn.impute.SimpleImputer.fit_transform states clearly that it returns a NumPy array (numpy.ndarray):
Returns:
X_new : ndarray array of shape (n_samples, n_features_new)
Transformed array.
So you cannot "make sure that its type remains a DataFrame" through fit_transform alone. However, you can of course pass the resulting NumPy array to the pandas.DataFrame() constructor:
from sklearn.impute import SimpleImputer
import pandas as pd
import numpy as np
# Mocking your data
df = pd.DataFrame(np.random.rand(10,3))
df[df > 0.9] = np.nan
imp = SimpleImputer(strategy='median')
# Feeding resulting numpy array from fit_transform directly to new df2
df2 = pd.DataFrame(imp.fit_transform(df))
That's it:
>>> type(df2)
pandas.core.frame.DataFrame
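Note that the bare constructor call above gives df2 default integer column labels and a fresh index, because the NumPy array carries neither. A minimal sketch of one way to keep them, assuming no column is entirely NaN (by default SimpleImputer drops such columns, so the column count would no longer match); the sample data below is made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical sample data with named columns and a non-default index
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, 4.0], "b": [10.0, 20.0, np.nan, 40.0]},
    index=["w", "x", "y", "z"],
)

imp = SimpleImputer(strategy="median")

# Reattach the original column labels and index to the imputed array
df2 = pd.DataFrame(imp.fit_transform(df), columns=df.columns, index=df.index)
```

If you are on scikit-learn 1.2 or later, calling imp.set_output(transform="pandas") before fit_transform should give you a DataFrame directly; check this against the version you have installed.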
Upvotes: 2