Reputation: 81
import numpy as np
import pandas as pd


class DataProcessing:
    def __init__(self, df=None, file=None, duplicates=None, uninformative=None, mhealth_dataset=None):
        self.df = df
        self.file = file
        self.duplicates = duplicates
        self.uninformative = uninformative
        self.mhealth_dataset = mhealth_dataset

    def data(self):
        arrays = [np.loadtxt(self.file, dtype=str, delimiter="/t")]
        matrices = np.concatenate(arrays)
        self.df = np.array(list(matrices)).reshape(len(arrays), 2)
        return self.df

    def data_cleaning(self):
        # Drop and impute missing values
        df = pd.fillna(statistics.mean(self.df), inplace=True)
        return df


dp = DataProcessing()
dc = dp.data_cleaning()
Traceback error:
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\algorithms\project_kmeans.py", line 46, in <module>
    dc = dp.data_cleaning()
  File "C:\Users\User\PycharmProjects\algorithms\project_kmeans.py", line 26, in data_cleaning
    df = pd.fillna(statistics.mean(self.df), inplace=True)
  File "C:\Users\User\PycharmProjects\algorithms\venv\lib\site-packages\pandas\__init__.py", line 244, in __getattr__
    raise AttributeError(f"module 'pandas' has no attribute '{name}'")
AttributeError: module 'pandas' has no attribute 'fillna'
Upvotes: 1
Views: 2080
Reputation: 806
fillna() is a method of a pandas DataFrame or Series, not a function in the pandas module itself, so you probably want to change the data_cleaning() implementation as follows:
    def data_cleaning(self):
        # Drop and impute missing values
        df = statistics.mean(self.df.fillna(...))
        return df
and specify the value or method to use for filling the NaNs in the DataFrame.
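If the goal is to replace missing values with each column's mean, one possible way to do it is sketched below. This is only an illustration, not the asker's actual pipeline: the helper name impute_with_column_means and the sample data are made up, and it assumes self.df already holds a pandas DataFrame with numeric columns (in the question, data() builds a NumPy array and is never called, so self.df is still None when data_cleaning() runs).

```python
import numpy as np
import pandas as pd


def impute_with_column_means(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with NaNs in numeric columns replaced by the column mean.

    Hypothetical helper for illustration; not part of the original code.
    """
    # DataFrame.mean skips NaN values by default; numeric_only=True
    # leaves non-numeric (e.g. string) columns out of the calculation.
    column_means = df.mean(numeric_only=True)
    # fillna accepts a Series indexed by column name and fills each
    # column with its matching value. It returns a new DataFrame.
    return df.fillna(column_means)


# Made-up sample data with one missing value per column.
sample = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 4.0, 6.0]})
cleaned = impute_with_column_means(sample)
print(cleaned)
```

Inside the class, the same idea would read self.df = self.df.fillna(self.df.mean(numeric_only=True)). Assigning the result is usually clearer than inplace=True, which makes fillna return None.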
Upvotes: 3