Reputation: 21
I was learning to use the pandas dropna() function in Python to drop rows/columns that contain NaN/'?' values. However, even after trying various solutions I found online, I couldn't drop the data, although I got no syntax errors.
I've tried the following solutions:
First Attempt
df1 = df.dropna()
df1
Continued
df1.dropna(inplace=1)
df1
The first part of the code gave me the original data frame, unchanged.
The second part gave me the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 df1.dropna(inplace=1)
      2
      3 df1
~\Anaconda3\lib\site-packages\pandas\core\frame.py in dropna(self, axis, how, thresh, subset, inplace)
   4259 1 Batman Batmobile 1940-04-25
   4260 """
-> 4261 inplace = validate_bool_kwarg(inplace, 'inplace')
   4262 if isinstance(axis, (tuple, list)):
   4263 # GH20987
~\Anaconda3\lib\site-packages\pandas\util\_validators.py in validate_bool_kwarg(value, arg_name)
    224 raise ValueError('For argument "{arg}" expected type bool, received '
    225 'type {typ}.'.format(arg=arg_name,
--> 226 typ=type(value).__name__))
    227 return value
    228
ValueError: For argument "inplace" expected type bool, received type
Further, are there any better alternatives to the dropna() function?
EDIT 1
NameError: name 'df1' is not defined
PS: All the errors and issues are visible in the code.
LINK TO THE CSV FILE USED = CSV
Upvotes: 0
Views: 3125
Reputation: 1
You should also add inplace=True to the replace function, because replace returns a new DataFrame by default and leaves df unchanged:
df.replace("?", np.nan, inplace=True)
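To illustrate the difference, here is a minimal sketch on a small made-up DataFrame. The column names and values are hypothetical and not taken from the CSV in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; '?' stands in for a missing value.
df = pd.DataFrame({
    "make": ["audi", "?", "bmw"],
    "price": ["13950", "16500", "?"],
})

# Without assignment or inplace=True, the result is discarded
# and df still contains the '?' placeholders.
df.replace("?", np.nan)
unchanged_count = int((df == "?").sum().sum())

# With inplace=True, df itself is modified.
df.replace("?", np.nan, inplace=True)
remaining_count = int((df == "?").sum().sum())
nan_count = int(df.isna().sum().sum())

print(unchanged_count, remaining_count, nan_count)
```

Writing df = df.replace("?", np.nan) should have the same effect as the inplace=True call.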
Upvotes: 0
Reputation: 26
First, replace '?' with NaN and assign the result back (replace returns a new DataFrame), like this:
df = df.replace('?', np.nan)
Then drop all the missing values (the NaNs you just created above) using dropna, like this:
df1 = df.dropna()
df1
Then, if you want to keep the DataFrame with valid entries in the same variable, use inplace=True (it must be the boolean True, not 1), like this:
df1.dropna(inplace=True)
df1
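Putting the steps together, a rough end-to-end sketch could look like the following. The data is made up for illustration (hypothetical column names, not the CSV from the question), and the subset argument is shown only as one possible way to narrow what dropna checks.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; '?' marks a missing value.
df = pd.DataFrame({
    "make": ["audi", "bmw", "?", "volvo"],
    "price": ["13950", "?", "16500", "22625"],
})

# Step 1: turn the '?' placeholders into real NaN values and keep the result.
df = df.replace("?", np.nan)

# Step 2: drop every row that contains at least one NaN.
df1 = df.dropna()

# Optional: only look at certain columns when deciding what to drop.
df_price = df.dropna(subset=["price"])

# In-place variant: inplace must be the bool True, not the int 1
# (the int is what triggered the ValueError in the question).
df2 = df.copy()
df2.dropna(inplace=True)

print(df1)
```

dropna only recognises real missing values such as NaN, so the '?' strings have to be converted first; otherwise the call runs without error but removes nothing.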
Upvotes: 1