Dilyana Flower

Reputation: 125

Why doesn't dropna seem to work on this column?

I am trying to drop all na values in one column, Filmname, but the rows are not dropped. Why? (screenshot of my result)


Here is my code:

import pandas as pd
df = pd.read_csv(...)

df.dropna(subset=['Filmname'], inplace=True)
df.head()

Upvotes: 1

Views: 1357

Answers (2)

jpp

Reputation: 164803

By default, "na" is not considered NaN by pandas.read_csv.

You can add this as a NaN string manually via the na_values argument:

df = pd.read_csv('file.csv', na_values=['na'])

As per the docs:

na_values : scalar, str, list-like, or dict, default None

Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: ‘’, ‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’, ‘-NaN’, ‘-nan’, ‘1.#IND’, ‘1.#QNAN’, ‘N/A’, ‘NA’, ‘NULL’, ‘NaN’, ‘n/a’, ‘nan’, ‘null’.
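To make the difference concrete, here is a minimal sketch using a small in-memory CSV. The data and the Year column are made up for illustration, since the asker's actual file is not shown.

```python
import io

import pandas as pd

# Hypothetical stand-in for the asker's file; the real CSV is not shown.
csv_text = "Filmname,Year\nAlien,1979\nna,1982\nHeat,1995\n"

# Default parsing: the lowercase string "na" is kept as ordinary text,
# so dropna has nothing to drop in this column.
df_default = pd.read_csv(io.StringIO(csv_text))
default_missing = int(df_default["Filmname"].isna().sum())

# With na_values=["na"], the same cell is parsed as NaN,
# and dropna on the Filmname column removes that row.
df = pd.read_csv(io.StringIO(csv_text), na_values=["na"])
df = df.dropna(subset=["Filmname"])
remaining = df["Filmname"].tolist()

print(default_missing)
print(remaining)
```

Passing a dict instead, such as na_values={'Filmname': ['na']}, should limit the extra NaN string to that one column, which may be safer if "na" is a legitimate value elsewhere in the file.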

Upvotes: 3

Shlomita

Reputation: 63

It looks like the values in your screenshot are not real NaN values or some other error, but the literal string "na", which pandas parsed as ordinary text.

To filter out the rows with this value in this column, you can index the df with a boolean condition instead of using dropna:

df = pd.read_csv(...)
filtered_df = df[df['Filmname'] != 'na']

The condition inside can be any boolean expression; see this guide for a start.
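As a rough, self-contained illustration of the filtering approach, the sketch below builds a small DataFrame by hand. The film titles and the Year column are invented for the example. It also shows one possible alternative, which is to turn the "na" strings into real NaN with Series.mask so that dropna works as originally intended.

```python
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame(
    {"Filmname": ["Alien", "na", "Heat"], "Year": [1979, 1982, 1995]}
)

# Boolean-mask filtering: keep only rows whose Filmname is not the string "na".
filtered_df = df[df["Filmname"] != "na"]

# Alternative: replace the "na" strings with real NaN, then use dropna.
# Series.mask puts NaN wherever the condition is True.
cleaned_df = df.assign(
    Filmname=df["Filmname"].mask(df["Filmname"] == "na")
).dropna(subset=["Filmname"])

print(filtered_df["Filmname"].tolist())
print(cleaned_df["Filmname"].tolist())
```

Both approaches leave the original df untouched and return a new DataFrame, so the result has to be assigned to a name to be kept.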

Upvotes: 1
