Sam324

Reputation: 309

Removing empty rows from dataframe

I have a dataframe with empty values in some rows.

[screenshot: dataframe with empty rows]

How can I remove these empty values? I have already tried data.replace('', np.nan, inplace=True) and data.dropna() but that didn't change anything. What other ways are there to drop empty rows from a dataframe?

Upvotes: 1

Views: 4257

Answers (4)

Sathish Pandurangan

Reputation: 38

Try this. It's hacky, but it works. Note that chaining `inplace=True` after `fillna` would act on the temporary frame `fillna` returns, so assign the result back instead:

data = data.fillna("").replace('', pd.NA)
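A self-contained sketch of this approach, using hypothetical sample data with empty strings standing in for missing values:

```python
import pandas as pd

# hypothetical sample frame; '' marks missing values
data = pd.DataFrame({'lat': ['', '38.9', ''], 'lon': ['', '-77.0', '']})

# normalize any NaN to '' first, then mark all empty strings as missing
data = data.fillna("").replace('', pd.NA)
# dropna recognizes pd.NA and removes those rows
data = data.dropna()
```

After this, only the row with values in both columns remains.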

Upvotes: 0

Sam

Reputation: 467

As you have empty strings in a numeric variable, I'm assuming it was read in as strings. A robust way to solve this is as follows:

import pandas as pd

data = {'lattitude': ['', '38.895118', '', '', '', '45.5234515', '', '40.764462'],
        'longitude': ['', '-77.0363658', '', '', '', '-122.6762071', '', '-11.904565']}
df = pd.DataFrame(data)

[screenshot: df with empty strings]

Convert the fields to a numeric dtype. `errors='coerce'` will change any value it cannot convert into NaN.

df = df.apply(lambda x: pd.to_numeric(x, errors='coerce'))

[screenshot: df after numeric conversion, empty strings now NaN]

The only thing left to do now is drop the NaNs:

df.dropna(inplace=True)

[screenshot: df with NaN rows dropped]

Another possible solution is to use regular expressions, applied to the original string frame: keep only rows whose fields contain at least one non-whitespace character. (Dropping the capture group also avoids pandas' "match groups" UserWarning.) Of course, multiple regexes would work here.

mask = (df['lattitude'].str.contains(r'\S') & df['longitude'].str.contains(r'\S'))
df = df[mask]
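A runnable sketch of the regex filter on a small sample frame of strings:

```python
import pandas as pd

df = pd.DataFrame({'lattitude': ['', '38.895118', '', '45.5234515'],
                   'longitude': ['', '-77.0363658', '', '-122.6762071']})

# keep rows where both columns contain at least one non-whitespace character
mask = df['lattitude'].str.contains(r'\S') & df['longitude'].str.contains(r'\S')
df = df[mask]
```

The empty rows fail the `\S` test in both columns and are filtered out.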

Upvotes: 1

Simon

Reputation: 421

Suppose latitude is between -90 and 90:

data = data[data['latitude'] <= 90]

This works once the column is numeric: comparisons against NaN evaluate to False, so those rows are filtered out. If the column still contains empty strings, convert it with `pd.to_numeric` first, otherwise the comparison raises a TypeError.
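A minimal sketch, assuming the column has already been converted to a numeric dtype:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'latitude': [np.nan, 38.9, np.nan, 45.5]})
# NaN <= 90 evaluates to False, so rows with missing latitude are dropped
data = data[data['latitude'] <= 90]
```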

Upvotes: 0

BENY

Reputation: 323226

Try with

data = data.replace('', np.nan).dropna()

Update

data = data.apply(pd.to_numeric,errors='coerce').dropna()
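A runnable sketch of the updated one-liner, with hypothetical sample data:

```python
import pandas as pd

data = pd.DataFrame({'lat': ['', '38.9', ''], 'lon': ['', '-77.0', '-122.6']})
# errors='coerce' turns '' (and anything non-numeric) into NaN; dropna removes those rows
data = data.apply(pd.to_numeric, errors='coerce').dropna()
```

Only the row with valid numbers in both columns survives.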

Upvotes: 1
