neo-technoker
neo-technoker

Reputation: 389

How replace all NULL values in Pandas

Context

A CSV export from the MS SQL Server has "NULL" as value across various columns randomly

Expected Outcome

Replace the "NULL"s with None as the data is multi data-typed This is an intermediate step before I selectively replace None to 0, 'Uknown', etc depending the data type of the column

Tried

  1. df[['Col1', 'CCol1']].replace(to_replace = ['NULL', ' '], value=None, inplace = True) seems to remove the rows entirely
  2. df[['Col1', 'CCol1']].replace(to_replace = ['NULL', ' '], value='------', inplace = True) seems at least do the replace function
  3. df.where(df['Col1'].map(lambda x: str(x) == 'NULL'), inplace=True) seems to Nan all the values in rows that meet the criteria

Limitation

df[['Col1', 'CCol1']] is already limiting as I would ideally like to replace NULL in all columns with None

Other material referred to

Replacing few values in a pandas dataframe column with another value

Is there an effective way to replace NULLs to None across all columns and rows?

Upvotes: 0

Views: 6181

Answers (1)

Teng Sin Yong
Teng Sin Yong

Reputation: 61

I think this is a nice way to deal with the problem:

import pandas as pd
#An example dataframe
df = pd.DataFrame([[np.nan, 2, "NULL", 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, "NULL", 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))
#To remove all non-numeric such as None,NAN, etc. and change them to np.nan
for c in df.columns:
    df[c] = pd.to_numeric(df[c], errors='coerce')
#Change np.nan to whatever you want. For example, NONE:
df=df.fillna(value='NONE')
print(df)

>>>      A     B     C  D
0  NONE     2  NONE  0
1     3     4  NONE  1
2  NONE  NONE  NONE  5
3  NONE     3  NONE  4

Upvotes: 1

Related Questions