PV8

Reputation: 6260

Fill NaN in all string columns with 'NaN' and in all numeric columns with 0

I have a mixed dataframe, where the columns have different types:

df.dtypes
A   float64
B   object
C   int64

How can I run fillna() without getting the error: TypeError: argument must be a string or number?

That is, how can I split it so that all numeric columns are filled with 0 (as a number) and all object columns with NaN (as a string)?

The similar question "Replace missing values at once in both categorical and numerical columns" only answers this for two columns. I am looking for a solution that works for several columns.

Upvotes: 1

Views: 2310

Answers (1)

jezrael

Reputation: 862581

You can build a dictionary that maps each column name to its replacement value and pass it to DataFrame.fillna:

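import numpy as np
import pandas as pd

# Sample frame with two string (object) columns and two numeric columns
df = pd.DataFrame(data={'col1': [np.nan,'b','c','d'],
                        'col2': [1,2,np.nan,4],
                        'col3': [np.nan,'b','c','d'],
                        'col4': [1,2,np.nan,4]})

print(df)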
  col1  col2 col3  col4
0  NaN   1.0  NaN   1.0
1    b   2.0    b   2.0
2    c   NaN    c   NaN
3    d   4.0    d   4.0

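# Map every numeric column to 0 and every other column to 'tmp'
d = {**dict.fromkeys(df.select_dtypes(np.number).columns, 0),
     **dict.fromkeys(df.select_dtypes(exclude=np.number).columns, 'tmp')}

# fillna accepts a dict of {column name: fill value}
df = df.fillna(d)
print(df)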
  col1  col2 col3  col4
0  tmp   1.0  tmp   1.0
1    b   2.0    b   2.0
2    c   0.0    c   0.0
3    d   4.0    d   4.0
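
The dictionary approach can also be wrapped in a small helper. This is only a sketch: the function name fill_by_dtype and its parameters numeric_fill and other_fill are made up for illustration, and the default 'NaN' string follows the question rather than the 'tmp' placeholder used above. It assumes every non-numeric column is a string/object column; datetime or categorical columns would probably need their own fill values.

```python
import numpy as np
import pandas as pd


def fill_by_dtype(df, numeric_fill=0, other_fill="NaN"):
    """Hypothetical helper: fill numeric and non-numeric columns differently."""
    numeric_cols = df.select_dtypes(include=np.number).columns
    other_cols = df.select_dtypes(exclude=np.number).columns
    # One fill value per column, chosen by the column's dtype group
    fill_values = {**dict.fromkeys(numeric_cols, numeric_fill),
                   **dict.fromkeys(other_cols, other_fill)}
    # Returns a new frame; the input frame is left unchanged
    return df.fillna(fill_values)


# Same dtype mix as in the question: float64, object, int64
df = pd.DataFrame({"A": [1.5, np.nan], "B": [np.nan, "x"], "C": [1, 2]})
result = fill_by_dtype(df)
print(result)
```

Because the helper returns a new frame, it can be used as df = fill_by_dtype(df) or inside a method chain with DataFrame.pipe.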

Another idea is to fill the numeric columns first and then all remaining columns:

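# Names of all numeric columns
c = df.select_dtypes(np.number).columns

# Fill the numeric columns with 0 first ...
df[c] = df[c].fillna(0)
# ... so that only non-numeric columns still contain NaN here
df = df.fillna('tmp')
print(df)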
  col1  col2 col3  col4
0  tmp   1.0  tmp   1.0
1    b   2.0    b   2.0
2    c   0.0    c   0.0
3    d   4.0    d   4.0
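
Note that in the snippets above df had already been filled by the dictionary approach before the second idea was applied. To try the two-step approach on its own, a self-contained sketch on a fresh frame might look like this (the variable name num_cols and the two-column frame are just for illustration):

```python
import numpy as np
import pandas as pd

# Fresh frame with one string column and one numeric column
df = pd.DataFrame({"col1": [np.nan, "b", "c", "d"],
                   "col2": [1, 2, np.nan, 4]})

# Step 1: fill only the numeric columns with 0
num_cols = df.select_dtypes(include=np.number).columns
df[num_cols] = df[num_cols].fillna(0)

# Step 2: the only NaN values left are in non-numeric columns
df = df.fillna("tmp")
print(df)
```

The order matters: filling with 'tmp' first would also put the string into the numeric columns and turn them into object columns.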

Upvotes: 3
