kms

Reputation: 2024

Replace numpy arrays in pandas series

I have a pandas DataFrame with a column that mixes numpy.ndarray values and categorical (string) values.

import pandas as pd
import numpy as np

df = pd.DataFrame({
                   'i': [0, 1, 2],
                   'a': [np.array([]), np.array([]), 'A']
                 })

df['a'][0]

array([], dtype=float64)

I'd like to replace the arrays with np.nan.

I tried df['a'].replace(np.array([]), np.nan), but it didn't work.

Upvotes: 0

Views: 154

Answers (2)

Corralien

Reputation: 120429

If you want to replace empty arrays with NaN, you can convert column a to a boolean mask:

df['a'] = df['a'].where(df['a'].astype(bool), np.nan)
print(df)

# Output
   i    a
0  0  NaN
1  1  NaN
2  2    A
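
Note that astype(bool) relies on the truth value of each element, so it only suits empty arrays: a multi-element array has an ambiguous truth value, and the truth value of an empty array has been deprecated in NumPy, so newer releases may warn or raise. A minimal sketch with an explicit size check instead, where is_empty_array is a hypothetical helper that is not part of the answer above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'i': [0, 1, 2],
    'a': [np.array([]), np.array([]), 'A'],
})

def is_empty_array(x):
    # Hypothetical helper: True only for ndarrays with zero elements.
    return isinstance(x, np.ndarray) and x.size == 0

# Build a boolean mask element by element, then replace the matches with NaN.
mask = df['a'].apply(is_empty_array)
df['a'] = df['a'].mask(mask, np.nan)
print(df)
```

This leaves non-empty arrays and strings untouched, which may or may not be what you want.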

Upvotes: 2

Rabinzel

Reputation: 7923

One way is to check each value with isinstance:

# on one column
df['a'] = df['a'].apply(lambda x: np.nan if isinstance(x, np.ndarray) else x)

# on the whole DataFrame
df = df.applymap(lambda x: np.nan if isinstance(x, np.ndarray) else x)
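
On recent pandas versions applymap is deprecated in favor of DataFrame.map (since pandas 2.1, as far as I know). A sketch that picks whichever method is available, assuming the same df as in the question; array_to_nan is a hypothetical helper name:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'i': [0, 1, 2],
    'a': [np.array([]), np.array([]), 'A'],
})

def array_to_nan(x):
    # Hypothetical helper: turn any ndarray cell into NaN, keep everything else.
    return np.nan if isinstance(x, np.ndarray) else x

# Use DataFrame.map where it exists, otherwise fall back to applymap.
if hasattr(pd.DataFrame, 'map'):
    cleaned = df.map(array_to_nan)
else:
    cleaned = df.applymap(array_to_nan)

print(cleaned)
```

Unlike the size check, this replaces every ndarray, empty or not.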

Upvotes: 1
