Austin Wolff

Reputation: 635

Converting Pandas column data with np.where() not working as it should

I am using np.where() in combination with a Pandas DataFrame to keep the column as-is if it contains the phrase "no restriction", and make it a float if not. Here is my code:

main_df[col] = np.where(main_df[col].str.contains('no restriction', case=False, na=False, regex=False),
                        main_df[col],
                        main_df[col].apply(lambda x: float(x)))

Here is the error I am getting on a cell that contains the string "No restriction":

    140     main_df[col] = np.where(main_df[col].str.contains('no restriction', case=False, na=False, regex=False),    
    141                             main_df[col],
--> 142                             main_df[col].apply(lambda x: float(x)))


ValueError: could not convert string to float: 'No restriction'

It looks like the Series.str.contains() isn't detecting a cell that contains the string 'No restriction'. What am I doing wrong?

Upvotes: 0

Views: 345

Answers (1)

Quang Hoang

Reputation: 150825

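The problem is that np.where() does not evaluate its arguments lazily: main_df[col].apply(lambda x: float(x)) is computed for the whole series, including the 'No restriction' cells, before np.where() ever looks at the mask. That conversion fails and raises the error, so str.contains() is not at fault. You can use pd.to_numeric with the errors='coerce' option instead: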

main_df[col] = pd.to_numeric(main_df[col], errors='coerce').fillna(main_df[col])
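To make the difference concrete, here is a small sketch on made-up data (the column name "limit" and its values are assumptions for illustration, not taken from the question). It shows the eager apply() failing and the to_numeric approach leaving the text cell in place:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real main_df is not shown in the question.
main_df = pd.DataFrame({"limit": ["1.5", "No restriction", "3"]})
col = "limit"

mask = main_df[col].str.contains("no restriction", case=False, na=False, regex=False)

# All three arguments are built before np.where runs, so apply(float)
# is called on every cell, including "No restriction".
try:
    np.where(mask, main_df[col], main_df[col].apply(float))
    eager_failed = False
except ValueError:
    eager_failed = True

# Unparseable cells become NaN, then fillna restores the original text.
converted = pd.to_numeric(main_df[col], errors="coerce").fillna(main_df[col])
print(eager_failed)
print(converted.tolist())
```

Note that fillna(main_df[col]) puts back any value that failed to parse, not only "No restriction", so typos in numeric cells would survive as strings too.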

The question is why, though? You should not be mixing float with str in a column: the result has object dtype, and numeric operations on it will not work reliably.
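If the goal is to do arithmetic later, one possible alternative is to keep the numbers and the "no restriction" marker in separate columns. This is only a sketch under assumptions: the column names "limit", "limit_value" and "no_restriction" are invented here, and whether this layout suits your data is for you to judge.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
main_df = pd.DataFrame({"limit": ["1.5", "No restriction", "3"]})

# Pure float column: cells that are not numbers become NaN.
main_df["limit_value"] = pd.to_numeric(main_df["limit"], errors="coerce")

# Boolean flag recording which rows carried the text marker.
main_df["no_restriction"] = main_df["limit"].str.contains(
    "no restriction", case=False, na=False, regex=False
)

print(main_df.dtypes)
```

With this layout limit_value stays float64, so aggregations such as mean() skip the NaN rows, while no_restriction still tells you which rows had no limit.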

Upvotes: 1
