Whitebeard13

Reputation: 433

Python Pandas: Change type from float to float64 in a column which contains both numerical and string elements

Let's assume the following pandas dataframe:

df = 
       Column1       Column2       Column3     Column4
       2007-01-02    1M            String1     String2 
       2007-01-02    1M            0.051695    0.0057984
       2007-01-02    1M            0.0498056   0.00725827
       2007-01-02    1M            0.0493161   0.00780772
       2007-01-02    1M            0.0492764   0.00810296
       2007-01-02    1M            0.0493988   0.00820139
       2007-01-02    1M            0.0495177   0.00829837
       2007-01-03    1M            String1     String2 
       2007-01-03    1M            0.0516506  0.00589057
       2007-01-03    1M            0.0496136  0.00726748
       2007-01-03    1M            0.0490747  0.00781708
       2007-01-03    1M            0.0490845   0.0081065
       2007-01-03    1M            0.0492069  0.00820219
       2007-01-04    1M            String1    String2     
       2007-01-04    1M            0.0510632  0.00589493
              ...    ...           ...        ...

Column3 and Column4 have dtype object. When I check the type of the second element of Column3 (i.e. 0.051695), I get float. Can I change the numerical elements of Column3 and Column4 from float to float64? I tried the following, but it didn't work:

df[["Column3"]][df["Column3"]!="String1"] = df[["Column3"]][df["Column3"]!="String1"].astype(np.float64)

which gave me the warning

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I also tried:

df.loc[:,"Column3"][df["Column3"]!="String1"] = df.loc[:,"Column3"][df["Column3"]!="String1"].astype(np.float64)

which gave me the warning

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

Upvotes: 0

Views: 664

Answers (1)

jezrael

Reputation: 862641

You can make your approach work by passing the boolean mask to loc together with the column name:

mask = df["Column3"]!="String1"
df.loc[mask,"Column3"] = df.loc[mask,"Column3"].astype(np.float64)
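As a rough, self-contained sketch of the mask approach: the small DataFrame below is a made-up miniature of the question's Column3 (a few of its values plus the marker string), not the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of Column3: marker strings mixed with Python floats.
df = pd.DataFrame(
    {"Column3": ["String1", 0.051695, 0.0498056, "String1", 0.0516506]},
    dtype=object,
)

# Boolean mask selecting the rows that hold numbers rather than the marker string.
mask = df["Column3"] != "String1"

# One .loc call selects rows and column together, so the assignment
# targets df itself instead of a temporary object from chained indexing.
df.loc[mask, "Column3"] = df.loc[mask, "Column3"].astype(np.float64)

print(df["Column3"].dtype)        # the column as a whole is still object
print(df["Column3"].apply(type))  # per-element types
```

Because the column still contains strings, its dtype stays object. A column can only be float64 as a whole once the strings are gone.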

Another idea is to convert every value that can be parsed as a number; non-numeric values become missing values, which are then filled back in from the original column:

df['Column3'] = pd.to_numeric(df['Column3'], errors='coerce').fillna(df['Column3'])
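A minimal sketch of this round trip, on an assumed toy Series rather than the real column. One entry is deliberately a numeric string, to illustrate that to_numeric also parses those, which the mask approach would only do via astype.

```python
import pandas as pd

# Hypothetical sample: marker strings, one numeric string, one Python float.
original = pd.Series(["String1", "0.051695", 0.0498056, "String1"], dtype=object)

# Step 1: everything that parses as a number becomes a float, the rest becomes NaN.
converted = pd.to_numeric(original, errors="coerce")

# Step 2: fill the NaN positions with the original (string) values.
restored = converted.fillna(original)

print(converted)
print(restored.apply(type))
```

Note that this also overwrites any genuinely missing values with whatever the original column held at those positions, which here is the intended effect.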

To check the result, inspect the type of each element:

print(df['Column3'].apply(type))
0       <class 'str'>
1     <class 'float'>
2     <class 'float'>
3     <class 'float'>
4     <class 'float'>
5     <class 'float'>
6     <class 'float'>
7       <class 'str'>
8     <class 'float'>
9     <class 'float'>
10    <class 'float'>
11    <class 'float'>
12    <class 'float'>
13      <class 'str'>
14    <class 'float'>
Name: Column3, dtype: object

In my opinion, the best option is still a purely numeric column, with missing values where the original value is not a number:

df['Column3'] = pd.to_numeric(df['Column3'], errors='coerce')
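To illustrate why a purely numeric column may be preferable, here is a small sketch on assumed sample data (a few values from the question plus the marker string). After coercion the column has a real float64 dtype, and numeric reductions skip the missing values by default.

```python
import pandas as pd

# Hypothetical miniature of Column3 from the question.
df = pd.DataFrame(
    {"Column3": ["String1", 0.051695, 0.0498056, "String1", 0.0516506]},
    dtype=object,
)

# Non-numeric entries become NaN, so the whole column can be float64.
numeric = pd.to_numeric(df["Column3"], errors="coerce")

print(numeric.dtype)         # float64
print(numeric.isna().sum())  # 2 (the two "String1" rows)
print(numeric.mean())        # mean over the numeric rows only; NaN is skipped
```

If the marker rows carry no information of their own, `numeric.dropna()` would remove them entirely.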

Upvotes: 1
