Reputation: 433
Let's assume the following pandas dataframe:
df =
Column1 Column2 Column3 Column4
2007-01-02 1M String1 String2
2007-01-02 1M 0.051695 0.0057984
2007-01-02 1M 0.0498056 0.00725827
2007-01-02 1M 0.0493161 0.00780772
2007-01-02 1M 0.0492764 0.00810296
2007-01-02 1M 0.0493988 0.00820139
2007-01-02 1M 0.0495177 0.00829837
2007-01-03 1M String1 String2
2007-01-03 1M 0.0516506 0.00589057
2007-01-03 1M 0.0496136 0.00726748
2007-01-03 1M 0.0490747 0.00781708
2007-01-03 1M 0.0490845 0.0081065
2007-01-03 1M 0.0492069 0.00820219
2007-01-04 1M String1 String2
2007-01-04 1M 0.0510632 0.00589493
... ... ... ...
Columns Column3 and Column4 have dtype object. When I check the type of the second element of Column3 (i.e. 0.051695) I get float. My question is whether I can change the numerical elements of Column3 and Column4 from float to float64. I tried the following, but it didn't work:
df[["Column3"]][df["Column3"]!="String1"] = df[["Column3"]][df["Column3"]!="String1"].astype(np.float64)
which gave me the warning
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
I also tried:
df.loc[:,"Column3"][df["Column3"]!="String1"] = df.loc[:,"Column3"][df["Column3"]!="String1"].astype(np.float64)
which gave me the warning
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
Upvotes: 0
Views: 664
Reputation: 862641
Your solution can be made to work by passing the mask to the loc function, so that rows and column are selected in a single step:
mask = df["Column3"]!="String1"
df.loc[mask,"Column3"] = df.loc[mask,"Column3"].astype(np.float64)
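As a minimal, self-contained sketch of the mask approach: the small frame below is a hypothetical stand-in for the question's data (the values are illustrative only). Because the column still holds strings, its dtype is expected to remain object after the assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the question: header-like strings mixed with floats.
df = pd.DataFrame({
    "Column1": ["2007-01-02"] * 3 + ["2007-01-03"] * 2,
    "Column3": ["String1", 0.051695, 0.0498056, "String1", 0.0516506],
})

# Build the boolean mask once, then select rows and column in one .loc call.
# This avoids the chained indexing that triggers SettingWithCopyWarning.
mask = df["Column3"] != "String1"
df.loc[mask, "Column3"] = df.loc[mask, "Column3"].astype(np.float64)

# Inspect the column dtype and the untouched string rows.
print(df["Column3"].dtype)
print(df.loc[~mask, "Column3"].tolist())
```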
Another idea is to convert every value that can be parsed to numeric; non-numeric values become missing values, which are then replaced by the originals:
df['Column3'] = pd.to_numeric(df['Column3'], errors='coerce').fillna(df['Column3'])
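To see the two steps separately, here is a small sketch on a hypothetical Series (the values, including the numeric string, are made up for illustration). It assumes the column may also contain numbers stored as text, which this approach would parse as well.

```python
import pandas as pd

# Hypothetical object column: a header-like string, a real float, and a numeric string.
s = pd.Series(["String1", 0.051695, "0.0498056"], dtype=object)

# Step 1: anything that cannot be parsed as a number becomes NaN.
numeric = pd.to_numeric(s, errors="coerce")

# Step 2: put the original value back wherever parsing failed.
restored = numeric.fillna(s)

print(numeric.isna().tolist())
print(restored.tolist())
```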
To test, you can check the type of each element:
print (df['Column3'].apply(type))
0 <class 'str'>
1 <class 'float'>
2 <class 'float'>
3 <class 'float'>
4 <class 'float'>
5 <class 'float'>
6 <class 'float'>
7 <class 'str'>
8 <class 'float'>
9 <class 'float'>
10 <class 'float'>
11 <class 'float'>
12 <class 'float'>
13 <class 'str'>
14 <class 'float'>
Name: Column3, dtype: object
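For a longer column, a per-type count may be easier to read than the full listing. The sketch below uses a hypothetical Series with illustrative values. It also checks that np.float64 is a subclass of Python's float, so an isinstance check against float accepts both.

```python
import numpy as np
import pandas as pd

# Hypothetical object column with header-like strings between the numbers.
s = pd.Series(["String1", 0.051695, 0.0498056, "String1", 0.0516506], dtype=object)

# Map each element to the name of its Python type, then count the names.
counts = s.map(lambda v: type(v).__name__).value_counts()
print(counts)

# np.float64 inherits from float, so both pass isinstance(..., float).
print(issubclass(np.float64, float))
```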
In my opinion, the best option is still a purely numeric column, with missing values in place of the non-matching values:
df['Column3'] = pd.to_numeric(df['Column3'], errors='coerce')
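A short sketch of this last variant on a hypothetical frame (values are illustrative): once the strings are coerced to NaN, the whole column can take a true float64 dtype.

```python
import pandas as pd

# Hypothetical sample: header-like strings mixed with floats in one column.
df = pd.DataFrame({"Column3": ["String1", 0.051695, 0.0498056, "String1", 0.0516506]})

# Strings cannot be parsed, so they become NaN and the column turns numeric.
df["Column3"] = pd.to_numeric(df["Column3"], errors="coerce")

print(df["Column3"].dtype)
print(df["Column3"].isna().sum())
```

If the header-like rows are not needed afterwards, they could then be dropped with df.dropna(subset=["Column3"]).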
Upvotes: 1