aurora

Reputation: 77

Assignment in .loc loop in Pandas

Suppose we have the following dataframe.

import pandas as pd
df = pd.DataFrame([['1', '2'], ['3', '4']])

If we look at dtypes, all the numbers are objects:

df.dtypes
0    object
1    object
dtype: object

I want to have them as int or float. So if I do

df_apply = df.apply(pd.to_numeric)

I get

df_apply.dtypes
0    int64
1    int64
dtype: object

Same if I do

df_astype_int = df.astype(int)
df_astype_int.dtypes
0    int64
1    int64
dtype: object

Or

df_astype_float = df.astype(float)
df_astype_float.dtypes
0    float64
1    float64
dtype: object

Or

df_loop = df.copy()
for i in df_loop:
    df_loop[i] = pd.to_numeric(df_loop[i])
df_loop.dtypes
0    int64
1    int64
dtype: object

But if I do

df_loop = df.copy()
for index in df_loop.index:
    df_loop.loc[index,:] = pd.to_numeric(df_loop.loc[index,:])
df_loop.dtypes
0    object
1    object
dtype: object

I get objects again. Why is that? It doesn't raise any errors, but doesn't convert either.

Upvotes: 1

Views: 252

Answers (1)

Henry Yik

Reputation: 22493

In this loop:

for index in df_loop.index:
    df_loop.loc[index,:] = pd.to_numeric(df_loop.loc[index,:])

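Each iteration assigns a single row, a Series of dtype int64, into a DataFrame whose columns are still object. pandas stores data column by column, so every column needs a dtype that can hold all of its values. The result is upcast to object, which is the common NumPy dtype of the types involved. For more detail, see numpy.find_common_type (deprecated in NumPy 1.25 in favor of numpy.result_type and numpy.promote_types).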

Converting the whole DataFrame or an entire column at once works fine, as your other examples show, because the column is replaced as a whole instead of being filled in row by row.
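To see this in practice, here is a minimal sketch that repeats the row-wise loop and then re-infers the dtypes afterwards. The explicit dtype=object and the use of infer_objects() are my own additions for illustration, not part of the question, and I have not checked this against every pandas version.

```python
import numpy as np
import pandas as pd

# dtype=object is passed explicitly (an assumption for this sketch) so the
# example does not depend on the default string dtype of the installed pandas.
df = pd.DataFrame([["1", "2"], ["3", "4"]], dtype=object)

df_loop = df.copy()
for index in df_loop.index:
    # Writes one converted row into columns that are still object dtype.
    df_loop.loc[index, :] = pd.to_numeric(df_loop.loc[index, :])

# The columns keep their object dtype even though the values are now numbers.
print(df_loop.dtypes)

# NumPy's common dtype for int64 and object.
print(np.result_type(np.int64, object))

# Once every row holds a number, the dtypes can be re-inferred column-wise.
df_fixed = df_loop.infer_objects()
print(df_fixed.dtypes)
```

If you really need to go row by row, calling infer_objects() (or astype(int)) once after the loop is one way to end up with numeric columns. Converting per column, as in your earlier examples, avoids the extra step.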

Upvotes: 1
