Reputation: 3801
I am trying to convert specific columns in my DataFrame to dtype: float. I tried this:
grid[['DISTINCT_COUNT','MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT' ]].apply(pd.to_numeric, errors='ignore')
But when I print this afterwards:
print(grid.dtypes)
I am still seeing this:
COLUMN_NM object
DISTINCT_COUNT object
NULL_COUNT object
MAX_COL_VALUE object
MIN_COL_VALUE object
MAX_COL_LENGTH object
MIN_COL_LENGTH object
TABLE_CNT object
TABLE_NM object
DATA_SOURCE object
dtype: object
Any ideas?
Upvotes: 0
Views: 382
Reputation: 43504
Using apply() does not modify the DataFrame in place. You have to assign the output of the operation back to the original DataFrame.
@coldspeed's answer here explains what's going on:
All these slicing/indexing operations create views/copies of the original dataframe and you then reassign df to these views/copies, meaning the originals are not touched at all.
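To make that concrete, here is a minimal sketch using made-up sample data (the values are hypothetical; only the column names come from the question). It shows that the result of apply() is a separate object and that grid keeps its original dtypes until you assign the result back:

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings, forced to object dtype
grid = pd.DataFrame(
    {"DISTINCT_COUNT": ["1", "2", "3"], "NULL_COUNT": ["0", "5", "7"]},
    dtype=object,
)

# apply() builds and returns a new DataFrame; grid itself is left untouched
converted = grid[["DISTINCT_COUNT", "NULL_COUNT"]].apply(pd.to_numeric)

print(grid.dtypes)       # original columns, still object
print(converted.dtypes)  # the returned copy holds the numeric columns
```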
In your case, you need to do:
columns = ['DISTINCT_COUNT','MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT']
grid[columns] = grid[columns].apply(pd.to_numeric, errors='ignore')
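Note that pd.to_numeric only accepts one-dimensional input (a scalar, list, tuple, 1-d array, or Series), so passing grid[columns] to it directly raises a TypeError. If you want to avoid apply(), convert one column at a time:
for col in columns:
    grid[col] = pd.to_numeric(grid[col], errors='ignore')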
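If you specifically want float columns, as the question asks, one possible variant is sketched below. It rests on assumptions not in the original post: the sample values are invented, and it swaps errors='ignore' for errors='coerce', which turns unparseable values into NaN instead of leaving the column unchanged. As far as I know, errors='ignore' is also deprecated from pandas 2.2 onward, so 'coerce' may be the safer choice on newer versions.

```python
import pandas as pd

# Hypothetical sample data; "n/a" stands in for a value that cannot be parsed
grid = pd.DataFrame(
    {
        "COLUMN_NM": ["id", "name"],
        "DISTINCT_COUNT": ["10", "7"],
        "NULL_COUNT": ["0", "n/a"],
    },
    dtype=object,
)
columns = ["DISTINCT_COUNT", "NULL_COUNT"]

# Convert, coerce bad values to NaN, force float, and assign the result back
grid[columns] = grid[columns].apply(pd.to_numeric, errors="coerce").astype(float)

print(grid.dtypes)
```

With 'coerce' you can check grid[columns].isna() afterwards to see which values failed to parse, rather than silently ending up with an unconverted object column.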
Upvotes: 2