Reputation: 259
I'm wondering what I'm doing wrong when applying pd.to_numeric to multiple columns in a DataFrame.
df_weather = pd.read_csv('https://raw.githubusercontent.com/MichalLeh/Edinburgh-bikes-project/main/edinburgh_weather.csv')  # or a local copy: "J:/edinburgh_weather.csv"
Sample of dataframe:
    time   temp  feels           wind     gust    rain  humidity  cloud  pressure        vis        date
0  00:00  11 °c  11 °c  9 km/h from S  19 km/h  0.0 mm       79%    13%   1020 mb  Excellent  2018-09-01
First I get rid of unwanted characters:
df_weather = (df_weather[['time', 'date', 'temp', 'feels', 'wind', 'gust', 'rain', 'humidity', 'cloud', 'pressure']]
              .replace(to_replace=r'[^0-9:.\-]', value='', regex=True))
And then I apply to_numeric:
df_weather[['temp', 'feels', 'wind', 'gust', 'rain', 'humidity', 'cloud', 'pressure']].apply(lambda x: pd.to_numeric(x, errors='coerce'))
df_weather.info()
I'm not getting any errors and yet the result looks like this:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6336 entries, 0 to 6335
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 time 6336 non-null object
1 temp 6336 non-null object
2 feels 6336 non-null object
3 wind 6336 non-null object
4 gust 6336 non-null object
5 rain 6336 non-null object
6 humidity 6336 non-null object
7 cloud 6336 non-null object
8 pressure 6336 non-null object
9 vis 6336 non-null object
10 date 6336 non-null object
dtypes: object(11)
memory usage: 544.6+ KB
BTW, pd.to_numeric works when I convert the columns one by one. I'd love to be able to convert them all at the same time. Thank you.
Upvotes: 1
Views: 1400
Reputation: 862511
You need to assign the converted columns back. apply returns a new DataFrame and does not modify df_weather in place:
cols = ['temp', 'feels', 'wind', 'gust', 'rain', 'humidity', 'cloud', 'pressure']
df_weather[cols] = df_weather[cols].apply(lambda x: pd.to_numeric(x, errors='coerce'))
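To see the difference without downloading the CSV, here is a minimal sketch on a made-up two-row frame. The sample values and the names df, before and converted are hypothetical and only stand in for the cleaned weather data; the empty string is there to show what errors='coerce' style conversion leaves behind as NaN.

```python
import pandas as pd

# Hypothetical sample standing in for the cleaned weather data (all strings).
df = pd.DataFrame({
    "time": ["00:00", "03:00"],
    "temp": ["11", "13"],
    "rain": ["0.0", "0.2"],
    "humidity": ["79", ""],
})
cols = ["temp", "rain", "humidity"]

# apply builds a new DataFrame; df itself is untouched at this point.
converted = df[cols].apply(pd.to_numeric, errors="coerce")
before = df[cols].dtypes  # still the original string dtypes

# Assigning the result back is what actually changes the columns.
df[cols] = converted

print(before)
print(df.dtypes)
```

Passing pd.to_numeric directly with errors="coerce" as a keyword is equivalent to the lambda above; apply forwards extra keyword arguments to the function.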
Upvotes: 2