Reputation: 1356
Apparently there is a column in my pandas DataFrame, ncData, that is not numeric, because the calculations I'm trying to do give me an error like "no numeric data to aggregate". I have read the documentation at https://appdividend.com/2020/06/18/pandas-to_numeric-function-in-python-example/ and tried a few things, but nothing has worked yet. My data looks like this (it is a pandas DataFrame):
    plant_name  business_name  maint_region_name  wind_speed_ms  dataset  year  month  day  hour
0  RIO DO FOGO         BRAZIL             BRAZIL           7.88     ERA5  2021      5   31    21
1  RIO DO FOGO         BRAZIL             BRAZIL           7.95     ERA5  2021      5   31    20
2  RIO DO FOGO         BRAZIL             BRAZIL           7.72     ERA5  2021      5   31    19
The column "wind_speed_ms" apparently is not numeric. How do I change this column to numeric? I've tried this, but how do I assign the result of the conversion back to the original df?
pd.to_numeric(ncData.wind_speed_ms, errors='raise', downcast=None)
Should I assign the output to a new name and then re-insert it into "ncData"? Thank you.
Upvotes: 1
Views: 354
Reputation: 643
One common reason for pandas not recognizing a column as numeric when you expect it to be is number formatting. For example, commas in a series of numbers can cause pandas to read the series as object dtype instead of a numeric one.
If your data set isn't too large, one way to look for characters that might be causing your series to be registered as an object is to use
df['wind_speed_ms'].unique().tolist()
and then scan the list for commas, parens, etc.
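If the column is too long to scan by eye, another option is to let pandas flag the values it cannot parse. This is only a minimal sketch: the DataFrame below is made-up sample data (the "7,95" and "n/a" entries are assumptions for illustration, not values from the question). It relies on pd.to_numeric with errors="coerce", which returns NaN for anything that fails to convert.

```python
import pandas as pd

# Hypothetical sample data: one value uses a decimal comma, one is a placeholder string
df = pd.DataFrame({"wind_speed_ms": ["7.88", "7,95", "7.72", "n/a"]})

# errors="coerce" turns anything that cannot be parsed into NaN instead of raising
parsed = pd.to_numeric(df["wind_speed_ms"], errors="coerce")

# Values that were present originally but failed to parse are the problem values
bad_values = df.loc[parsed.isna() & df["wind_speed_ms"].notna(), "wind_speed_ms"]
print(bad_values.unique().tolist())
```

With the sample data above, only the two malformed entries are listed, which narrows down which characters need cleaning.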
Once you identify the character that is causing issues, you can replace it with a valid character or remove it altogether.
You could do something like this:
df['wind_speed_ms'] = df['wind_speed_ms'].apply(lambda x: x.replace(',', '.'))
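Note that the replacement on its own still leaves strings in the column. To end up with a numeric dtype, the cleaned values can be passed through pd.to_numeric and assigned back to the same column, which also answers the "how do I assign it back" part of the question. A minimal sketch, assuming made-up sample data where decimal commas are the only problem:

```python
import pandas as pd

# Hypothetical sample data with decimal commas stored as strings
df = pd.DataFrame({"wind_speed_ms": ["7,88", "7,95", "7,72"]})

# Swap the decimal comma for a dot (regex=False treats "," as a literal character)
cleaned = df["wind_speed_ms"].str.replace(",", ".", regex=False)

# Convert to a numeric dtype and assign the result back to the same column
df["wind_speed_ms"] = pd.to_numeric(cleaned, errors="raise")

print(df["wind_speed_ms"].dtype)
print(df["wind_speed_ms"].mean())
```

Keeping errors="raise" means any value that is still malformed stops the conversion with an error rather than silently becoming NaN; switch to errors="coerce" if you would rather get NaN for those rows.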
Also, btw: I am the creator of a Python package called mitosheet, a spreadsheet extension for Jupyter that converts all of your edits into valid Python code. I actually just implemented a feature that lets you convert the type of your columns with the click of a button.
Upvotes: 1