Reputation: 173
I have a large Excel sheet that I import using pandas. I need to calculate things like the standard deviation. When I convert the data to a NumPy array, the string values come along too. Is there a way to get a NumPy array that contains only float values?
import pandas as pd
import numpy as ny
a = pd.read_excel('Prior Example.xlsm', 'Security Levels Raw')
c = a.to_numpy()
d = ny.std(c)
Upvotes: 0
Views: 179
Reputation: 59549
You can use the converters argument of pd.read_csv (it also exists for pd.read_excel). Though really I'd just convert afterwards:
test.csv
number1,number2
1,foo
2,bar
3,4
1,4
import pandas as pd
def convert_numbers(s):
    # Values that cannot be parsed as numbers become NaN
    return pd.to_numeric(s, errors='coerce')
df = pd.read_csv('test.csv', converters={'number2': convert_numbers})
display(df)
df.dtypes
#    number1  number2
# 0        1      NaN
# 1        2      NaN
# 2        3      4.0
# 3        1      4.0
#
# number1      int64
# number2    float64
# dtype: object
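For the "convert afterwards" route mentioned above, a minimal sketch might look like the following. The small DataFrame is a made-up stand-in for the sheet you load with pd.read_excel, and using np.nanstd is an assumption about how you want missing values treated (here they are simply ignored).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_excel
df = pd.DataFrame({
    "number1": [1, 2, 3, 1],
    "number2": ["foo", "bar", "4", "4"],
})

# Coerce every column to numeric; cells that cannot be parsed become NaN
numeric = df.apply(pd.to_numeric, errors="coerce")

# The resulting array holds floats only (NaN where a string used to be)
values = numeric.to_numpy(dtype=float)

# NaN-aware standard deviation per column
col_std = np.nanstd(values, axis=0)
print(col_std)
```

If entire columns are text rather than a few stray cells, df.select_dtypes(include='number') would drop those columns instead of turning them into NaN.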
Upvotes: 2