Reputation: 327
I have a CSV file and I need to compute the mean of some of its columns. This is what I did:
import csv
file = csv.reader(open('tab.csv', 'r'))
n = []
for row in file:
    n.append(row[8])
So I have a list of strings: n = ['', '', '1.58', ...].
How can I convert these to float?
I tried with:
import numpy as np
n_values = np.array(n)
n_values[n_values == ''] = '0'  # compare the array, not the list n
values = n_values.astype(float)  # np.float was removed in NumPy 1.24
np.mean(values)
But the mean is not correct, because the empty strings should be skipped rather than counted as zeros.
Upvotes: 18
Views: 51006
Reputation: 400
Just add quoting=csv.QUOTE_NONNUMERIC. From the csv module documentation:
"Instructs reader objects to convert all non-quoted fields to type float."
import csv
with open('tab.csv', newline='') as file:
    reader = csv.reader(file, quoting=csv.QUOTE_NONNUMERIC)
    n = []
    for row in reader:
        n.append(row[8])
Upvotes: 20
Reputation: 180411
Just cast as you append:
n.append(float(row[8]))
If there are empty strings, float() raises a ValueError for them, so catch it inside the loop and skip those rows:
try:
    n.append(float(row[8]))
except ValueError:
    continue
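For completeness, here is a minimal end-to-end sketch of the skip-on-error approach, including the mean itself. The sample data and the column_mean helper are made up for illustration (they stand in for tab.csv and column 8), so adjust the index and input to your file.

```python
import csv
import io

# Hypothetical sample standing in for tab.csv: column 1 is empty in two rows.
sample = "a,\nb,\nc,1.5\nd,2.5\n"

def column_mean(lines, index):
    """Mean of the numeric entries in one column, skipping fields float() rejects."""
    values = []
    for row in csv.reader(lines):
        try:
            values.append(float(row[index]))
        except (ValueError, IndexError):
            continue  # empty or non-numeric field, or a row that is too short
    # Return NaN rather than dividing by zero when nothing was numeric.
    return sum(values) / len(values) if values else float("nan")

print(column_mean(io.StringIO(sample), 1))
```

With a real file you would pass the open file object instead of the StringIO, for example column_mean(open('tab.csv', newline=''), 8). Because the empty fields never reach the list, they do not affect the divisor.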
Or you might want to try pandas, in particular pandas.read_csv:
import pandas as pd
df = pd.read_csv("in.csv")
print(df["col_name"].mean())
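The reason pandas fits this question is that read_csv parses empty fields as NaN by default, and Series.mean() skips NaN values by default. A small sketch with made-up data (the column names "name" and "score" are hypothetical):

```python
import io
import pandas as pd

# Hypothetical sample: the "score" column is empty in two of four rows.
sample = "name,score\na,\nb,\nc,1.5\nd,2.5\n"

df = pd.read_csv(io.StringIO(sample))

# Empty fields were read as NaN, so the column is numeric already.
missing = int(df["score"].isna().sum())

# mean() ignores NaN by default (skipna=True), so only 1.5 and 2.5 count.
score_mean = df["score"].mean()

print(missing, score_mean)
```

If the file has no header row, passing header=None and selecting by position, for example df.iloc[:, 8].mean(), should work the same way.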
Upvotes: 17