Reputation: 2750
I am new to Python and am using NumPy to read a CSV file into an array. I tried two approaches:
Approach 1
train = np.asarray(np.genfromtxt(open("/Users/mac/train.csv","rb"),delimiter=","))
Approach 2
with open('/Users/mac/train.csv') as csvfile:
    rows = csv.reader(csvfile)
    for row in rows:
        newrow = np.array(row).astype(np.int)
        train.append(newrow)
What is the difference between these two approaches, and which one is recommended?
I am not concerned about speed, since my data is small. I am more concerned about differences in the resulting data type.
Upvotes: 0
Views: 4219
Reputation: 1292
You can also use pandas, which is simpler to use for this.
import pandas as pd
import numpy as np
dataset = pd.read_csv('file.csv')
# get all headers in csv
values = list(dataset.columns.values)
# get the labels, assuming the last column holds the labels in the csv
y = dataset[values[-1:]]
y = np.array(y, dtype='float32')
X = dataset[values[0:-1]]
X = np.array(X, dtype='float32')
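To see what shapes this produces, here is a minimal sketch that runs the same steps on a small in-memory CSV. The sample data and the column names (f1, f2, label) are made up for illustration, and it assumes the file has a header row and the labels are in the last column.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for file.csv: header row, labels in the last column.
SAMPLE = "f1,f2,label\n1,2,0\n3,4,1\n"

dataset = pd.read_csv(io.StringIO(SAMPLE))
values = list(dataset.columns.values)

# Selecting with a list of column names keeps a 2-D DataFrame,
# so y comes out as a column vector of shape (n_rows, 1).
y = np.array(dataset[values[-1:]], dtype="float32")
X = np.array(dataset[values[0:-1]], dtype="float32")
print(X.shape, y.shape)

# Selecting a single column name gives a Series, which converts to a 1-D array.
y_flat = dataset[values[-1]].to_numpy(dtype="float32")
print(y_flat.shape)
```

Note that y is 2-D here because values[-1:] is a one-element list. If you need a flat label vector, selecting the single column name as in y_flat is one way to get it.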
Upvotes: 2
Reputation: 231325
So what is the difference in the result?
genfromtxt is the numpy csv reader. It returns an array, so there is no need for an extra asarray.
The second snippet is incomplete, but it looks like it would produce a list of arrays, one for each line of the file. It uses the generic Python csv reader, which does little more than read a line and split it into strings.
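A small sketch that makes the difference visible, using an in-memory sample instead of the real file. The sample data is made up and assumes an all-integer CSV without a header. The train list from the question is initialised explicitly here under a different name, and the builtin int replaces np.int, which was removed in NumPy 1.24.

```python
import csv
import io

import numpy as np

# Hypothetical sample standing in for train.csv (all integers, no header).
SAMPLE = "1,2,3\n4,5,6\n"

# Approach 1: genfromtxt returns a single 2-D array.
# Without a dtype argument the values are parsed as floats.
arr_genfromtxt = np.genfromtxt(io.StringIO(SAMPLE), delimiter=",")

# Approach 2: csv.reader yields each row as a list of strings,
# and every row is converted to its own integer array.
rows_as_arrays = []
for row in csv.reader(io.StringIO(SAMPLE)):
    rows_as_arrays.append(np.array(row).astype(int))

# The list can be turned into one 2-D integer array afterwards.
arr_csv = np.array(rows_as_arrays)

print(type(arr_genfromtxt), arr_genfromtxt.dtype, arr_genfromtxt.shape)
print(type(rows_as_arrays), len(rows_as_arrays), rows_as_arrays[0].dtype)
print(arr_csv.dtype, arr_csv.shape)
```

So the first approach gives a float array directly, while the second gives a Python list of integer arrays that still has to be combined. If you want integers from genfromtxt, passing dtype=int should do it.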
Upvotes: 1