Ricky

Reputation: 2750

numpy reading a csv file to a numpy array

I am new to Python and am using numpy to read a CSV file into an array. I tried two methods:

Approach 1

train = np.asarray(np.genfromtxt(open("/Users/mac/train.csv","rb"),delimiter=","))

Approach 2

with open('/Users/mac/train.csv') as csvfile:
        rows = csv.reader(csvfile)
        for row in rows:
            newrow = np.array(row).astype(np.int)
            train.append(newrow)

I am not sure what the difference between these two approaches is. Which one is recommended?

I am not concerned about which is faster, since my data size is small; I am more concerned about differences in the resulting data type.

Upvotes: 0

Views: 4219

Answers (2)

Rachit Tayal

Reputation: 1292

You can also use pandas; it is better and simpler to use.

import pandas as pd
import numpy as np

dataset = pd.read_csv('file.csv')
# get all headers in csv
values = list(dataset.columns.values)

# get the labels, assuming the last column holds the labels in the csv
y = dataset[values[-1:]]
y = np.array(y, dtype='float32')
X = dataset[values[0:-1]]
X = np.array(X, dtype='float32')

Upvotes: 2

hpaulj

Reputation: 231325

So what is the difference in the result?

genfromtxt is the numpy csv reader. It returns an array. No need for an extra asarray.

The second expression is incomplete, but it looks like it would produce a list of arrays, one for each line of the file. It uses the generic Python csv reader, which doesn't do much other than read a line and split it into strings.
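To see the difference in the resulting types, here is a minimal sketch that runs both approaches on a small in-memory CSV. The `csv_text` sample data is made up for illustration and stands in for train.csv (assumed here to be all-integer with no header row); `astype(int)` is used in place of `np.int`, which recent NumPy releases no longer provide.

```python
import csv
import io

import numpy as np

# Hypothetical sample data standing in for train.csv (integers, no header).
csv_text = "1,2,3\n4,5,6\n"

# Approach 1: genfromtxt returns one 2-D array; the default dtype is float64.
arr1 = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# genfromtxt can also be asked for integers directly via the dtype argument.
arr1_int = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=int)

# Approach 2: csv.reader yields lists of strings; converting row by row
# builds a Python list of 1-D integer arrays.
train = []
for row in csv.reader(io.StringIO(csv_text)):
    train.append(np.array(row).astype(int))

# Stacking the list gives a 2-D integer array comparable to arr1.
arr2 = np.array(train)

print(type(arr1), arr1.dtype, arr1.shape)
print(type(train), arr2.dtype, arr2.shape)
```

So with data like this, the first approach gives a float array unless a `dtype` is passed, while the second gives a list of integer arrays that still has to be combined into a single array.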

Upvotes: 1
