Reputation: 2667
I have several text files with mixed data types in different columns. I want to read them so that the program detects each column's type automatically, because I do not know in advance which column contains which type.
When I read purely numeric data I used the following, but it fails for mixed data types.
import numpy as np
import csv
train = np.array(list(csv.reader(open(self.source_data_file, "rb"), delimiter=','))).astype('float')
Upvotes: 1
Views: 5150
Reputation: 468
Have a look at numpy.genfromtxt:
http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.genfromtxt.html
You can read files directly by specifying the delimiter and dtype; with dtype=None, NumPy infers the type of each column from its contents. Suppose you have a line in a CSV file that looks like this:
10,120.3,xfghfh
You can do the following :
import numpy as np
data = np.genfromtxt('input_file', dtype=None, delimiter=",")
print(data)
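which will give you a structured array along these lines (the column names shown are placeholders; unless you pass names, NumPy calls the fields f0, f1, f2, and the exact integer and string dtypes depend on your platform and NumPy version):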
data = array((10, 120.3, 'xfghfh'),
dtype=[('column_name1', '<i4'), ('column_name2', '<f8'), ('column_name3', 'S6')])
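If you want to try this without a file on disk, here is a minimal sketch. It assumes NumPy 1.14 or later (for the encoding argument), and the two-row sample is made up for illustration; it just extends the example line above with a second hypothetical row.

```python
import io
import numpy as np

# Hypothetical in-memory sample standing in for one of the text files.
sample = io.StringIO("10,120.3,xfghfh\n11,98.6,abc\n")

# dtype=None asks genfromtxt to infer a type for each column separately.
# encoding="utf-8" makes string columns come back as str instead of bytes.
data = np.genfromtxt(sample, dtype=None, delimiter=",", encoding="utf-8")

# Without a names argument the fields are called f0, f1, f2, ...
for name in data.dtype.names:
    print(name, data.dtype[name], data[name])
```

Each column can then be pulled out by field name, for example data['f1'] for the float column, and data.dtype tells you which type was detected for which column.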
Upvotes: 4