Reputation: 983
I am trying to load a .csv file into an array. However, the file looks something like this.
"myfilename",0.034353453,-1.234556,-3,45671234
,1.43567896, -1.45322124, 9.543422
.................................
.................................
I am trying to skip only the leading string. Until now I have worked around it by dropping the first row entirely:
a = np.genfromtxt(file, delimiter=',', skip_header=1)
Is there a way to read the file into an array while ignoring just the string at the beginning, so that the numbers in the first row are kept?
Upvotes: 1
Views: 1589
Reputation: 430
Can you just use loadtxt(..., usecols=(1,2,3), ...)? That way you don't have to skip the line at the start of the file.
The usecols argument tells loadtxt which columns to extract, so you can select only the numeric ones.
# Put data into file (in shell, just me copying the sample)
cat > /tmp/data.csv <<'EOF'
"myfilename",0.034353453,-1.234556,-3,45671234
,1.43567896, -1.45322124, 9.543422
EOF
# In IPython
In [1]: import numpy as np
In [2]: a = np.loadtxt('/tmp/data.csv', usecols=(1,2,3), delimiter=',')
In [3]: a
Out[3]:
array([[ 0.03435345, -1.234556  , -3.        ],
       [ 1.43567896, -1.45322124,  9.543422  ]])
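The same column selection should also work with the genfromtxt call from the question, since genfromtxt takes a usecols argument as well. A minimal sketch, assuming the two sample rows are held in an in-memory buffer (io.StringIO, my choice for illustration) instead of a file on disk:

```python
import io
import numpy as np

# The two sample rows from the question, held in memory instead of a file.
sample = io.StringIO(
    '"myfilename",0.034353453,-1.234556,-3,45671234\n'
    ',1.43567896, -1.45322124, 9.543422\n'
)

# usecols=(1, 2, 3) skips column 0, so the quoted string is never parsed.
a = np.genfromtxt(sample, delimiter=',', usecols=(1, 2, 3))
print(a)
```

Passing a filename in place of the buffer should behave the same way.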
Upvotes: 2
Reputation: 309899
Since the string only appears on the first line of the file, you could write a helper generator that strips it before numpy sees the data:
def helper(filename):
    with open(filename) as fin:
        # this could get more robust ... e.g. by doing typechecking if necessary.
        line = next(fin).split(',')
        yield ','.join(line[1:])
        for line in fin:
            yield line

arr = np.genfromtxt(helper('myfile.csv'), delimiter=',')
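The type-checking idea mentioned in the comment could look roughly like the sketch below. It drops the first field of a line only when that field does not parse as a float. The names is_number and drop_leading_label and the sample rows are made up for illustration, and the rows are assumed to have the same number of numeric fields once the leading field is removed.

```python
import numpy as np

def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True

def drop_leading_label(lines, delimiter=","):
    """Yield each line without its first field when that field is not numeric."""
    for line in lines:
        first, sep, rest = line.partition(delimiter)
        # Keep the line untouched if it starts with a number or has no delimiter.
        yield line if (is_number(first) or not sep) else rest

# Hypothetical sample rows: a quoted label on the first row, an empty
# leading field on the second.
sample = ['"myfilename",1.5,-2.0,3.25\n', ',4.0, 5.5, -6.0\n']
arr = np.genfromtxt(drop_leading_label(sample), delimiter=",")
print(arr)
```

Because the generator takes any iterable of lines, an open file object can be passed in place of the list.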
Upvotes: 0