Reputation: 65
I'm trying to sum some values in a list, so I loaded the .dat file that contains them, but the only way I have found to make Python sum the data is to separate it with ','. This is what I get:
altura = np.loadtxt("bio.dat",delimiter=',',usecols=(5,),dtype='float')
File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 846, in loadtxt
vals = [vals[i] for i in usecols]
IndexError: list index out of range
This is my code
import numpy as np
altura = np.loadtxt("bio.dat",delimiter=',',usecols=(5,),dtype='str')
print altura
And this is the file 'bio.dat'
1 Katherine Oquendo M 18 1.50 50
2 Pablo Restrepo H 20 1.83 79
3 Ana Agudelo M 18 1.58 45
4 Kevin Vargas H 20 1.74 80
5 Pablo madrid H 20 1.70 55
What I intend to do is
x=sum(altura)
What should I do about the separator?
Upvotes: 0
Views: 4150
Reputation: 5263
In my case, some lines contained a # character. numpy then ignores the rest of that line, because # marks a comment. So try again with the comments parameter, like
altura = np.loadtxt("bio.dat",delimiter=',',usecols=(5,),dtype='str',comments='')
I also recommend not using np.loadtxt, because it is incredibly slow if you must process a large file (more than 1M lines).
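If loadtxt does turn out to be too slow, one possible alternative is to parse the column by hand. This is only a minimal Python 3 sketch, not a benchmarked replacement: the helper name sum_column is made up for illustration, and the two inline lines stand in for bio.dat.

```python
import io


def sum_column(lines, col):
    """Sum one whitespace-separated column, skipping blank lines."""
    total = 0.0
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if fields:
            total += float(fields[col])
    return total


# Inline sample standing in for bio.dat (assumption: whitespace-separated).
# With a real file, pass an open file object instead of the StringIO.
sample = io.StringIO(
    "1 Katherine Oquendo M 18 1.50 50\n"
    "2 Pablo Restrepo H 20 1.83 79\n"
)
total = sum_column(sample, 5)
print(total)
```

Because it iterates line by line, this never holds the whole file in memory.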
Upvotes: 1
Reputation: 231325
The file doesn't need to be comma separated. Here's my sample run, using StringIO to simulate a file. I assume you want to sum the numbers that look like a person's height (in meters).
In [17]: from StringIO import StringIO
In [18]: s="""\
1 Katherine Oquendo M 18 1.50 50
2 Pablo Restrepo H 20 1.83 79
3 Ana Agudelo M 18 1.58 45
4 Kevin Vargas H 20 1.74 80
5 Pablo madrid H 20 1.70 55
"""
In [19]: S=StringIO(s)
In [20]: data=np.loadtxt(S,dtype=float,usecols=(5,))
In [21]: data
Out[21]: array([ 1.5 , 1.83, 1.58, 1.74, 1.7 ])
In [22]: np.sum(data)
Out[22]: 8.3499999999999996
As a script (with the data in a .txt file):
import numpy as np
fname = 'stack25828405.txt'
data=np.loadtxt(fname,dtype=float,usecols=(5,))
print data
print np.sum(data)
2119:~/mypy$ python2.7 stack25828405.py
[ 1.5 1.83 1.58 1.74 1.7 ]
8.35
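To see why delimiter=',' led to the IndexError in the question, it may help to mimic the field splitting with plain str.split. This is a rough Python 3 illustration, not numpy's actual code: a line with no commas yields a single field, so index 5 does not exist, while a whitespace split yields seven fields.

```python
line = "1 Katherine Oquendo M 18 1.50 50"

by_comma = line.split(",")  # no commas in the line, so only one field
by_space = line.split()     # whitespace split gives seven fields

print(len(by_comma), len(by_space))

try:
    by_comma[5]  # same kind of lookup that usecols=(5,) triggers
except IndexError as exc:
    error = str(exc)
    print("IndexError:", error)

height = float(by_space[5])  # the height column
print(height)
```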
Upvotes: 0
Reputation: 560
Alternatively, you can convert the tab-delimited file to CSV first. The csv module supports tab-delimited files. Supply the delimiter argument to reader:
import csv
txt_file = r"mytxt.txt"
csv_file = r"mycsv.csv"
# use 'with' if the program isn't going to immediately terminate
# so you don't leave files open
# the 'b' is necessary on Windows
# it prevents \x1a, Ctrl-z, from ending the stream prematurely
# and also stops Python converting to / from different line terminators
# On other platforms, it has no effect
in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
out_csv.writerows(in_txt)
This answer is not my work; it is the work of agf found at https://stackoverflow.com/a/10220428/3767980.
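The snippet above is written for Python 2. On Python 3 the csv module expects files opened in text mode with newline='' rather than "rb"/"wb", so a rough equivalent might look like the sketch below. The helper name tab_to_csv and the throwaway demo file are assumptions added for illustration.

```python
import csv
import os
import tempfile


def tab_to_csv(txt_path, csv_path):
    """Copy a tab-delimited file to a comma-separated one."""
    # Text mode with newline="" lets the csv module handle line endings itself.
    with open(txt_path, newline="") as src, \
            open(csv_path, "w", newline="") as dst:
        csv.writer(dst).writerows(csv.reader(src, delimiter="\t"))


# Demo on a throwaway file; the file names are placeholders.
workdir = tempfile.mkdtemp()
txt_file = os.path.join(workdir, "mytxt.txt")
csv_file = os.path.join(workdir, "mycsv.csv")
with open(txt_file, "w", newline="") as f:
    f.write("1\tKatherine\tOquendo\tM\t18\t1.50\t50\n")

tab_to_csv(txt_file, csv_file)
with open(csv_file, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Note that the bio.dat shown in the question looks space-separated rather than tab-separated. If that is the case, delimiter=" " would be needed instead, and it only behaves well when fields are separated by exactly one space.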
Upvotes: 0