Reputation: 3619
I have a file with around 80,000 lines that looks like this:
-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002
-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002
-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002
-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002
I'd like to work with numpy and scikit, and would like to read the file into an array so that it looks like this:
array = [[-1.1361818e-001, 4.1730759e-002, -9.8787775e-001, 9.7195663e-002], [-1.1361818e-001, 4.1730759e-002, -9.8787775e-001, 9.7195663e-002], ...]
I found an example at https://stackoverflow.com/a/10938021/1372560 and tried to adapt it to my case:
import numpy as np
a = np.loadtxt("/path2file", delimiter="\t")
print a
And I get the error "ValueError: invalid literal for float(): -1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002"
I'm really stuck here and appreciate your help. Thanks a lot in advance!
Upvotes: 0
Views: 462
Reputation: 251186
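Simply leave out the delimiter argument; loadtxt will then split on any whitespace. "\t" matches only the tab character, so with delimiter="\t" a line whose values are separated by spaces is read as a single field, which is what the error message shows.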
Demo:
>>> import numpy as np
>>> from io import StringIO  # on Python 2: from StringIO import StringIO
>>> c = StringIO("1.234\t1.23456 1.234234")
>>> np.loadtxt(c)
array([ 1.234 , 1.23456 , 1.234234])
From docs:
delimiter : str, optional
The string used to separate values. By default, this is any whitespace.
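To make the difference concrete, here is a minimal sketch that feeds two rows copied from the question to loadtxt, once with the default delimiter and once with delimiter="\t". It assumes the values in the file are separated by plain spaces (which the error message suggests) and uses io.StringIO as a stand-in for the real file; the names sample and tab_failed are only illustrative.

```python
import io
import numpy as np

# Two rows copied from the question; assumed to be separated by single spaces.
sample = (
    "-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002\n"
    "-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002\n"
)

# Default delimiter: split on any run of whitespace (spaces or tabs).
a = np.loadtxt(io.StringIO(sample))
print(a.shape)  # (2, 4)

# With delimiter="\t" each line contains no tab, so it stays one field
# and cannot be converted to a float.
try:
    np.loadtxt(io.StringIO(sample), delimiter="\t")
    tab_failed = False
except ValueError:
    tab_failed = True
print(tab_failed)  # True
```

The exact wording of the ValueError differs between numpy versions, so the sketch only checks that one is raised.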
Upvotes: 1
Reputation: 13423
This works for me:
import numpy as np
a = np.loadtxt("a.txt")
print(a)
Output:
[[-0.11361818 0.04173076 -0.98787775 0.09719566]
[-0.11361818 0.04173076 -0.98787775 0.09719566]
[-0.11361818 0.04173076 -0.98787775 0.09719566]
[-0.11361818 0.04173076 -0.98787775 0.09719566]]
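If the nested-list form from the question is really needed, the loaded array can be converted afterwards. The following is a rough sketch under the same assumption of space-separated values, again using io.StringIO in place of the real file; the names rows, data and nested are illustrative. Passing ndmin=2 is optional and only keeps the result two-dimensional even if the file happened to contain a single row.

```python
import io
import numpy as np

# Four identical rows, as in the excerpt shown in the question.
rows = "-1.1361818e-001 4.1730759e-002 -9.8787775e-001 9.7195663e-002\n" * 4

# loadtxt returns a 2-D float array: one row per line, one column per value.
data = np.loadtxt(io.StringIO(rows), ndmin=2)
print(data.shape)  # (4, 4)
print(data.dtype)  # float64

# Plain nested Python lists, matching the layout asked for in the question.
nested = data.tolist()
print(len(nested), len(nested[0]))  # 4 4
```

For numerical work the ndarray itself is usually the more convenient form, since it supports slicing and vectorised operations directly.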
Upvotes: 3