Reputation: 445
I am looking into the limits of loadtxt specifically. I have a text file containing tabular data:
# Sample header for python loadtxt
Very random text:¤mixed with¤strings¤numbers
300057¤9989¤34956¤1
110087¤9189¤24466¤4
# EOF
I can read all of this in as strings (of unknown length) and then convert to integers and floats later. This is what I have so far:
import numpy as np
txtdata = np.loadtxt('Mytxtfile.txt',delimiter=chr(164),comments="#",dtype='str')
However, I would like to know whether it is possible to read the data directly into a multidimensional array, such as:
>>>
[['Very random text:','mixed with','strings','numbers']
[300057,9989,34956,1]
[110087, 9189, 24466, 4]]
I tried this dtype command with no success:
dtype=[('a', 'str'),('b','int'),('c','int')]
Upvotes: 1
Views: 7362
Reputation: 879103
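# skiprows counts comment lines too, so 2 skips the "# Sample header" line
# and the text header row before the numeric data starts.
txtdata = np.loadtxt(
'Mytxtfile.txt', delimiter=chr(164), comments="#", skiprows=2,
dtype=[('a', '|S6'), ('b', '<i4'), ('c', '<i4'), ('d', '<i4')])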
Your sample data shows 4 columns, so to specify the dtype
explicitly, you would need something like:
dtype=[('a', '|S6'), ('b', '<i4'), ('c', '<i4'), ('d', '<i4')]
Note that NumPy does not have a variable-width 'str'
dtype. You have to specify the number of bytes in advance. For example, '|S6'
specifies a 6-byte string dtype.
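As a minimal sketch of the fixed-width behaviour, the snippet below loads the sample rows from an in-memory buffer rather than from Mytxtfile.txt (the `sample` string and the use of `io.StringIO` are stand-ins assumed for illustration). It passes skiprows=2 so that both the leading comment line and the text header row are skipped, since loadtxt counts comment lines toward skiprows. The last statement shows that a value longer than 6 bytes is cut off when stored as '|S6'.

```python
import io
import numpy as np

sep = chr(164)  # the delimiter character used in the question

# In-memory stand-in for Mytxtfile.txt (illustrative, not the real file).
sample = "\n".join([
    "# Sample header for python loadtxt",
    sep.join(["Very random text:", "mixed with", "strings", "numbers"]),
    sep.join(["300057", "9989", "34956", "1"]),
    sep.join(["110087", "9189", "24466", "4"]),
    "# EOF",
]) + "\n"

data = np.loadtxt(
    io.StringIO(sample), delimiter=sep, comments="#", skiprows=2,
    dtype=[('a', '|S6'), ('b', '<i4'), ('c', '<i4'), ('d', '<i4')])

# Columns of a structured array are accessed by field name.
first_col = data['a']
second_col = data['b']

# A fixed-width string dtype keeps only the first 6 bytes of a longer value.
truncated = np.array([b'Very random text:'], dtype='|S6')
```

The result is a one-dimensional structured array with one record per data row, not a two-dimensional array, because a plain NumPy array cannot mix string and integer columns.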
If you do not know in advance how many bytes may be in the string column(s), then it may be more convenient to use numpy.genfromtxt:
txtdata = np.genfromtxt('Mytxtfile.txt', delimiter=chr(164), comments="#",
names=True, dtype=None)
dtype=None tells genfromtxt to make an intelligent guess for the dtype.
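A small sketch of that guessing, under assumptions: the rows below are hypothetical (they are not the question's data) and include a text column of varying width, the buffer is an in-memory `io.StringIO` rather than a file, and `skip_header=1` plus `encoding="utf-8"` are additions that do not appear in the call above. `skip_header=1` steps over the leading comment line so that the following row supplies the field names.

```python
import io
import numpy as np

sep = chr(164)  # the delimiter character used in the question

# Hypothetical rows with a text column whose width is not known in advance.
sample = "\n".join([
    "# Sample header",
    sep.join(["label", "b", "c", "d"]),
    sep.join(["short", "9989", "34956", "1"]),
    sep.join(["a much longer label", "9189", "24466", "4"]),
]) + "\n"

guessed = np.genfromtxt(
    io.StringIO(sample), delimiter=sep, comments="#",
    skip_header=1,      # step over the leading comment line
    names=True,         # take field names from the next line
    dtype=None,         # infer a dtype for each column
    encoding="utf-8")   # so text columns come back as str

labels = guessed['label']
b_values = guessed['b']
```

Here the text column is sized to fit its longest entry, and the numeric columns are inferred as integers, so no string width has to be chosen up front.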
Upvotes: 2