Reputation: 229
I'm trying to assign column names using np.dtype.
I have defined a list of names:
print fieldNameList
[u'A', u'B', u'C', u'D', u'E', u'F', u'G', u'H', u'I', u'J', u'K', u'L', u'M', u'N', u'S']
Then I join the list into a single string:
field_name = ', '.join(["('%s', '<f8')" % w for w in fieldNameList])
print field_name
('A', '<f8'), ('B', '<f8'), ('C', '<f8'), ('D', '<f8'), ('E', '<f8'), ('F', '<f8'), ('G', '<f8'), ('H', '<f8'), ('I', '<f8'), ('J', '<f8'), ('K', '<f8'), ('L', '<f8'), ('M', '<f8'), ('N', '<f8'), ('S', '<f8')
Then I create the array:
inarray = np.array(tup1,
np.dtype([field_name]))
I get an error:
np.dtype([field_name]))
TypeError: data type not understood
When I paste the generated text in place of the field_name variable, I get the desired result:
inarray = np.array(tup1,
np.dtype([('A', '<f8'), ('B', '<f8'), ('C', '<f8'), ('D', '<f8'), ('E', '<f8'), ('F', '<f8'), ('G', '<f8'), ('H', '<f8'), ('I', '<f8'), ('J', '<f8'), ('K', '<f8'), ('L', '<f8'), ('M', '<f8'), ('N', '<f8'), ('S', '<f8')]))
The number and names of the columns depend on the input table, which is defined by the user, so they cannot be hard-coded in the script.
Does anyone have an idea how to solve this problem? Thanks in advance.
Upvotes: 3
Views: 5706
Reputation: 32512
I just stumbled across this issue myself.
When you define a field name from a unicode object like this, you receive an error (as explained in the other answer):
>>> np.dtype([(u'foo', 'f')])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: data type not understood
Interestingly, when you create the same dtype object using the dictionary method, it works:
>>> np.dtype({'names': [u"foo"], 'formats': ["f"]})
dtype([(u'foo', '<f4')])
For the record: I'm using Python 2.7.6, with numpy 1.13.1. This issue doesn't exist with Python 3.4.3.
Here is the corresponding entry in the github numpy issue tracker: https://github.com/numpy/numpy/issues/2407
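The dictionary form shown above also lends itself to a list of names that is only known at run time. A minimal sketch, assuming Python 3 with NumPy installed; the three-name list and the single data row are hypothetical stand-ins for the user's input table:

```python
import numpy as np

# Hypothetical names; in practice they would come from the input table.
field_names = ["A", "B", "C"]

# Dictionary form: parallel lists of field names and field formats.
dt = np.dtype({"names": field_names, "formats": ["<f8"] * len(field_names)})

inarray = np.array([(1, 2, 3)], dtype=dt)
print(dt.names)          # ('A', 'B', 'C')
print(inarray["C"][0])   # 3.0
```

The formats list is simply repeated once per name, so the script never needs to know the column count in advance.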
Upvotes: 1
Reputation: 879601
>>> field_name = ', '.join(["('%s', '<f8')" % w for w in fieldNameList])
>>> field_name
"('A', '<f8'), ('B', '<f8'), ('C', '<f8')"
makes field_name a string, so [field_name] is a list containing one string.
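A quick way to confirm this is to inspect the object that ends up being passed to np.dtype. The sketch below assumes Python 3 and a shortened, hypothetical three-name list; it deliberately does not call NumPy, it only shows the shape of the argument:

```python
# Hypothetical shortened name list, for illustration only.
field_names = ["A", "B", "C"]

# What the question builds: one long string that merely looks like tuples.
field_name = ", ".join("('%s', '<f8')" % w for w in field_names)
wrapped = [field_name]

print(type(field_name).__name__)   # str
print(len(wrapped))                # 1
print(type(wrapped[0]).__name__)   # str
```

NumPy does not evaluate that text as Python code, so the tuples written inside it never become real (name, format) pairs.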
Instead, the NumPy dtype can be specified as a list of tuples:
>>> [(w, '<f8') for w in fieldNameList]
[('A', '<f8'), ('B', '<f8'), ('C', '<f8')]
import numpy as np
fieldNameList = [u'A', u'B', u'C']
fieldNameList = [name.encode('utf-8') for name in fieldNameList]  # 1
tup1 = [(1, 2, 3)]
inarray = np.array(tup1, dtype=[(w, '<f8') for w in fieldNameList])
yields
array([(1.0, 2.0, 3.0)],
dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
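For readers on Python 3, where str is already the text type, the same list-of-tuples approach can be wrapped in a small helper. This is only a sketch: the helper name make_structured_dtype, the fmt parameter and the sample rows are hypothetical and not part of the original code, and it assumes every column uses the same format:

```python
import numpy as np

def make_structured_dtype(field_names, fmt="<f8"):
    """Build a structured dtype with one field of format `fmt` per name."""
    # On Python 3, str names can be used directly; no encode() step is needed.
    return np.dtype([(str(name), fmt) for name in field_names])

field_names = ["A", "B", "C"]     # would come from the input table
rows = [(1, 2, 3), (4, 5, 6)]     # hypothetical data; each row must be a tuple

inarray = np.array(rows, dtype=make_structured_dtype(field_names))
print(inarray.dtype.names)   # ('A', 'B', 'C')
print(inarray["B"])          # [2. 5.]
```

Because the dtype is built from real tuples rather than from text, the number of columns follows the length of the name list automatically.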
Note on # 1: on Python 2, fieldNameList must be a list of byte strings, not unicode. If fieldNameList is a list of unicode strings, you'll need to encode them first.
Upvotes: 2