user36800

Reputation: 2259

How to interpret Python output dtype='<U32'?

I am taking an online course, and the following supposedly demonstrates that "NumPy arrays: contain only one type":

In [19]: np.array([1.0, "is", True])
Out[19]:
array(['1.0', 'is', 'True'],
dtype='<U32')

At first, I thought that the output was a form of error message, but a web search did not confirm this. In fact, I haven't come across an explanation. Can anyone explain how to interpret the output?

Afternote: After reviewing the answers, the dtype page, and the numpy.array() page, it seems that dtype='<U32' would be more accurately described as dtype('<U32'). Is this correct? It seems so to me, but I'm a newbie, and even the numpy.array() page assigns a string to the dtype parameter rather than an actual dtype object.

Also, why does '<U32' specify a 32-character string when all of the elements are much shorter strings?

Upvotes: 17

Views: 32215

Answers (2)

adlopez15

Reputation: 4357

dtype='<U32' is a little-endian Unicode string of up to 32 characters.

The documentation on dtypes goes into more depth about each of the characters.

'U' Unicode string

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

Examples:

import numpy as np

dt = np.dtype('f8')   # 64-bit floating-point number
dt = np.dtype('c16')  # 128-bit complex floating-point number
dt = np.dtype('S25')  # 25-length zero-terminated bytes (the docs' older 'a25' spelling was removed in NumPy 2.0)
dt = np.dtype('U25')  # 25-character string
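
To show how the pieces of the string fit together, here is a minimal sketch that takes '<U32' apart using standard dtype attributes. The variable names are mine, and the 4-bytes-per-character figure assumes NumPy's usual UCS-4 storage for the 'U' kind.

```python
import numpy as np

dt = np.dtype('<U32')

kind = dt.kind               # 'U' means Unicode string
itemsize = dt.itemsize       # bytes reserved per element
n_chars = dt.itemsize // 4   # assumes 4 bytes per character (UCS-4)

print(dt.str, kind, itemsize, n_chars)  # <U32 U 128 32

# A string passed as the dtype argument is converted to a dtype object.
arr = np.array(['1.0', 'is', 'True'], dtype='<U32')
print(isinstance(arr.dtype, np.dtype))   # True
print(arr.dtype == np.dtype('<U32'))     # True
```

This also bears on the afternote in the question: the string '<U32' shown in the array repr is shorthand, and the array's dtype attribute is a real dtype object that compares equal to np.dtype('<U32').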

Upvotes: 10

Amadan

Reputation: 198314

It is fully explained in the manual:

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

[...]

The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are

[...]

'U'        Unicode string

So, a little-endian Unicode string of 32 characters.
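
A quick way to check the "number of characters" reading, and to see where the 32 might come from, is to compare an all-string array with the mixed one from the question. This is only a sketch: as far as I can tell, the width is the largest one NumPy reserves for any element's type, and a float appears to reserve 32 characters, but treat that as an observation rather than a documented guarantee. The variable names are mine.

```python
import numpy as np

# Only strings: the width is just the longest string present.
only_strings = np.array(["is", "a"])
width_strings = only_strings.dtype.itemsize // 4  # 4 bytes per character

# A float in the mix: everything is converted to strings, and room is
# reserved for a float written out as text, not for the actual values.
mixed = np.array([1.0, "is", True])
width_mixed = mixed.dtype.itemsize // 4

print(only_strings.dtype.kind, width_strings)  # U 2
print(mixed.dtype.kind, width_mixed)           # U 32
print(mixed.tolist())                          # ['1.0', 'is', 'True']
```

So the 32 is a capacity per element, not the length of any particular string in the array.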

Upvotes: 15
