Reputation: 3343
First off, I apologize for the arbitrariness of this question. I am rewriting some of my scripts to use NumPy arrays instead of nested Python lists (for performance and memory), but I'm still struggling with their declaration.
I am trying to create a structure using NumPy arrays. I am starting off with 1000 (arbitrary value) elements in the array, where each element should contain a float (as [x][0]) and a nested array of coordinates (so 10,000 x 2 floats PER top-level element) (as [x][1], with each element in the nested array accessible as [x][1][y][z], where y is the element in the nested array and z specifies which of the 2 coordinates). The following question, Nested Structured Numpy Array, creates a nearly identical structure (as a reference for my question and my desired structure).
Schematic raw data example:
time 0
m/z 10 int 10
m/z 20 int 20
m/z 30 int 1000
...
time 1
<repeat>
I have read that I have to use the dtype argument to define the nested array, but I am not sure how to declare the dimensions for an empty array. Could anyone give me a hand? Here is what I have come up with so far.
data=np.zeroes((1000,2 /* Now add nested array */), dtype=[('time', 'f'), [('m/z','f'), ('intensity','f')]])
PS: Might a matrix be a better option for this?
Upvotes: 3
Views: 958
Reputation: 25207
>>> import numpy as np
>>> a = np.zeros(1000, dtype='float32, (10000,2)float32')
>>> a[200][0]
0.0
>>> a[200][1][2000]
array([ 0., 0.], dtype=float32)
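If you would rather address the parts by name, as in the question's attempt, the same kind of layout can probably be declared with named fields. The following is only a minimal sketch: the field names 'time', 'peaks', 'mz' and 'intensity' and the small sizes are my own assumptions for illustration, not something from the question.

```python
import numpy as np

# Hypothetical small sizes for illustration; the question uses 1000 and 10000.
n_scans, n_peaks = 4, 5

# Field names are illustrative assumptions ('mz' stands in for 'm/z').
peak_dtype = np.dtype([('mz', 'f4'), ('intensity', 'f4')])
# Each record holds one time value plus a fixed-length subarray of peaks.
scan_dtype = np.dtype([('time', 'f4'), ('peaks', peak_dtype, (n_peaks,))])

data = np.zeros(n_scans, dtype=scan_dtype)

# Fill the record at index 1 with the schematic values from the question.
data['time'][1] = 1.0
data['peaks']['mz'][1, :3] = [10, 20, 30]
data['peaks']['intensity'][1, :3] = [10, 20, 1000]

print(data[1]['time'])                # time of the record at index 1
print(data[1]['peaks'][2]['mz'])      # m/z of the third peak in that record
```

Every record reserves room for n_peaks entries, so this only fits if a fixed upper bound on the number of peaks per time point is acceptable.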
Note that the structured array a holds 1000 separate arrays of dimension (10000,2). That's fine if you only ever do operations that look at just one of those arrays. If you use a separate (1000,10000,2) array instead, you can take better advantage of vectorized operations in NumPy. For example, you can increment all the second coordinates in one operation like this:
>>> b = np.zeros((1000,10000,2))
>>> b[:,:,1] += 1
Trying to do the same with a[:][1][:,1] raises an error.
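One way around that error, if you want to keep the structured array, is to select the field first and index afterwards. As far as I know, selecting a subarray field returns a view whose shape has the subarray dimensions appended, so the same vectorized update works on it. A small sketch, using tiny hypothetical sizes and the default field names 'f0' and 'f1' that NumPy assigns to unnamed fields:

```python
import numpy as np

# Small hypothetical sizes; the answer above uses 1000 records of shape (10000, 2).
# 'f4' is the short code for float32.
a = np.zeros(4, dtype='f4, (5,2)f4')

# Unnamed fields get the default names 'f0' and 'f1'.
coords = a['f1']            # view into a, shape (4, 5, 2)
coords[:, :, 1] += 1        # increments every second coordinate in place

print(coords.shape)
print(a[2][1][3])           # one coordinate pair, read through the record
```

Because coords is a view rather than a copy, the update is visible through a as well.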
Upvotes: 6