Chris0304

Reputation: 11

Dynamically increase number of fields in numpy array

With the help of "Complex matlab-like data structure in python (numpy/scipy)" I came up with:

import numpy as np

s=(5,3)
a=np.zeros(s, dtype=[('Int1', int),
                     ('Int2', int),
                     ('Str1', '|S5')])

a[0,0]=(1,2,'abcde')
a[0,1]=((5,2,'fghij'),(7,9,'klmno'))

The problem is that in some cells of my array a, such as a[0,1], I want to store one or more extra records, as in my code example. I don't know in advance how many extra records will go into which cell of my matrix, but each record will always be a tuple matching the dtype (int, int, string).

Of course, I get an error when I try to write into a[0,1] the way I do.

I would like to keep my matrix a 2-dimensional, but I would like to write several instances of my dtype=[int, int, str] into one field, similar to what I tried in field a[0,1].

Hopefully I have explained my problem in a comprehensible way.

Upvotes: 1

Views: 478

Answers (2)

Chris0304

Reputation: 11

My code would look like this now:

s=(5,3)
a=np.zeros(s, dtype=object)
a[0,0]=(1,2,'abcde')
a[0,1]=((5,2,'fghij'),(7,9,'klmno'))

I can see/access the entries with:

print(a[0,1])
print(a[0,1][0])
print(a[0,1][1])

Upvotes: 0

hpaulj

Reputation: 231375

A numpy array is probably the wrong data structure for this kind of flexibility. Once created, your array a takes up a fixed amount of memory. It has 15 (5*3) records, and each record contains 2 ints and one string of 5 characters. You can modify values, but you can't add new records or change one record into a composite of two records.
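To make the fixed layout concrete, here is a small sketch using the same dtype as the question. The exact number of bytes per record depends on the platform's default int size, so it is printed rather than assumed; the last lines show that a string longer than 5 bytes is cut to fit its slot rather than growing the record.

```python
import numpy as np

a = np.zeros((5, 3), dtype=[('Int1', int), ('Int2', int), ('Str1', 'S5')])

n_records = a.size          # number of records, 5 * 3
record_bytes = a.itemsize   # bytes per record (platform dependent)
total_bytes = a.nbytes      # size of the whole data buffer
print(n_records, record_bytes, total_bytes)

# The string slot is 5 bytes wide; longer input is truncated to fit.
a[0, 0] = (1, 2, 'abcdefgh')
stored = a[0, 0]['Str1']
print(stored)
```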

Lists give you the flexibility to add elements and to change their nature. A list contains pointers to objects located elsewhere in memory.

An array of dtype=object behaves much like a list. Its data buffer holds the same sort of pointers. a=np.zeros((3,5), dtype=object) is a 2d array, where each element can be a tuple, list, number, None, tuple of tuples, etc. But with that kind of array you lose a lot of the 2d numeric calculation abilities.
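One possible way to use such an object array for the question's case is to put an empty list in every cell and append records as they arrive. This is only a sketch; the names grid and counts are made up for illustration, and each cell gets its own list so that appending to one cell does not affect the others.

```python
import numpy as np

rows, cols = 5, 3
grid = np.empty((rows, cols), dtype=object)

# Give every cell its own empty list (a shared list would alias all cells).
for idx in np.ndindex(grid.shape):
    grid[idx] = []

# Append as many (int, int, str) records per cell as needed.
grid[0, 0].append((1, 2, 'abcde'))
grid[0, 1].append((5, 2, 'fghij'))
grid[0, 1].append((7, 9, 'klmno'))

# Number of records stored in each cell, as an ordinary integer array.
counts = np.array([[len(cell) for cell in row] for row in grid])
print(counts)
```

The array stays 2-dimensional, but numpy only sees opaque Python objects, so any per-record work has to be done with ordinary Python loops.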

With your structured array, the only way to increase its size or add fields is to make a new array and copy data over. There are functions that assist in adding fields, but they do, in one way or other, what I just described.


With your definition, there are 3 fields, ['Int1','Int2','Str1']

a=np.zeros(s, dtype=[('Int1', int),
                     ('Int2', int),
                     ('Str1', '|S5')])

Increasing the number of fields (in that sense of the word "field") would be something like

a1=np.zeros(s, dtype=[('Int1', int),
                      ('Int2', int),
                      ('Str1', '|S5'),
                      ('Str2', '|S5')])

That adds a field named 'Str2'. You could copy the existing data into the new array with

for name in a.dtype.fields: a1[name] = a[name]

Now all records in a1 have the same data as in a, but they also have a blank Str2 field. You could set that field for each element individually, or as a group with:

a1['Str2'] = ...
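Put together, the make-a-new-array-and-copy approach could look like the sketch below. add_field is a hypothetical helper written for this example, not a numpy function, and it assumes a plain structured dtype without nested or padded fields.

```python
import numpy as np

def add_field(arr, name, fmt):
    """Return a copy of structured array `arr` with one extra field.

    Hypothetical helper for illustration; not part of numpy.
    """
    # Rebuild the dtype from the existing fields plus the new one.
    new_dtype = [(n, arr.dtype[n]) for n in arr.dtype.names] + [(name, fmt)]
    out = np.zeros(arr.shape, dtype=new_dtype)
    # Copy the old data over, one field at a time.
    for field in arr.dtype.names:
        out[field] = arr[field]
    return out

a = np.zeros((5, 3), dtype=[('Int1', int), ('Int2', int), ('Str1', 'S5')])
a[0, 0] = (1, 2, 'abcde')

a1 = add_field(a, 'Str2', 'S5')
a1['Str2'] = b'new'   # set the new field for all records at once
print(a1[0, 0])
```

Note that a itself is unchanged; a1 is a separate array with its own memory.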

But your attempt to change a[0,1] into a tuple of tuples is quite different. It's like trying to replace an element of a regular numeric array with two numbers:

x = np.arange(10)
x[3] = [3,5]

That works for lists, x=list(range(10)), but not for arrays.
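A quick side-by-side check of that difference, as a sketch. The exact wording of the numpy error message varies between versions, so it is printed rather than quoted here.

```python
import numpy as np

# Integer array: each slot holds exactly one number.
x = np.arange(10)
try:
    x[3] = [3, 5]
    array_ok = True
except (ValueError, TypeError) as err:
    array_ok = False
    print("array assignment failed:", err)

# Python list: each slot is a pointer, so it can hold another list.
y = list(range(10))
y[3] = [3, 5]
print(y)
```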

Upvotes: 1
