SkyNT

Reputation: 803

Complex matlab-like data structure in python (numpy/scipy)

I have data currently structured as follows in MATLAB:

item{i}.attribute1(2,j)

Here item is a cell array indexed by i = 1 .. n. Each cell contains a struct with multiple attributes, and each attribute is a matrix of size 2 x m (so j = 1 .. m). The number of attributes is not fixed.

I have to translate this data structure to Python, but I am new to NumPy and Python lists. What is the best way to structure this data in Python with NumPy/SciPy?

Thanks.

Upvotes: 17

Views: 45696

Answers (4)

Girardi

Reputation: 2809

A simpler version of the answer by @dbouz, using the idea by @jmetz:

class structtype():
    def __init__(self,**kwargs):
        self.Set(**kwargs)
    def Set(self,**kwargs):
        self.__dict__.update(kwargs)
    def SetAttr(self,lab,val):
        self.__dict__[lab] = val

then you can do

myst = structtype(a=1,b=2,c=3)

or

myst = structtype()
myst.Set(a=1,b=2,c=3)

and still do

myst.d = 4 # here, myst.a=1, myst.b=2, myst.c=3, myst.d=4

or even

myst = structtype(a=1,b=2,c=3)
lab = 'a'
myst.SetAttr(lab,10) # a=10,b=2,c=3 ... equivalent to myst.(lab)=10 in MATLAB

and you get exactly what you'd expect in MATLAB from myst=struct('a',1,'b',2,'c',3).

The equivalent of a cell of structs would be a list of structtype

mystarr = [ structtype(a=1,b=2) for n in range(10) ]

which would give you

mystarr[0].a # == 1
mystarr[0].b # == 2
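
As a side note, the standard library's types.SimpleNamespace (Python 3.3+) gives much the same behaviour without defining a class. A minimal sketch, reusing the variable names from the snippets above purely for illustration; setattr/getattr stand in for the SetAttr helper:

```python
from types import SimpleNamespace

# Build a struct-like object from keyword arguments.
myst = SimpleNamespace(a=1, b=2, c=3)
myst.d = 4                      # attributes can be added after creation

# Dynamic field names, similar to myst.(lab) = 10 in MATLAB.
lab = 'a'
setattr(myst, lab, 10)
value = getattr(myst, lab)

# A list of independent namespaces plays the role of a cell of structs.
mystarr = [SimpleNamespace(a=1, b=2) for _ in range(10)]
mystarr[0].a = 99               # only the first element changes

print(myst)
print(mystarr[0].a, mystarr[1].a)
```

Because the list comprehension creates a new object on each iteration, the elements do not share state.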

Upvotes: 2

dbouz

Reputation: 919

For some applications a dict or list of dictionaries will suffice. However, if you really want to emulate a MATLAB struct in Python, you have to take advantage of its OOP and form your own struct-like class.

Here is a simple example that lets you store an arbitrary number of variables as attributes and can also be initialised empty (Python 3.x only, because of the keyword-only prefix argument). i is a counter that tracks how many values are stored inside the object:

class Struct:
    def __init__(self, *args, prefix='arg'): # constructor
        self.prefix = prefix
        if len(args) == 0:
            self.i = 0
        else:
            i=0
            for arg in args:
                i+=1
                arg_str = prefix + str(i)
                # store arguments as attributes
                setattr(self, arg_str, arg) #self.arg1 = <value>
            self.i = i
    def add(self, arg):
        self.i += 1
        arg_str = self.prefix + str(self.i)
        setattr(self, arg_str, arg)

You can initialise it empty (i=0), or populate it with initial attributes. You can then add attributes at will. Trying the following:

b = Struct(5, -99.99, [1,5,15,20], 'sample', {'key1':5, 'key2':-100})
b.add(150.0001)
print(b.__dict__)
print(type(b.arg3))
print(b.arg3[0:2])
print(b.arg5['key1'])

c = Struct(prefix='foo')
print(c.i) # empty Struct
c.add(500) # add a value as foo1
print(c.__dict__)

will get you these results for object b:

{'prefix': 'arg', 'arg1': 5, 'arg2': -99.99, 'arg3': [1, 5, 15, 20], 'arg4': 'sample', 'arg5': {'key1': 5, 'key2': -100}, 'i': 6, 'arg6': 150.0001}
<class 'list'>
[1, 5]
5

and for object c:

0
{'prefix': 'foo', 'i': 1, 'foo1': 500}

Note that assigning attributes to objects is general: it is not limited to scipy/numpy objects but works for all data types and custom objects (arrays, dataframes etc.). Of course this is a toy model. You can develop it further so that it can be indexed, pretty-printed, have elements removed, be callable and so on, depending on your project's needs. Just define the class at the beginning and then use it for storage and retrieval. That's the beauty of Python: it doesn't have exactly what you are looking for, especially if you come from MATLAB, but it can do so much more!
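
As one possible direction for such extensions, here is a rough sketch of a struct-like class with name-based indexing, removal and a readable repr. The class name FlexStruct and its method names are hypothetical, chosen only for this illustration, and it stores named fields rather than the numbered prefix scheme used above:

```python
class FlexStruct:
    """Hypothetical struct-like container; names are illustrative only."""

    def __init__(self, **fields):
        # Store every keyword argument as an attribute.
        self.__dict__.update(fields)

    def __getitem__(self, name):
        # Allow s['field'] in addition to s.field.
        return self.__dict__[name]

    def __setitem__(self, name, value):
        self.__dict__[name] = value

    def remove(self, name):
        # Delete a field; raises KeyError if it does not exist.
        del self.__dict__[name]

    def __len__(self):
        # Number of stored fields.
        return len(self.__dict__)

    def __repr__(self):
        body = ', '.join(f'{k}={v!r}' for k, v in self.__dict__.items())
        return f'FlexStruct({body})'


s = FlexStruct(a=1, b=[1, 5, 15])
s['c'] = 'sample'
s.remove('a')
print(s)        # FlexStruct(b=[1, 5, 15], c='sample')
print(len(s))   # 2
```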

Upvotes: 0

strpeter

Reputation: 2774

If you are looking for a good example of how to create a structured array in Python, similar to a MATLAB struct, have a look at the structured arrays page on the scipy homepage (basics.rec).

Example

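import numpy as np

# One record with three named fields: a 2x2 float table, a float and a 10-byte string
x = np.zeros(1, dtype=[('Table', np.float64, (2, 2)),
                       ('Number', float),
                       ('String', '|S10')])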

# Populate the array
x['Table']  = [1, 2]
x['Number'] = 23.5
x['String'] = 'Stringli'

# See what is written to the array
print(x)

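The printed output is then (taken from an older NumPy on Python 2; on Python 3 the '|S10' field is shown as bytes, e.g. b'Stringli', and the float formatting may differ):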

[([[1.0, 2.0], [1.0, 2.0]], 23.5, 'Stringli')]

Unfortunately, I have not found a way to define a structured array without knowing its size in advance. You can also define the array directly with its contents:

# Wrapping the record in a list gives shape (1,), matching np.zeros(1, ...) above
x = np.array([([[1, 2], [1, 2]], 23.5, 'Stringli')],
             dtype=[('Table', np.float64, (2, 2)),
                    ('Number', float),
                    ('String', '|S10')])

# Same result as above but less code (if you know the contents in advance)
print(x)
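
On the point about not knowing the size up front, one common workaround is to collect the records in a plain Python list and convert to a structured array once at the end. A minimal sketch, assuming the same field layout as above except that I use 'U10' (unicode) instead of '|S10' so the strings come back as str on Python 3; the loop and its values are made up for illustration:

```python
import numpy as np

# Field layout; 'U10' is a swapped-in choice, see the note above.
dtype = [('Table', np.float64, (2, 2)),
         ('Number', np.float64),
         ('String', 'U10')]

records = []                       # a plain list can grow to any length
for k in range(3):                 # illustrative data only
    table = [[k, k + 1], [k + 2, k + 3]]
    records.append((table, 0.5 * k, 'row%d' % k))

# Convert once the number of records is known.
x = np.array(records, dtype=dtype)

print(x.shape)            # (3,)
print(x['Table'].shape)   # (3, 2, 2)
print(x[2]['String'])
```

Appending to a list is cheap, whereas growing a NumPy array element by element copies the data each time, so converting at the end is usually the simpler choice.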

Upvotes: 1

jmetz

Reputation: 12783

I've often seen the following conversion approaches:

matlab array -> python numpy array

matlab cell array -> python list

matlab structure -> python dict

So in your case that would correspond to a Python list of dicts, where each dict holds numpy arrays as its values:

item[i]['attribute1'][2,j]

Note

Don't forget the 0-indexing in python!
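
To make the mapping concrete, here is a small sketch of that list-of-dicts layout. The sizes n and m and the attribute names are placeholders I picked for illustration:

```python
import numpy as np

n, m = 3, 4   # hypothetical sizes: n items, each attribute of shape (2, m)

item = []
for i in range(n):
    item.append({
        'attribute1': np.zeros((2, m)),
        'attribute2': np.arange(2 * m, dtype=float).reshape(2, m),
    })

# MATLAB item{1}.attribute2(2, 3) becomes, with 0-based indices:
value = item[0]['attribute2'][1, 2]

# Attributes can be added per item at any time, so their number is not fixed.
item[0]['attribute3'] = np.ones((2, m))

print(value)
print(sorted(item[0].keys()))
```

Each dict is independent, so different items may carry different sets of attributes.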

[Update]

Additional: Use of classes

Further to the simple conversion given above, you could also define a dummy class, e.g.

class structtype():
    pass

This allows the following type of usage:

>> s1 = structtype()
>> print(s1.a)
AttributeError: 'structtype' object has no attribute 'a'
>> s1.a = 10
>> print(s1.a)
10

Your example in this case becomes, e.g.

>> import numpy
>> item = [structtype() for i in range(10)]
>> item[9].a = numpy.array([1, 2, 3])
>> item[9].a[1]
2

Upvotes: 28
