Michael

Reputation: 486

Python struct like Matlab

I have found lots of hack answers but no 'standardized' answer to this question. I am looking for an implementation of Matlab's struct in Python, specifically with the following two capabilities:

  1. in struct 's', access field value 'a' using dot notation (i.e. s.a)
  2. create fields on the fly, without initialization of dtype, format (i.e. s.b = np.array([1,2,3,4]) )

Is there no way to do this in Python? To date, the only solution I have found is here, using a dummy class structtype(). This works but feels a little hackish. I also thought scipy might expose its mat_struct, used in loadmat(), but I couldn't find a public interface to it. What do other people do? I'm not too worried about performance for this struct; it's more of a convenience.

Upvotes: 0

Views: 4860

Answers (3)

Sari

Reputation: 626

The simplest and most intuitively similar Python implementation is to use type to instantiate a temporary class. In practice it is the same as making a dummy class, but I think it expresses the intent of a struct-like object more clearly.

>>> s = type('', (), {})()
>>> s.a = 4
>>> s.a
4

Here, type is used to create a nameless class (hence the '') with no bases (or parent classes, indicated by the empty tuple) and no default class attributes (the empty dictionary). The final () instantiates the class/struct. Bear in mind that values passed in the dictionary become class attributes, so they do not show up in the instance's __dict__ attribute, though this may not matter to you. This method also works in older versions (< 3.x) of Python.
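To make the point about __dict__ concrete, here is a minimal sketch. The name Struct and the field names a and b are only illustrative.

```python
# Build an anonymous class with one class-level default, then instantiate it.
# "Struct", "a" and "b" are illustrative names, not part of any library.
Struct = type('', (), {'a': 4})
s = Struct()

# Fields can be added on the fly, as with a MATLAB struct.
s.b = [1, 2, 3, 4]

# 'a' is reachable through the instance, but it lives on the class.
print(s.a)

# Only attributes set on the instance itself appear here.
print(vars(s))
```

If every field should show up in the instance's __dict__, assign the fields after instantiation rather than passing them in the dictionary.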

Upvotes: 2

hpaulj

Reputation: 231605

In Octave I did:

octave:2>      x.a = 1;
octave:3>      x.b = [1, 2; 3, 4];
octave:4>      x.c = "string";
octave:7> save -7 test.mat x

In ipython (2.7):

In [27]: from scipy.io import loadmat    
In [28]: A=loadmat('test.mat')

In [29]: A
Out[29]: 
{'__globals__': [],
 '__header__': 'MATLAB 5.0 MAT-file, written by Octave 3.8.2, 2015-12-04 02:57:47 UTC',
 '__version__': '1.0',
 'x': array([[([[1.0]], [[1.0, 2.0], [3.0, 4.0]], [u'string'])]], 
      dtype=[('a', 'O'), ('b', 'O'), ('c', 'O')])}

In this case A['x'] is a numpy structured array, with 3 dtype=object fields.

In [33]: A['x']['b'][0,0]
Out[33]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

In [34]: A['x'][0,0]
Out[34]: ([[1.0]], [[1.0, 2.0], [3.0, 4.0]], [u'string'])

In [35]: A['x'][0,0]['b']
Out[35]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

Since x comes from MATLAB, where even a scalar struct is a 1x1 array, I have to index it with [0,0].

octave:9> size(x)
ans =
   1   1

I can load A with a different switch, and access attributes with .b format:

In [62]: A=loadmat('test.mat',struct_as_record=False)

In [63]: A['x'][0,0].b
Out[63]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

In this case the elements of A['x'] are of type <scipy.io.matlab.mio5_params.mat_struct at 0x9bed76c>
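As a rough, self-contained sketch of the same idea on a current Python and scipy, the round trip below writes a struct with savemat to an in-memory buffer instead of test.mat (an assumption made so the example needs no Octave session). It also passes squeeze_me=True, which was not used above, to drop the 1x1 wrapper so that no [0,0] index is needed.

```python
import io

from scipy.io import loadmat, savemat

# Write a struct-like dict to an in-memory MAT-file (stand-in for test.mat).
buf = io.BytesIO()
savemat(buf, {'x': {'a': 1, 'b': [[1, 2], [3, 4]], 'c': 'string'}})
buf.seek(0)

# struct_as_record=False gives attribute access;
# squeeze_me=True removes the singleton dimensions MATLAB adds.
A = loadmat(buf, struct_as_record=False, squeeze_me=True)
x = A['x']

print(x.a)
print(x.b)
print(x.c)
```

The object returned here is still scipy's loader type, so this is mainly useful for reading MATLAB data, not as a general-purpose struct.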

Some history might help. MATLAB originally only had 2d matrices. Then they expanded it to allow higher dimensions. Cells were added, with the same 2d character, but allowing diverse content. Structures were added, allowing 'named' attributes. The original MATLAB class system was built on structures (just link certain functions to a particular class structure). MATLAB is now in its 2nd generation class system.

Python started off with classes, dictionaries, and lists. Object attributes are accessed with the same . syntax as MATLAB structures. Dictionaries are indexed with keys (often, but not always, strings). Lists are indexed with integers, and have always allowed diverse content (like cells). And with a mature object class system, it is possible to construct much more elaborate data structures in Python, though access is still governed by basic Python syntax.

numpy adds n-dimensional arrays. A subclass np.matrix is always 2d, modeled on the old style MATLAB matrix. An array always has the same kind of elements. But dtype=object arrays contain pointers to Python objects. In many ways they are just Python lists with an array wrapper. They are close to MATLAB cells.

numpy also has structured arrays, with a compound dtype composed of fields. Fields are accessed by name. np.recarray is a structured array with the added ability to access fields with the . syntax. That makes them look a lot like MATLAB arrays of structures.
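A small sketch of the difference between a plain structured array and a recarray view follows. The field names a and b and the values are made up for illustration.

```python
import numpy as np

# Structured array: two records with a compound dtype (illustrative fields).
arr = np.array([(1, 2.5), (2, 3.5)], dtype=[('a', 'i4'), ('b', 'f8')])

# Plain structured arrays expose fields by name only.
print(arr['b'])

# Viewing the same data as a recarray adds attribute-style access.
rec = arr.view(np.recarray)
print(rec.b)
print(rec[0].a)
```

The set of fields is fixed by the dtype, so a recarray covers the dot-notation requirement from the question but not adding arbitrary fields on the fly.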

Upvotes: 0

user2357112

Reputation: 281642

If you're on Python 3.3 or later, there's types.SimpleNamespace. Other than that, an empty class is probably your best option.
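A minimal sketch of both options, with illustrative field names; a plain list stands in for the question's numpy array so the example needs only the standard library.

```python
from types import SimpleNamespace

# Option 1: SimpleNamespace (Python 3.3+). Fields can be given up front
# or added later, and are read back with dot notation.
s = SimpleNamespace(a=4)
s.b = [1, 2, 3, 4]
print(s.a)
print(s)


# Option 2: an empty class, which also works on older Pythons.
# "Struct" is a hypothetical name chosen for this example.
class Struct(object):
    pass


t = Struct()
t.a = 4
print(t.a)
```

SimpleNamespace also provides a readable repr and equality comparison, which the empty class does not.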

Upvotes: 4
