Stiivi

Reputation: 2097

How to implement class/object metadata in Python?

I am working on a structured data analysis framework that is based on streaming data between nodes. Currently, nodes are implemented as subclasses of the root Node class provided by the framework. For each node class/factory I need metadata, such as a list of the node's attributes, their descriptions, and the node's output. The metadata may be used both by end users in a front-end application and programmatically by other stream management tools. In the future there will be more of them.

(Note that I had just started learning Python when I wrote that code.)

Currently the metadata is provided in a class variable:

class AggregateNode(base.Node):
    """Aggregate"""

    __node_info__ = {
        "label" : "Aggregate Node",
        "description" : "Aggregate values grouping by key fields.",
        "output" : "Key fields followed by aggregations for each aggregated field. Last field is "
                   "record count.",
        "attributes" : [
            {
                "name": "keys",
                "description": "List of fields according to which records are grouped"
            },
            {
                "name": "record_count_field",
                "description": "Name of a field where record count will be stored. "
                               "Default is `record_count`"
            }
        ]
    }

More examples can be found here.

I feel that this can be done in a much cleaner way. There is one restriction: since nodes are custom subclasses, there should be minimal interference with potential future attribute names.

What I was thinking of doing was splitting up the current __node_info__. It was meant to be private to the framework, but now I realize it has much wider use. I was thinking about using node_* attributes: they would share a common attribute namespace without taking too many names away from potential custom node attributes.
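The node_* idea above could look roughly like the following. This is only a minimal sketch: the `Node` stand-in class, the `node_metadata` helper, and the specific attribute names are assumptions made for illustration, not part of the framework.

```python
class Node:
    """Stand-in for the framework's root node class (hypothetical)."""


class AggregateNode(Node):
    # Metadata lives in class attributes that share the node_ prefix.
    node_label = "Aggregate Node"
    node_description = "Aggregate values grouping by key fields."
    node_attributes = [
        {"name": "keys",
         "description": "List of fields according to which records are grouped"},
    ]


def node_metadata(node_class, prefix="node_"):
    """Collect every class attribute starting with `prefix` into a dict."""
    info = {}
    # Walk the MRO from base to subclass so subclasses override base values.
    for klass in reversed(node_class.__mro__):
        for name, value in vars(klass).items():
            if name.startswith(prefix):
                info[name[len(prefix):]] = value
    return info
```

Calling `node_metadata(AggregateNode)` would then return a dictionary keyed by `label`, `description`, and `attributes`, so tools can still consume the metadata as one structure while the class itself only reserves the node_ prefix.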

My question is: what is the most common way of providing such metadata in Python programs? A single variable holding a dictionary? Multiple variables, one for each metadata attribute (this would conflict with the restriction)? A custom class/structure? Multiple variables with some kind of prefix, like node_*?

Upvotes: 1

Views: 5049

Answers (3)

rt47

Reputation: 1

A Python class can hook into its own construction (hence metadata) through the __new__() method: __new__() is called to create the object, before __init__() initializes it. You can use it to read or modify your classes'/nodes' internal structure before they get initialized with __init__().
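A minimal sketch of that idea, assuming the question's `__node_info__` layout: `__new__()` reads the class metadata and pre-populates the declared attributes before `__init__()` runs. The `Node` stand-in class and the `"default"` key are assumptions added for illustration; the original metadata has no such key.

```python
class Node:
    """Stand-in for the framework's root node class (hypothetical)."""

    __node_info__ = {"attributes": []}

    def __new__(cls, *args, **kwargs):
        # object.__new__ must not receive the extra constructor arguments.
        instance = super().__new__(cls)
        # Runs before __init__: pre-populate each declared attribute.
        for attribute in cls.__node_info__.get("attributes", []):
            setattr(instance, attribute["name"], attribute.get("default"))
        return instance


class AggregateNode(Node):
    __node_info__ = {
        "attributes": [
            {"name": "keys", "default": None},
            # "default" is a hypothetical key, not in the original metadata.
            {"name": "record_count_field", "default": "record_count"},
        ]
    }

    def __init__(self, keys=None):
        # The attributes already exist here because __new__ ran first.
        if keys is not None:
            self.keys = keys
```

Note that a `__new__()` defined this way runs once per instance. A hook that should run once per class definition would instead be a metaclass or `__init_subclass__()`.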

Upvotes: 0

SingleNegationElimination

Reputation: 156238

A lot of the functionality you're describing overlaps with epydoc:

>>> class AggregateNode(base.Node):
...     r"""
...     Aggregate values grouping by key fields.
... 
...     @ivar keys: List of fields according to which records are grouped
... 
...     @ivar record_count_field: Name of a field where record count will be
...                               stored.
...     """
...     record_count_field = "record_count"
...     
...     def get_output(self):
...         r"""
...         @return: Key fields followed by aggregations for each aggregated field.
...                  Last field is record count.
...         """
... 
>>> import epydoc.docbuilder
>>> api = epydoc.docbuilder.build_doc(AggregateNode)
>>> api.variables['keys'].descr.to_plaintext(None)
u'List of fields according to which records are grouped\n\n'
>>> api.variables['record_count_field'].value.pyval
'record_count'

Upvotes: 1

Jaime Soriano

Reputation: 7359

I'm not sure if there is a "standard" way to store custom metadata in Python objects, but as an example, the Python implementation of dbus adds attributes with the "_dbus" prefix to the published methods and signals.
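The same prefix pattern could be applied to nodes with a small decorator. The sketch below is not dbus-python's actual API: the `node_method` decorator, the `_node_` prefix, and the `exported_methods` helper are all hypothetical names chosen for illustration.

```python
def node_method(label, description=""):
    """Hypothetical decorator: tag a function with _node_-prefixed attributes."""
    def decorate(func):
        func._node_is_exported = True
        func._node_label = label
        func._node_description = description
        return func
    return decorate


class AggregateNode:
    @node_method("Run", description="Aggregate the incoming records.")
    def run(self):
        return "done"

    def helper(self):
        # Not decorated, so it carries no _node_ attributes.
        return None


def exported_methods(node_class):
    """Return {method name: label} for every tagged method on the class."""
    return {
        name: member._node_label
        for name, member in vars(node_class).items()
        if getattr(member, "_node_is_exported", False)
    }
```

Because the metadata sits on the function objects under a single prefix, it stays out of the way of whatever attribute names custom node subclasses choose.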

Upvotes: 1
