Reputation: 26747
I have a series of functions that I apply to each record in a dataset to generate a new field I store in a dictionary (the records—"documents"—are stored using MongoDB). I broke them all up as they are basically unrelated, and tie them back together by passing them as a list to a function that iterates through each operation for each record and adds on the results.
What irks me is how I'm going about it in what seems like a fairly inelegant manner; semi-duplicating names among other things.
def _midline_length(blob):
'''Generate a midline sequence for *blob*'''
return 42
midline_length = {
'func': _midline_length,
'key': 'calc_seq_midlen'} #: Midline sequence key/function pair.
Lots of these...
do_calcs = [midline_length, ] # all the functions ...
Then called like:
for record in mongo_collection.find():
for calc in do_calcs:
record[calc['key']] = calc['func'](record) # add new data to record
# update record in DB
Splitting up the keys like this makes it easier to remove all the calculated fields in the database (pointless after everything is set, but while developing the code and methodology it's handy).
I had the thought to maybe use classes, but it seems more like an abuse:
class midline_length(object):
key = 'calc_seq_midlen'
@staticmethod
def __call__(blob):
return 42
I could then make a list of instances (do_calcs = [midline_length(), ...]
) and run through that calling each thing or pulling out it's key
member. Alternatively, it seems like I can arbitrarily add members to functions, def myfunc():
then myfunc.key = 'mykey'
...that seems even worse. Better ideas?
Upvotes: 1
Views: 97
Reputation: 1606
This is a perfect job for a decorator. Something like:
_CALC_FUNCTIONS = {}
def calcfunc(orig_func):
global _CALC_FUNCTIONS
# format the db name from the function name.
key = 'calc_%s' % orig_func.__name__
# note we're using a set so these will
_CALC_FUNCTIONS[key] = orig_func
return orig_func
@calcfunc
def _midline_length(blob):
return 42
print _CALC_FUNCTIONS
# prints {'calc__midline_length': <function _midline_length at 0x035F7BF0>}
# then your document update is as follows
for record in mongo_collection.find():
for key, func in _CALC_FUNCTIONS.iteritems():
record[key] = func(record)
# update in db
Note that you could also store the attributes on the function object itself like Dietrich pointed out but you'll probably still need to keep a global structure to keep the list of functions.
Upvotes: 0
Reputation: 213748
You might want to use decorators for this purpose.
import collections
RecordFunc = collections.namedtuple('RecordFunc', 'key func')
def record(key):
def wrapped(func):
return RecordFunc(key, func)
return wrapped
@record('midline_length')
def midline_length(blob):
return 42
Now, midline_length
is not actually a function, but it is a RecordFunc
object.
>>> midline_length
RecordFunc(key='midline_length', func=<function midline_length at 0x24b92f8>)
It has a func
attribute, which is the original function, and a key
attribute.
If they get added to the same dictionary, you can do it in the decorator:
RECORD_PARSERS = {}
def record(key):
def wrapped(func):
RECORD_PARSERS[key] = func
return func
return wrapped
@record('midline_length')
def midline_length(blob):
return 42
Upvotes: 3