Reputation: 1158
Kind of a basic question, which does not apply to SQLAlchemy specifically (same happened when I played with MySQL-python), but that's the library I'm currently working with.
Say I execute a query which returns the content of a fairly large table, on which an ordering is applied with respect to a certain attribute. In my case I'm fetching benchmark measurements from a table which references the processor on which the data has been recorded.
So what I have is:
measurements = session.query(Measurement)\
    .join(Processor)\
    .order_by(Processor.name)
Now what I would like to do is iterate over the result set, but in terms of the subsets defined by the different processor names. Is there any convenient way to do this partitioning without a lot of boilerplate code?
Naively I would write something like
for proc_name, sublist in gen_partitions(measurements.all()):
    set_up_some_stuff(proc_name)
    for meas in sublist:
        process(meas)
which means I have to implement a generator function gen_partitions:
def gen_partitions(measurements):
    i = 0
    while i < len(measurements):
        # Start a new partition with the current measurement
        plist = []
        m = measurements[i]
        plist.append(m)
        i += 1
        # Collect all following measurements with the same processor name
        while i < len(measurements) and \
                measurements[i].processor.name == m.processor.name:
            plist.append(measurements[i])
            i += 1
        yield m.processor.name, plist
Feels like a lot of boilerplate. Is there a better way to do it?
Upvotes: 1
Views: 936
Reputation: 882093
import itertools

for proc_name, ms in itertools.groupby(measurements, lambda m: m.processor.name):
    set_up_some_stuff(proc_name)
    for meas in ms:
        process(meas)
would appear to meet your requirements. Is there any reason you haven't considered the standard library module itertools?
Note that I've renamed sublist to ms because it's an iterator, not a list. If you do need those measurements in a list (in order to do something other than just looping over them, etc.), that's easily achieved too: just add this line to the body of the outer for loop:
sublist = list(ms)
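One caveat worth keeping in mind: itertools.groupby only groups consecutive items with equal keys, so it relies on the rows already being sorted by processor name, which the ORDER BY in the query should provide. Here is a minimal, self-contained sketch of that behaviour. The namedtuples and the partition_by_processor helper are hypothetical stand-ins for illustration, not the real SQLAlchemy models from the question.

```python
import itertools
from collections import namedtuple

# Hypothetical stand-ins for the ORM classes in the question.
Processor = namedtuple("Processor", ["name"])
Measurement = namedtuple("Measurement", ["processor", "value"])

# Already sorted by processor name, as the ORDER BY would guarantee.
measurements = [
    Measurement(Processor("arm"), 1.0),
    Measurement(Processor("arm"), 1.5),
    Measurement(Processor("x86"), 2.0),
]


def partition_by_processor(rows):
    """Yield (processor name, list of rows) for each run of equal names."""
    for proc_name, ms in itertools.groupby(rows, key=lambda m: m.processor.name):
        # Materialize the group now: the ms iterator stops being usable
        # once groupby advances to the next key.
        yield proc_name, list(ms)


partitions = list(partition_by_processor(measurements))
```

If the input were not sorted, the same processor name could show up in more than one partition, so sorting (in the query or with sorted()) has to come first.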
Upvotes: 2