Reputation: 1435
Input data:
results = [
    {
        "timestamp_datetime": "2014-03-31 18:10:00 UTC",
        "job_id": 5,
        "processor_utilization_percentage": 72
    },
    {
        "timestamp_datetime": "2014-03-31 18:20:00 UTC",
        "job_id": 2,
        "processor_utilization_percentage": 60
    },
    {
        "timestamp_datetime": "2014-03-30 18:20:00 UTC",
        "job_id": 2,
        "processor_utilization_percentage": 0
    }]
The output has to be grouped by job_id, with the job ids in ascending order, like this:
newresult = {
    '2': [{"timestamp_datetime": "2014-03-31 18:20:00 UTC",
           "processor_utilization_percentage": 60},
          {"timestamp_datetime": "2014-03-30 18:20:00 UTC",
           "processor_utilization_percentage": 0}],
    '5': [{"timestamp_datetime": "2014-03-31 18:10:00 UTC",
           "processor_utilization_percentage": 72}],
}
What is the Pythonic way to do this?
Upvotes: 1
Views: 86
Reputation: 82949
You can use itertools.groupby to group the results by their job_id:
from itertools import groupby
new_results = {k: list(g) for k, g in groupby(results, key=lambda d: d["job_id"])}
The result is a dictionary, i.e. it has no particular order. If you want to iterate the values in ascending order, you can just do something like this:
for key in sorted(new_results):
    entries = new_results[key]
    # do something with entries
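Update: as Martijn points out, groupby only groups consecutive entries, so this requires the results list to be sorted by job_id, or at least to have entries with the same job_id next to each other (as they are in your example). Otherwise a later group overwrites an earlier one with the same key and entries are lost.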
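To make the caveat concrete, here is a minimal sketch that sorts before grouping. The sample list is a reordered copy of the question's data (an assumption made so that the two job_id 2 entries are no longer adjacent), and the names by_job, unsorted_groups and sorted_groups are only illustrative.

```python
from itertools import groupby
from operator import itemgetter

# Same entries as in the question, reordered so job_id 2 is split by job_id 5.
results = [
    {"timestamp_datetime": "2014-03-31 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 60},
    {"timestamp_datetime": "2014-03-31 18:10:00 UTC", "job_id": 5,
     "processor_utilization_percentage": 72},
    {"timestamp_datetime": "2014-03-30 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 0},
]

by_job = itemgetter("job_id")

# Without sorting, groupby yields job_id 2 twice and the dict keeps only
# the last group it sees for that key.
unsorted_groups = {k: list(g) for k, g in groupby(results, key=by_job)}

# sorted() is stable, so entries keep their relative order within each job.
sorted_groups = {
    k: list(g) for k, g in groupby(sorted(results, key=by_job), key=by_job)
}
```

With this input, unsorted_groups holds a single entry for job_id 2, while sorted_groups holds both.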
Upvotes: 3
Reputation: 91
Assuming you really didn't want the job_id in the newresult:
from collections import defaultdict
newresult = defaultdict(list)
for result in results:
    job_id = result['job_id']
    newresult[job_id].append(
        {'timestamp_datetime': result['timestamp_datetime'],
         'processor_utilization_percentage': result['processor_utilization_percentage']}
    )
#print newresult
I don't really see a way to do this with a dictionary comprehension, but I'm sure there's someone out there with more experience in doing that sort of thing who could pull it off. This is pretty straightforward, though.
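For what it's worth, a dictionary comprehension is possible. The following is only a sketch, not a recommendation: it filters the whole list once per job id, so the defaultdict loop is likely the better choice for large inputs. The name job_ids is introduced here for illustration.

```python
results = [
    {"timestamp_datetime": "2014-03-31 18:10:00 UTC", "job_id": 5,
     "processor_utilization_percentage": 72},
    {"timestamp_datetime": "2014-03-31 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 60},
    {"timestamp_datetime": "2014-03-30 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 0},
]

# Distinct job ids in ascending order.
job_ids = sorted({r["job_id"] for r in results})

# For each job id, collect its entries and drop the job_id key from each one.
# This scans the full list once per job id.
newresult = {
    job_id: [
        {k: v for k, v in r.items() if k != "job_id"}
        for r in results
        if r["job_id"] == job_id
    ]
    for job_id in job_ids
}
```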
Upvotes: 0
Reputation: 1124858
You are grouping; this is easiest with a collections.defaultdict() object:
from collections import defaultdict
newresult = defaultdict(list)
for entry in results:
    # pop() removes job_id from the entry and returns its value
    job_id = entry.pop('job_id')
    newresult[job_id].append(entry)
newresult is a dictionary and these are not ordered; if you need to access job ids in ascending order, sort the keys as you list them:
for job_id in sorted(newresult):
    # loops over the job ids in ascending order.
    for job in newresult[job_id]:
        # entries per job id
        pass
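Note that entry.pop('job_id') modifies the dictionaries in the original list. If that matters, a variant along these lines copies each entry instead. This is a sketch that assumes Python 3.7 or later, where plain dicts preserve insertion order; the names grouped and ordered are illustrative.

```python
from collections import defaultdict

results = [
    {"timestamp_datetime": "2014-03-31 18:10:00 UTC", "job_id": 5,
     "processor_utilization_percentage": 72},
    {"timestamp_datetime": "2014-03-31 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 60},
    {"timestamp_datetime": "2014-03-30 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 0},
]

grouped = defaultdict(list)
for entry in results:
    # Copy everything except job_id, leaving the original dicts untouched.
    stripped = {k: v for k, v in entry.items() if k != "job_id"}
    grouped[entry["job_id"]].append(stripped)

# Inserting keys in sorted order yields a plain dict that iterates in
# ascending job id order (relies on Python 3.7+ insertion ordering).
ordered = {job_id: grouped[job_id] for job_id in sorted(grouped)}
```

On older Python versions, collections.OrderedDict could play the same role as the final plain dict.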
Upvotes: 5