DarthOpto

Reputation: 1652

Making a new dictionary with counts

I have a list of dictionaries which looks like this:

[{'year_month': '2017-07', 'issue_priority': 'high', 'issue_number': 2153},
 {'year_month': '2017-07', 'issue_priority': 'normal', 'issue_number': 2179},
 {'year_month': '2017-07', 'issue_priority': 'low', 'issue_number': 2169},
 {'year_month': '2017-06', 'issue_priority': 'blocker', 'issue_number': 1998}]

What I would like to do is to take that list of dictionaries and make it into a dictionary with the following:

{'2017-07': {'high': count of issue_numbers, 'low': count of issue_numbers},
 '2017-06': {'high': count of issue_numbers, 'low': count of issue_numbers}, ...}

I am just getting back into Python and have never done something like this before so I don't really know where to look.

I know how to iterate over the list but I am not really sure how to get to where I would like to go.

I did the following which has gotten part of the way there, now I am just looking for how to get the count of the issue numbers:

output = {}
for item in issues:  # issues being my list of dictionaries
    month = item.pop('year_month')
    output[month] = item

Upvotes: 0

Views: 61

Answers (2)

Mad Physicist

Reputation: 114310

This looks like something that should work nicely with a Counter, only the inner dictionaries will be counters. You need to do a couple of extra steps to split up by date first.

First split the list by year_month. Then count all the issue_priority values for each year and month:

from collections import Counter

dates = set(issue['year_month'] for issue in issues)
output = {}
for ym in dates:
    subset = [issue for issue in issues if issue['year_month'] == ym]
    output[ym] = Counter(issue['issue_priority'] for issue in subset)

Many of the steps shown above can be reduced to a smaller line count, but at the cost of clarity and readability, in my opinion. That being said:

dates = set(issue['year_month'] for issue in issues)
output = {ym: Counter(issue['issue_priority'] for issue in issues if issue['year_month'] == ym) for ym in dates}

The output of both methods is

{'2017-06': Counter({'blocker': 1}),
 '2017-07': Counter({'high': 1, 'low': 1, 'normal': 1})}
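Both versions above rescan the whole issues list once per month. If that ever matters, the same result can be built in a single pass. A minimal sketch, assuming the issues list from the question and using defaultdict(Counter) as one possible container (the name plain is only for illustration):

```python
from collections import Counter, defaultdict

# Sample data copied from the question.
issues = [
    {'year_month': '2017-07', 'issue_priority': 'high', 'issue_number': 2153},
    {'year_month': '2017-07', 'issue_priority': 'normal', 'issue_number': 2179},
    {'year_month': '2017-07', 'issue_priority': 'low', 'issue_number': 2169},
    {'year_month': '2017-06', 'issue_priority': 'blocker', 'issue_number': 1998},
]

# Each month seen for the first time gets a fresh, empty Counter.
output = defaultdict(Counter)
for issue in issues:
    output[issue['year_month']][issue['issue_priority']] += 1

# Optional: convert to plain nested dicts, e.g. before serializing.
plain = {month: dict(counts) for month, counts in output.items()}
```

This trades the explicit split-then-count structure for one loop, so it is a matter of taste which one reads better.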

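Counter is a dict subclass, so you can access it as usual. Missing priority types are not a problem, because indexing a Counter with a key it has never seen returns 0 instead of raising a KeyError:

output['2017-06']['high']

evaluates to 0, the same result as the more explicit

output['2017-06'].get('high', 0)

Only a month that is absent from the outer dictionary, such as output['2017-05'], raises a KeyError.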
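For reference, here is a small check of how these lookups behave. collections.Counter returns 0 for a key it has never seen, so a KeyError can only come from a missing month in the outer plain dict. The output dictionary below is a hand-built copy of the result shown earlier, and the variable names are only for illustration:

```python
from collections import Counter

# Hand-built copy of the result shown above (not recomputed here).
output = {
    '2017-06': Counter({'blocker': 1}),
    '2017-07': Counter({'high': 1, 'low': 1, 'normal': 1}),
}

# A Counter returns 0 for a key it has never seen; no KeyError is raised.
missing_priority = output['2017-06']['high']

# The lookup does not insert the key as a side effect.
still_absent = 'high' not in output['2017-06']

# The outer container is a plain dict, so an unknown month needs .get();
# an empty Counter serves as the fallback.
missing_month = output.get('2017-05', Counter())['high']

print(missing_priority, still_absent, missing_month)  # 0 True 0
```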

Upvotes: 1

Brett Beatty

Reputation: 5963

One option would be to iterate your original list of issues and increment the count in your result manually:

def sort_by_priority(issues):
    result = {}
    for issue in issues:
        priority = issue['issue_priority']
        month = result.setdefault(issue['year_month'], {})
        month[priority] = month.get(priority, 0) + 1
    return result
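As a quick usage sketch, here is the function applied to the sample list from the question (the function body is repeated only so the snippet runs on its own):

```python
def sort_by_priority(issues):
    result = {}
    for issue in issues:
        priority = issue['issue_priority']
        # Fetch this month's inner dict, creating it on first sight.
        month = result.setdefault(issue['year_month'], {})
        month[priority] = month.get(priority, 0) + 1
    return result

# Sample data copied from the question.
issues = [
    {'year_month': '2017-07', 'issue_priority': 'high', 'issue_number': 2153},
    {'year_month': '2017-07', 'issue_priority': 'normal', 'issue_number': 2179},
    {'year_month': '2017-07', 'issue_priority': 'low', 'issue_number': 2169},
    {'year_month': '2017-06', 'issue_priority': 'blocker', 'issue_number': 1998},
]

result = sort_by_priority(issues)
print(result)
# {'2017-07': {'high': 1, 'normal': 1, 'low': 1}, '2017-06': {'blocker': 1}}
```

The result is made of plain dicts, so a priority that never occurred in a month still needs .get(priority, 0) when reading. The input dictionaries are only read, not modified.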

Upvotes: 2
