NewGuy

Reputation: 3413

Efficiently calculate averages of keys in dictionary

I have a dictionary that looks like this:

stories = {
    'TICKET_URL/3250': {'cycle_time': 0, 'lead_time': 2496441},
    'TICKET_URL/3323': {'cycle_time': 346087, 'lead_time': 508469},
    'TICKET_URL/3328': {'cycle_time': 249802, 'lead_time': 521211},
    'TICKET_URL/3352': {'cycle_time': 504791, 'lead_time': 504791},
    'TICKET_URL/3364': {'cycle_time': 21293, 'lead_time': 21293},
    'TICKET_URL/3367': {'cycle_time': 102558, 'lead_time': 189389},
    'TICKET_URL/3375': {'cycle_time': 98735,  'lead_time': 98766}
}

How can I efficiently calculate the average cycle_time and lead_time (independently)? Right now I'm iterating over the dictionary twice - once for cycle_time and once for lead_time. Can I do this in a single pass?

Currently:

average_cycle = (
    sum([story["cycle_time"] for story in stories.values()]) / len(stories)
)

Upvotes: 0

Views: 54

Answers (3)

robino16

Reputation: 315

I don't really see an issue with your current implementation. I would suggest adding an avg() function to improve readability:

def avg(x):
    # Arithmetic mean of a non-empty list
    return sum(x) / len(x)

avg_cycle = avg([story['cycle_time'] for story in stories.values()])
avg_lead_time = avg([story['lead_time'] for story in stories.values()])
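If a hand-written helper is not wanted, the standard library's statistics.fmean might serve the same purpose. This is only a minimal sketch, assuming the stories dictionary from the question; just two of its entries are copied here to keep it short.

```python
from statistics import fmean

# Two entries copied from the question's data, to keep the sketch short.
stories = {
    'TICKET_URL/3364': {'cycle_time': 21293, 'lead_time': 21293},
    'TICKET_URL/3375': {'cycle_time': 98735, 'lead_time': 98766},
}

# fmean accepts any iterable, so no intermediate list is built.
avg_cycle = fmean(story['cycle_time'] for story in stories.values())
avg_lead_time = fmean(story['lead_time'] for story in stories.values())
```

This still walks the dictionary once per field, so it addresses readability rather than the number of passes.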

Upvotes: 0

Marat

Reputation: 15738

If you don't mind pandas,

import pandas as pd

pd.DataFrame(stories.values()).mean().to_dict()

will produce:

{'cycle_time': 189038.0, 'lead_time': 620051.4285714285}

As a bonus, it also handles missing values: a key that is absent from an entry becomes NaN, which mean() skips by default.

Upvotes: 2

Leshawn Rice

Reputation: 548

count = len(stories)
cycle_total = 0
lead_total = 0
# Single pass: accumulate both totals together. A missing key counts as 0.
for story in stories.values():
    cycle_total += story.get("cycle_time", 0)
    lead_total += story.get("lead_time", 0)

cycle_avg = cycle_total / count
lead_avg = lead_total / count
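The same single-pass idea could be generalized so the field names are not hard-coded. The sketch below is one possible way to do that, not part of the original answer: average_by_key is a hypothetical helper name, and only two entries from the question's data are used. Note one behavioral difference from the loop above: a key missing from an entry is left out of that key's count instead of being treated as 0.

```python
from collections import defaultdict


def average_by_key(records):
    """Average every numeric field across an iterable of dicts in one pass.

    Hypothetical helper for illustration; assumes all values are numbers.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for record in records:
        for key, value in record.items():
            totals[key] += value
            counts[key] += 1  # only entries that actually have the key
    return {key: totals[key] / counts[key] for key in totals}


# Two entries copied from the question's data, to keep the sketch short.
stories = {
    'TICKET_URL/3364': {'cycle_time': 21293, 'lead_time': 21293},
    'TICKET_URL/3375': {'cycle_time': 98735, 'lead_time': 98766},
}

averages = average_by_key(stories.values())
```

Here averages is a dict with one average per field name, so adding a third timing field to the data would need no code change.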

Upvotes: 1
