mnmbs

Reputation: 363

Mean value of multikey dictionary

I want to find the mean price of an item in a dictionary whose keys are (item, shop) pairs and whose values are prices.

Example dictionary:

{('item1', 'shop1'): 40,
('item2', 'shop2'): 14,
('item1', 'shop3'): 55}

For example, I want to find the mean price of item1. Is this possible with a multikey dictionary, or should I change the data structure? Any ideas?

Thanks

Upvotes: 1

Views: 92

Answers (4)

Steve Misuta

Reputation: 1033

I would solve this using a pandas DataFrame:

import random
import pandas as pd

# create a test dict like the one in the question
my_dict = dict(zip(
    [('item'+str(i), 'shop'+str(k)) for i in range(5) for k in range(3)],
    [random.randint(1, 10) for j in range(15)]))

# create a DataFrame with a MultiIndex
ndx = pd.MultiIndex.from_tuples(list(my_dict.keys()), names=['item', 'shop'])
df = pd.DataFrame(list(my_dict.values()), index=ndx, columns=['price'])
print('\n', df)

# reset index and use groupby to get means
# (select 'price' explicitly so the string column 'shop' is not averaged)
df.reset_index(inplace=True)
item_mean = df.groupby('item')[['price']].mean()
print('\n', item_mean)

              price
item  shop        
item3 shop0      5
      shop2      3
item1 shop0      4
item3 shop1      7
item4 shop0      7
item0 shop0     10
item2 shop1      3
      shop0      2
item1 shop1     10
item4 shop2      5
      shop1      3
item1 shop2      2
item0 shop1      1
      shop2      8
item2 shop2      7

           price
item           
item0  6.333333
item1  5.333333
item2  4.000000
item3  5.000000
item4  5.000000

Upvotes: 0

hugos

Reputation: 1323

It is possible. I'm not sure this is the right data structure for your problem, but you can do it like this.

First, select all the keys for the item you want (a is the dictionary); here I'm selecting 'item1':

# a list, so len() works in Python 3, where filter() returns an iterator
interesting_keys = [k for k in a if k[0] == 'item1']

Now sum the matching values and divide by the number of elements:

result = sum(a[k] for k in interesting_keys) / len(interesting_keys)
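The same filter-then-average step can be sketched with the standard library's `statistics.mean`. This is only an illustration: the names `prices` and `mean_price` are assumptions, not part of the question, and the data is the three-entry example dictionary with its closing brace added.

```python
from statistics import mean

# Sample data from the question; `prices` is an illustrative name.
prices = {
    ('item1', 'shop1'): 40,
    ('item2', 'shop2'): 14,
    ('item1', 'shop3'): 55,
}

def mean_price(prices, item):
    """Return the mean price of `item` across all shops that list it."""
    # Unpack each (item, shop) key and keep only the prices of the wanted item.
    values = [price for (name, _shop), price in prices.items() if name == item]
    if not values:
        # No shop sells this item; fail loudly instead of dividing by zero.
        raise KeyError(item)
    return mean(values)

print(mean_price(prices, 'item1'))  # 47.5
```

Raising `KeyError` for an unknown item is a design choice of this sketch; returning `None` would work just as well if missing items are expected.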

If you want to create a new dictionary with one entry per item, mapping each item to its mean price, you can do something like this (Python 3):

def group_prices(prices):
    grouped_prices = {}  # item -> sum of its prices
    number_items = {}    # item -> number of shops selling it
    for k, v in prices.items():
        grouped_prices[k[0]] = grouped_prices.get(k[0], 0) + v
        number_items[k[0]] = number_items.get(k[0], 0) + 1
    return {k: v / number_items[k] for (k, v) in grouped_prices.items()}
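A variant of the grouping idea, assuming Python 3, collects the prices per item in a `collections.defaultdict` and averages each list once. The names `mean_by_item` and `example` are hypothetical and used only for this sketch.

```python
from collections import defaultdict
from statistics import mean

def mean_by_item(prices):
    """Map each item to its mean price over all (item, shop) keys."""
    grouped = defaultdict(list)
    for (item, _shop), price in prices.items():
        # Collect every price seen for this item, whichever shop it came from.
        grouped[item].append(price)
    return {item: mean(values) for item, values in grouped.items()}

# Example data from the question.
example = {
    ('item1', 'shop1'): 40,
    ('item2', 'shop2'): 14,
    ('item1', 'shop3'): 55,
}

means = mean_by_item(example)
print(means['item1'])  # 47.5
```

Keeping the individual prices in lists, rather than running sums and counts, makes it easy to swap `mean` for another statistic such as `median`.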

Upvotes: 1

Andy Hayden

Reputation: 375715

Since this is tagged pandas: if you make the dictionary a pandas Series, you can group by the 0th level of its index:

In [11]: d = {('item1', 'shop1'): 40, ('item2', 'shop2'): 14,('item1', 'shop3'): 55}

In [12]: s = pd.Series(d)

In [13]: s
Out[13]:
item1  shop1    40
       shop3    55
item2  shop2    14
dtype: int64

In [14]: s.groupby(level=0).mean()
Out[14]:
item1    47.5
item2    14.0
dtype: float64

Upvotes: 1

Joe T. Boka

Reputation: 6581

You can create a pandas DataFrame from nested lists, then use groupby to get the mean you're looking for.

    import pandas as pd
    df = pd.DataFrame([['item1', 'shop1', 40],
    ['item2', 'shop2', 14],
    ['item1', 'shop3', 55]], columns=('item', 'shop', 'price'))
    df
        item    shop    price
    0   item1   shop1   40
    1   item2   shop2   14
    2   item1   shop3   55
    result_mean = df.groupby('item')['price'].mean()
    result_mean
    item
    item1    47.5
    item2    14.0
    Name: price, dtype: float64

Upvotes: 1
