user2480542

Reputation: 2945

Aggregate Monthly Values

I have a Python list containing multiple lists:

 A = [['1/1/1999', '3.0'], 
      ['1/2/1999', '4.5'],
      ['1/3/1999', '6.8'],
      ......
      ......

      ['12/31/1999', '8.7']]

What I need is to combine all the values corresponding to each month, preferably in the form of a dictionary containing months as keys and their values as values.

Example:

   >>> A['1/99']
   >>> ['3.0', '4.5', '6.8'.....]

Or in the form of a list of lists, so that:

Example:

  >>> A[0]
  >>> ['3.0', '4.5', '6.8'.....]

Thanks.

Upvotes: 4

Views: 1828

Answers (3)

Here is my solution, without any imports:

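def getKeyValue(lst):
    # '1/2/1999' -> month '1', year '1999' -> key '1/99'
    a = lst[0].split('/')
    return '%s/%s' % (a[0], a[2][2:]), lst[1]

def createDict(lst):
    d = {}
    for e in lst:
        k, v = getKeyValue(e)
        # Start a new list the first time a month is seen
        if k not in d:
            d[k] = [v]
        else:
            d[k].append(v)
    return d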

A = [['1/1/1999', '3.0'],
     ['1/2/1999', '4.5'],
     ['1/3/1999', '6.8'],
     ['12/31/1999', '8.7']]

print(createDict(A))
# {'1/99': ['3.0', '4.5', '6.8'], '12/99': ['8.7']}

Upvotes: 1

Owen

Reputation: 1736

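    from collections import defaultdict
    from datetime import date

    month_aggregate = defaultdict(list)
    for d, v in A:
        month, day, year = map(int, d.split('/'))
        # Use a separate name for the key: rebinding `date` would shadow the
        # imported class and fail on the second iteration.
        key = date(year, month, 1)
        month_aggregate[key].append(v)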

I iterate over each date and value, pull out the year and month, and create a date from them with the day fixed to 1. I then append the value to the list associated with that year and month.
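As a rough illustration of how the date-keyed result might be used, here is a minimal sketch. It assumes only the four sample rows shown in the question; the names `january` and `by_month` are made up for this example. Because `date` objects sort chronologically, sorting the keys also gives the list-of-lists form the question mentions.

```python
from collections import defaultdict
from datetime import date

# A few sample rows in the question's format (not the full year).
A = [['1/1/1999', '3.0'],
     ['1/2/1999', '4.5'],
     ['1/3/1999', '6.8'],
     ['12/31/1999', '8.7']]

month_aggregate = defaultdict(list)
for d, v in A:
    month, day, year = map(int, d.split('/'))
    # Every day of a month maps to the same key: the first of that month.
    month_aggregate[date(year, month, 1)].append(v)

# Look up a single month by its first day.
january = month_aggregate[date(1999, 1, 1)]

# date keys sort chronologically, so this yields one list per month, in order.
by_month = [month_aggregate[k] for k in sorted(month_aggregate)]
```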

Alternatively, if you want to use a string as the key, you can do this:

    from collections import defaultdict

    month_aggregate = defaultdict(list)
    for d, v in A:
        month, day, year = d.split('/')
        month_aggregate[month + "/" + year[2:]].append(v)
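One thing to watch with string keys is ordering: plain string sorting puts `'12/99'` before `'2/99'`. The sketch below shows a lookup in the `'1/99'` form from the question and one possible way to order the keys chronologically. The helper `month_key` is hypothetical, and the February row is made up purely to show the ordering issue.

```python
from collections import defaultdict

A = [['1/1/1999', '3.0'],
     ['1/2/1999', '4.5'],
     ['1/3/1999', '6.8'],
     ['2/1/1999', '5.1'],     # made-up row, only here to show the ordering issue
     ['12/31/1999', '8.7']]

month_aggregate = defaultdict(list)
for d, v in A:
    month, day, year = d.split('/')
    month_aggregate[month + "/" + year[2:]].append(v)

# Lookup in the form the question asks for.
january = month_aggregate['1/99']

def month_key(key):
    """Hypothetical helper: turn 'M/YY' into (YY, M) integers for sorting."""
    month, year = key.split('/')
    return int(year), int(month)

lexicographic = sorted(month_aggregate)                 # string order
chronological = sorted(month_aggregate, key=month_key)  # calendar order
```

Note that two-digit years are ambiguous across centuries, so this ordering is only meaningful if all the data falls within one century.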

Upvotes: 2

Joe Kington

Reputation: 284582

Pandas is perfect for this, if you don't mind another dependency.

For example:

import pandas
import numpy as np

# Generate some data
dates = pandas.date_range('1/1/1999', '12/31/1999')
values = (np.random.random(dates.size) - 0.5).cumsum()

df = pandas.DataFrame(values, index=dates)

# Group rows by the month of their index; `group` avoids shadowing `values`
for month, group in df.groupby(lambda x: x.month):
    print(month)
    print(group)

The really neat thing, though, is aggregation of the grouped DataFrame. For example, if we wanted to see the min, max, and mean of the values grouped by month:

print(df.groupby(lambda x: x.month).agg(['min', 'max', 'mean']))

This yields:

         min       max      mean
1  -0.812627  1.247057  0.328464
2  -0.305878  1.205256  0.472126
3   1.079633  3.862133  2.264204
4   3.237590  5.334907  4.025686
5   3.451399  4.832100  4.303439
6   3.256602  5.294330  4.258759
7   3.761436  5.536992  4.571218
8   3.945722  6.849587  5.513229
9   6.630313  8.420436  7.462198
10  4.414918  7.169939  5.759489
11  5.134333  6.723987  6.139118
12  4.352905  5.854000  5.039873
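If pandas is not an option, a similar per-month summary can be sketched with the standard library alone. This is only an illustration, assuming the four sample rows from the question and converting the string values to `float`; the names `by_month` and `summary` are made up here. Like the `groupby` above, it groups by month number only and ignores the year.

```python
from collections import defaultdict
from statistics import mean

# A few sample rows in the question's format (values are strings).
A = [['1/1/1999', '3.0'],
     ['1/2/1999', '4.5'],
     ['1/3/1999', '6.8'],
     ['12/31/1999', '8.7']]

by_month = defaultdict(list)
for d, v in A:
    month = int(d.split('/')[0])       # month number only, year is ignored
    by_month[month].append(float(v))   # convert so min/max/mean are numeric

# month -> (min, max, mean), in month order
summary = {m: (min(vals), max(vals), mean(vals))
           for m, vals in sorted(by_month.items())}

for m, (lowest, highest, average) in summary.items():
    print('%2d  %.3f  %.3f  %.3f' % (m, lowest, highest, average))
```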

Upvotes: 3
