user1901671

Reputation: 3

How to perform an operation on certain indexes in different lists of a list, and then group them together by another index

I am including a lot of code below to show the level of skill I am expected to use for this task. Beginner techniques only, please.

def get_monthly_averages(original_list):

#print(original_list)
daily_averages_list = [ ]
product_vol_close = [ ] # used for numerator
monthly_averages_numerator_list = [ ]
for i in range (0, len(original_list)):
    month_list = original_list[i][0][0:7]        #Cutting day out of the date leaving Y-M 
    volume_str = float(original_list[i][5])        #V
    adj_close_str = float(original_list[i][6])       #C
    daily_averages_sublists = [month_list,volume_str,adj_close_str]    #[Date,V,C]
    daily_averages_list.append(daily_averages_sublists)
for i in range (0, len(daily_averages_list)):      #Attempt at operation
    vol_close = daily_averages_list[i][1]*daily_averages_list[i][2]
    month_help = daily_averages_list[i][0]
    product_vol_sublists = [month_help,vol_close]
    product_vol_close.append(product_vol_sublists)
    print(product_vol_close)
    for i in range (0, len(product_vol_close)):     #<-------TROUBLE STARTS
        for product_vol_close[i][0]==product_vol_close[i][0]:  #When the month is the same
            monthly_averages_numerator = product_vol_close[i][1]+product_vol_close[i][1]
          # monthly_averages_numerator = sum(product_vol_close[i][1])         #tried both
            month_assn = product_vol_close[i][0]
            numerator_list_sublists = [month_assn,monthly_averages_numerator]                
            monthly_averages_numerator_list.append(numerator_list_sublists)
            print(monthly_averages_numerator_list)

Original List is in form:

[['2004-08-30', '105.28', '105.49', '102.01', '102.01', '2601000', '102.01'],
['2004-08-27', '108.10', '108.62', '105.69', '106.15', '3109000', '106.15'], 
['2004-08-26', '104.95', '107.95', '104.66', '107.91', '3551000', '107.91'],
['2004-08-25', '104.96', '108.00', '103.88', '106.00', '4598900', '106.00'],
['2004-08-24', '111.24', '111.60', '103.57', '104.87', '7631300', '104.87'], 
['2004-08-23', '110.75', '113.48', '109.05', '109.40', '9137200', '109.40'], 
['2004-08-20', '101.01', '109.08', '100.50', '108.31', '11428600', '108.31'],
['2004-08-19', '100.00', '104.06', '95.96', '100.34', '22351900', '100.34']]

Index 0 is the date, index 5 is V (volume), and index 6 is C (adjusted close).

I need to perform the operation below for each month individually and end up with a list of tuples with two elements each: element 0 is the year-month and element 1 is the 'average_price' shown below. That means taking the values at indexes 5 and 6 from each list within the original list and combining them as follows. (I need to use beginner techniques for my class; thank you for understanding.)

average_price = (V1* C1 + V2 * C2 +...+ Vn * Cn)/(V1 + V2 +...+ Vn)

(V = the element at index 5 of each list, C = the element at index 6 of each list)

My problem is performing the above calculation for one month at a time rather than over the whole list, and then producing a result such as:

[('month1',average_price),('month2',average_price),...]

I made up the following lines to try to show what I want the code to do, which is to group the entries together when they are in the same month and year:

for i in range (0, len(product_vol_close)):     #<-------TROUBLE STARTS
    for product_vol_close[i][0]==product_vol_close[i][0]:   

I can't find any answers on how to get this to work the way I want it to.

If there is still confusion please comment! Thank you again for your patience, understanding and help on this matter!

I am completely lost.

Upvotes: 0

Views: 105

Answers (2)

Burhan Khalid

Reputation: 174662

The key here is to stop using lists and use a dictionary, which will take care of grouping things together for you.

Typically you would use defaultdict from the collections module, but as this looks like homework, that may not be allowed, so here is the "long" way to do it.

In your sample data there is only one row for each date, so I will assume the same in the code snippet. To make our lives easier, we'll key the data by year-month, since that's what the calculations are based on:

>>> date_scores = {}
>>> for i in data:  # data is the original list of rows
...     year_month = i[0][:7]  # this will be our key for the dictionary
...     if year_month not in date_scores:
...         # Check whether the key exists yet; if it doesn't, initialize
...         # it with an empty list, to which we will add the data for
...         # each day.
...         date_scores[year_month] = []
...     # Add the rest of the row (everything after the date) to the list
...     # for this year-month combination.
...     date_scores[year_month].append(i[1:])
... 
>>> date_scores
{'2004-08': [['105.28', '105.49', '102.01', '102.01', '2601000', '102.01'], ['108.10', '108.62', '105.69', '106.15', '3109000', '106.15'], ['104.95', '107.95', '104.66', '107.91', '3551000', '107.91'], ['104.96', '108.00', '103.88', '106.00', '4598900', '106.00'], ['111.24', '111.60', '103.57', '104.87', '7631300', '104.87'], ['110.75', '113.48', '109.05', '109.40', '9137200', '109.40'], ['101.01', '109.08', '100.50', '108.31', '11428600', '108.31'], ['100.00', '104.06', '95.96', '100.34', '22351900', '100.34']]}

Now for each year-month combination, we have a list in the dictionary. This list has sub-lists for each day in that month for which we have data. Now we can do things like:

>>> print('We have data for {} days for 2004-08'.format(len(date_scores['2004-08'])))
We have data for 8 days for 2004-08

I think this solves the majority of your problems with the loop.
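
To finish the job from here, one possible sketch (not part of the original answer) loops over each month's list and applies the weighted-average formula from the question. The function name monthly_averages_from_dict is made up for illustration, and the code assumes every month has a nonzero total volume. Because the date was sliced off with i[1:], V moves from index 5 to index 4 and C from index 6 to index 5.

```python
def monthly_averages_from_dict(data):
    # Hypothetical helper: group rows by 'YYYY-MM', then average each group.
    date_scores = {}
    for row in data:
        year_month = row[0][:7]
        if year_month not in date_scores:
            date_scores[year_month] = []
        date_scores[year_month].append(row[1:])  # drop the date column

    results = []
    for year_month in sorted(date_scores):  # 'YYYY-MM' strings sort by date
        numerator = 0.0    # V1*C1 + V2*C2 + ... + Vn*Cn
        denominator = 0.0  # V1 + V2 + ... + Vn
        for day in date_scores[year_month]:
            volume = float(day[4])     # index 5 in the original row
            adj_close = float(day[5])  # index 6 in the original row
            numerator += volume * adj_close
            denominator += volume
        # Assumes denominator is not zero for any month.
        results.append((year_month, numerator / denominator))
    return results
```

Calling it on the original list should give a list of ('year-month', average_price) tuples, ordered by month.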

Upvotes: 1

Blckknght

Reputation: 104752

My suggestion is to stick to a single main loop over the rows of your data. Something like this (pseudocode):

current_month = None
monthly_value = []
monthly_volume = []

for row in data:
    date, volume, price = parse(row) # you need to write this yourself
    month = month_from_date(date) # this too

    if month != current_month: # do initialization for each new month
        current_month = month
        monthly_value.append(0)
        monthly_volume.append(0)

    monthly_value[-1] += volume*price # indexing with -1 gives last value
    monthly_volume[-1] += volume

You can then do a second loop to compute the averages. Note that this requires that your data be grouped by month. If your data is not so nicely organized, you could replace the lists in the above code with dictionaries (indexed by month). Or you could use a defaultdict (from the collections module in the standard library) which wouldn't require any per-month initialization. But perhaps that's a little more advanced than you want.
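
As a rough illustration of how the pseudocode might be filled in with beginner techniques only, here is one possible version. It assumes the row layout from the question (date at index 0, volume at index 5, adjusted close at index 6), that rows belonging to the same month are next to each other, and that no month has a total volume of zero. The months list is an addition that is not in the pseudocode; it remembers which label goes with each running total.

```python
def get_monthly_averages(original_list):
    months = []          # added for illustration: one label per month seen
    monthly_value = []   # running sum of volume * price per month
    monthly_volume = []  # running sum of volume per month
    current_month = None

    for row in original_list:
        month = row[0][0:7]     # 'YYYY-MM' part of the date
        volume = float(row[5])
        price = float(row[6])

        if month != current_month:  # first row of a new month
            current_month = month
            months.append(month)
            monthly_value.append(0.0)
            monthly_volume.append(0.0)

        monthly_value[-1] += volume * price  # -1 is the most recent month
        monthly_volume[-1] += volume

    # Second loop: divide each month's numerator by its denominator.
    averages = []
    for i in range(len(months)):
        averages.append((months[i], monthly_value[i] / monthly_volume[i]))
    return averages
```

The months appear in the result in the same order as they appear in the input, so data sorted newest-first gives the newest month first.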

Upvotes: 0
