kenorb

Reputation: 166677

How to calculate average of last items in 'for' loop?

I have the following Python 3 code:

import random
import numpy as np
data = []
for i in range(0, 100):
    value = random.randrange(100)
    avg10 = np.average(data[:-10]['value'])
    data += [{'value': value, 'avg10': avg10}]

which aims to generate 100 random numbers in the list along with the average of the last 10 items.

Unfortunately the code fails with:

Traceback (most recent call last):
  File "avg_test.py", line 6, in <module>
    avg10 = np.average(data[:-10]['value'])
TypeError: list indices must be integers or slices, not str

I'm not sure how to access the last 10 value items (or fewer, if 10 are not yet available) from the list of dictionaries and pass them to numpy's average function.

For example, I expect output something like:

[{'value': 11, 'avg10': 11}, {'value': 62, 'avg10': 36.5}, {'value': 56, 'avg10': 43}, {'value': 48, 'avg10': 44.25}, {'value': 43, 'avg10': 44.0}]

and so on.

Here avg10 is the average of the last 10 items (at most) relative to the current one (ideally including the current one, but it doesn't have to). If there is only one previous element, it is the average of 1 element; if there are two, it is the average of two value items; and so on, up to a maximum of the last 10 items.

What would be the correct syntax in this case?

Upvotes: 1

Views: 3319

Answers (5)

mquantin

Reputation: 1158

To keep your dict solution and avoid problems with the first slices, you can do the following.

Your line:

avg10 = np.average(data[:-10]['value']) 

should be:

avg10 = np.average([data[j]['value'] if j>=0 else value for j in range(i-10, i) ])

But then, for the first items, the mean is NOT the average of the 10 previous results, as 10 previous results are not yet available.
Note: here every missing slot (j < 0) is filled with the current value, so the first value is its own mean and the following early means are weighted toward the current value. This is strange. You can change this behaviour by padding with a value of your choice instead (firstAverage is a placeholder you have to define yourself):

avg10 = np.average([data[j]['value'] if j>=0 else firstAverage for j in range(i-10, i) ])

If you decide to include the value itself in the last 10 for the average (i.e. each dict holds the value and the mean of the previous 9 items plus the value itself), then the first item (which has no previous item) is no longer a special case, and you can do:

for i in range(0, 100):
    value = random.randrange(100)
    lasts = [data[j]['value'] for j in range(i-9, i) if j>=0]
    lasts.append(value)
    avg10 = np.average(lasts)
    data += [{'value': value, 'avg10': avg10}]

In this last case, you can edit your question to be more precise ;)
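
As a rough sketch of the same "include the current value" variant, a collections.deque with maxlen=10 can hold the window so that the oldest value is dropped automatically. The helper name rolling_averages and its window parameter are made up for illustration, and plain sum()/len() is used in place of np.average:

```python
from collections import deque
import random

def rolling_averages(values, window=10):
    """Pair each value with the mean of itself and up to window-1 values before it."""
    recent = deque(maxlen=window)  # once full, appending drops the oldest entry
    data = []
    for value in values:
        recent.append(value)
        data.append({'value': value, 'avg10': sum(recent) / len(recent)})
    return data

# 100 random values, as in the question
data = rolling_averages(random.randrange(100) for _ in range(100))
```

The deque avoids index arithmetic entirely, at the cost of keeping the window separate from the list of dicts.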

Upvotes: 2

kenorb

Reputation: 166677

Here is the complete solution, where the average also takes the current value into account:

import random
import numpy as np
data = []
for i in range(0, 200):
    value = random.randrange(100)
    avg10 = np.average([x['value'] for x in data[-min(len(data), 10):]] + [value])
    data += [{'value': value, 'avg10': avg10}]

Basically, np.average() accepts an array-like of values to average, so the list of dictionaries needs to be flattened into a plain list with a list comprehension. For the slice, -min(len(data), 10): fetches the last 10 items, or fewer depending on the current size of data. Since the current value is appended on top of that, each average covers up to 11 values.


To understand the above sample more easily, here is a simpler helper example:

>>> data = []
>>> for i in range(0, 10):
...     index = -min(len(data), 5)
...     data += [i]
...     print(i, index, data[index:])
... 
0 0 [0]
1 -1 [1]
2 -2 [1, 2]
3 -3 [1, 2, 3]
4 -4 [1, 2, 3, 4]
5 -5 [1, 2, 3, 4, 5]
6 -5 [2, 3, 4, 5, 6]
7 -5 [3, 4, 5, 6, 7]
8 -5 [4, 5, 6, 7, 8]
9 -5 [5, 6, 7, 8, 9]
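
A side note, as a small sketch rather than a change to the solution above: Python slices clamp out-of-range bounds, so a plain negative slice such as data[-5:] already returns fewer items when the list is short, and the min() guard is optional. Unlike the helper above, the slice here is taken after appending, and the windows list is only there for illustration:

```python
data = []
windows = []  # collected only to inspect the slices afterwards
for i in range(8):
    data.append(i)
    # no min() needed: a start index before the beginning of the list is clamped
    windows.append(data[-5:])
    print(i, data[-5:])
```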

Upvotes: 2

aprasanth

Reputation: 1099

Try this:

import random
import numpy as np

data = []
for i in range(0, 100):
    value = random.randrange(100)
    avg10 = np.average([d['value'] for d in data[-9:]] + [value])  # current value plus up to 9 previous ones
    data.append({'value': value, 'avg10': avg10})
print(data)

Upvotes: 0

Sina

Reputation: 215

import random
import numpy as np
data = []
values = []  # all values generated so far
for i in range(0, 100):
    value = random.randrange(100)
    values.append(value)
    # average the current value and up to 9 values before it
    avg10 = np.average(values[max(i-9, 0):i+1])
    data += [{'value': value, 'avg10': avg10}]

Upvotes: 0

hoyland

Reputation: 1824

The error message gives a good hint for where to look: "list indices must be integers or slices, not str". In other words, we have to look for somewhere we're using a string as an index of a list.

data is a list of dicts. Therefore, data[:-10] is also a list of dicts, meaning data[:-10]['value'] doesn't make sense. Note also that data[:-10] is everything except the last 10 items; the last 10 items are data[-10:]. You want something like [x['value'] for x in data[-10:]] instead, iterating over the list of dicts.
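
Applied to the loop from the question, that could look like the sketch below. It averages only the previous items, and the fallback to the current value when there are no previous items yet is an assumption of this sketch, not something the question specifies:

```python
import random
import numpy as np

data = []
for i in range(100):
    value = random.randrange(100)
    previous = [x['value'] for x in data[-10:]]  # values of up to 10 earlier items
    # assumption: with no earlier items, fall back to the current value
    avg10 = float(np.average(previous)) if previous else float(value)
    data.append({'value': value, 'avg10': avg10})
```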

Upvotes: 0
