Reputation: 10953
Suppose I have an iterator
numbers = iter(range(100))
and I want to compute running (cumulative) mean values and store them in an iterable with elements
0., 0.5, ..., 49., 49.5
This could be done by converting the iterable to a list/tuple and computing the mean of each of its prefix slices, like
from statistics import mean
# in cases with large or potentially infinite amounts of data
# this conversion will fail
numbers_list = list(numbers)
numbers_slices = (numbers_list[:end + 1] for end in range(len(numbers_list)))
mean_values = map(mean, numbers_slices)
(more info about the mean function in the docs)
So my question is more general: is there any way to get consecutive slices of an iterable using the standard library, without wrapping it in a list/tuple?
We can write a utility function like
def get_slices(iterable):
    elements = []
    for element in iterable:
        elements.append(element)
        yield elements
and then
numbers_slices = get_slices(numbers)
mean_values = map(mean, numbers_slices)
but it also looks awful
P.S.: I know that it would be better to compute the running mean values like
def get_mean_values(numbers):
    numbers_sum = 0
    for numbers_count, number in enumerate(numbers, start=1):
        numbers_sum += number
        yield numbers_sum / numbers_count
but it is not what I am talking about.
Upvotes: 2
Views: 1114
Reputation: 10953
It seems like there is no standard way of getting consecutive slices of an iterable (iterator/list/tuple/etc.), so the best way I've found so far is to use a slightly modified version of the utility function from the original question:
def consecutive_slices(iterable):
    elements = []
    for element in iterable:
        elements.append(element)
        yield list(elements)
Modifications: added copying of elements (btw there are many ways of doing that), because with the previous version, wrapping the result in a list
>>> numbers_slices = list(get_slices(numbers))
will give us a list with N repetitions of the same elements list, each containing all the numbers (N equals 100 in the example):
>>> numbers_slices == [list(range(100))] * 100
True
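The aliasing can be seen directly on a small input. This is only a sketch: get_slices_shared is a name made up here for the question's version, which yields the same list object on every step.

```python
def get_slices_shared(iterable):
    # Question's version: yields the SAME list object on every step.
    elements = []
    for element in iterable:
        elements.append(element)
        yield elements


def consecutive_slices(iterable):
    # Modified version: yields an independent copy on every step.
    elements = []
    for element in iterable:
        elements.append(element)
        yield list(elements)


shared = list(get_slices_shared(range(3)))
copied = list(consecutive_slices(range(3)))

print(shared)  # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(copied)  # [[0], [0, 1], [0, 1, 2]]
print(all(item is shared[0] for item in shared))  # True
```

Consuming the slices lazily one at a time (as map(mean, ...) does) hides the problem, which is why the uncopied version appears to work in the question.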
After writing a bit more I realized that this can also be done using the itertools module, like
from itertools import (accumulate,
                       chain)


def consecutive_slices(iterable):
    def collect_elements(previous_elements, element):
        return previous_elements + [element]

    return accumulate(chain(([],), iterable), collect_elements)
Here we prepend an empty list using chain as the initial slice, which can be dropped from the result using islice, like
from itertools import islice
...
islice(consecutive_slices(range(10)), 1, None)
but it seems legit to leave it in as one of the slices, since an empty slice is also a slice after all.
Compared with the previous solution, this is still a four-line function that does nearly the same thing, but it is less "spaghetti" IMO.
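As a side note, on Python 3.8 and newer accumulate accepts an initial keyword argument, so the chain call may not be needed. A minimal sketch under that version assumption:

```python
from itertools import accumulate, islice


def consecutive_slices(iterable):
    # `initial=[]` plays the role of the prepended empty list (Python 3.8+).
    # Each step builds a new list, so earlier slices are never mutated.
    return accumulate(
        iterable,
        lambda previous_elements, element: previous_elements + [element],
        initial=[],
    )


print(list(consecutive_slices(range(3))))
# [[], [0], [0, 1], [0, 1, 2]]
print(list(islice(consecutive_slices(range(3)), 1, None)))
# [[0], [0, 1], [0, 1, 2]]
```

Like the chain version, this copies the previous slice on every step, so the total work grows quadratically with the number of elements.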
Upvotes: 2
Reputation: 42678
Take a look at itertools.islice
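import itertools
def get_slices(iterable):
    # Needs a sized, re-iterable input (e.g. a list); a one-shot iterator
    # has no len() and would be consumed by the first islice.
    return map(lambda x: itertools.islice(iterable, x), range(len(iterable)))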
If you don't know the length, here is a reduce version, which is highly inefficient in memory:
from functools import reduce
numbers = (number for number in range(1,100))
mean = lambda x, y: (x+y)/float(2)
reduce(lambda x, y: x + [mean(x[-1], y)], numbers, [0])
[0.0, 0.5, 1.25, 2.125, 3.0625, 4.03125, 5.015625, 6.0078125, 7.00390625, 8.001953125, 9.0009765625, 10.00048828125, 11.000244140625, 12.0001220703125, 13.00006103515625, 14.000030517578125, 15.000015258789062, 16.00000762939453, 17.000003814697266, 18.000001907348633, 19.000000953674316, 20.000000476837158, 21.00000023841858, 22.00000011920929, 23.000000059604645, 24.000000029802322, 25.00000001490116, 26.00000000745058, 27.00000000372529, 28.000000001862645, 29.000000000931323, 30.00000000046566, 31.00000000023283, 32.000000000116415, 33.00000000005821, 34.000000000029104, 35.00000000001455, 36.000000000007276, 37.00000000000364, 38.00000000000182, 39.00000000000091, 40.000000000000455, 41.00000000000023, 42.000000000000114, 43.00000000000006, 44.00000000000003, 45.000000000000014, 46.00000000000001, 47.0, 48.0, 49.0, 50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0, 61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0, 81.0, 82.0, 83.0, 84.0, 85.0, 86.0, 87.0, 88.0, 89.0, 90.0, 91.0, 92.0, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0]
So, in the end we are doing almost the same as your code, so you should go with it, or use a list instead of a generator and then take slices of the list (itertools.islice).
EDIT:
I've been thinking about this. It is easily solved with Haskell's scanl, so I generalized the concept and got a very good result:
def scanl(f, g):
    n = next(g)
    yield n
    for e in g:
        n = f(n, e)
        yield n
list(scanl(mean, numbers))
[0, 0.5, 1.25, 2.125, 3.0625, 4.03125, 5.015625, 6.0078125, 7.00390625, 8.001953125, 9.0009765625, 10.00048828125, 11.000244140625, 12.0001220703125, 13.00006103515625, 14.000030517578125, 15.000015258789062, 16.00000762939453, 17.000003814697266, 18.000001907348633, 19.000000953674316, 20.000000476837158, 21.00000023841858, 22.00000011920929, 23.000000059604645, 24.000000029802322, 25.00000001490116, 26.00000000745058, 27.00000000372529, 28.000000001862645, 29.000000000931323, 30.00000000046566, 31.00000000023283, 32.000000000116415, 33.00000000005821, 34.000000000029104, 35.00000000001455, 36.000000000007276, 37.00000000000364, 38.00000000000182, 39.00000000000091, 40.000000000000455, 41.00000000000023, 42.000000000000114, 43.00000000000006, 44.00000000000003, 45.000000000000014, 46.00000000000001, 47.0, 48.0, 49.0, 50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0, 61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0, 81.0, 82.0, 83.0, 84.0, 85.0, 86.0, 87.0, 88.0, 89.0, 90.0, 91.0, 92.0, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0]
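For what it's worth, this scanl seems to behave like itertools.accumulate with the arguments swapped, at least for non-empty input. Note also that folding with a two-argument mean computes (previous_result + element) / 2 at each step, which is why the output runs 0, 0.5, 1.25, ... rather than the 0., 0.5, 1., ... asked for in the question. A small sketch checking the equivalence (pairwise_mean is just an illustrative name for the lambda above):

```python
from itertools import accumulate


def scanl(f, g):
    # Same generator as in this answer; g must be a non-empty iterator.
    n = next(g)
    yield n
    for e in g:
        n = f(n, e)
        yield n


def pairwise_mean(x, y):
    # Averages the previous result with the next element.
    return (x + y) / 2


from_scanl = list(scanl(pairwise_mean, iter(range(5))))
from_accumulate = list(accumulate(range(5), pairwise_mean))

print(from_scanl == from_accumulate)  # True
print(from_accumulate)  # [0, 0.5, 1.25, 2.125, 3.0625]
```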
Upvotes: 0
Reputation: 9988
You could have a generator that directly yields the means, with local variables containing the running total and count. (Actually you could get the count for free by iterating over enumerate(iterable) and adding 1 to the index.) Is that enough of a hint?
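One possible way to spell out that hint with only the standard library is to let itertools.accumulate keep the running total. This is a sketch, and running_means is a name chosen here for illustration:

```python
from itertools import accumulate


def running_means(iterable):
    # accumulate() yields running totals lazily; enumerate() supplies the count.
    for count, total in enumerate(accumulate(iterable), start=1):
        yield total / count


numbers = iter(range(100))
mean_values = list(running_means(numbers))
print(mean_values[:4], mean_values[-2:])
# [0.0, 0.5, 1.0, 1.5] [49.0, 49.5]
```

Since nothing is stored beyond the current total and count, this should also work on very large or infinite iterators as long as the result is consumed lazily.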
Upvotes: 1