Reputation: 188014
I would like to create a range (e.g. (1, 5)) of numbers with some repetitions (e.g. 4):
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
One way would be to write:
list(itertools.chain(*([x] * 4 for x in range(1, 5))))
Or similarly:
list(itertools.chain(*(itertools.repeat(x, 4) for x in range(1, 5))))
However, both involve a flattening step, which could be avoided.
Is there a more pythonic or more compact version to generate such a sequence?
Upvotes: 8
Views: 4448
Reputation: 152647
One option, although it requires installing a package, would be iteration_utilities.replicate:
>>> from iteration_utilities import replicate
>>> list(replicate(range(1, 5), 4))
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
In case you don't want to install that package, the replicate function is essentially equivalent to this function:
from itertools import repeat

def replicate(items, repeats):
    for item in items:
        for _ in repeat(None, repeats):
            yield item
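As a quick check of the pure-Python version, here is a minimal sketch that runs it on a range and on a string. The function body follows the answer; the sample inputs and the variable names `numbers` and `letters` are only illustrative.

```python
from itertools import repeat


def replicate(items, repeats):
    """Yield each item of `items` `repeats` times in a row."""
    for item in items:
        # repeat(None, repeats) only serves as a counter here
        for _ in repeat(None, repeats):
            yield item


numbers = list(replicate(range(1, 5), 4))
letters = list(replicate("ab", 2))
print(numbers)  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
print(letters)  # ['a', 'a', 'b', 'b']
```

Because it is a generator, nothing is computed until the result is consumed, so it also works on iterables that are not lists.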
Just in case you're interested in the performance, I did some micro-benchmarks for several (not all) of the proposed alternatives:
As you can see, the NumPy and iteration_utilities approaches were fastest, while all the other approaches were roughly equally fast.
It's kind of interesting to note that, of these other approaches, the list.extend approach was fastest and (my) custom generator slowest. I didn't expect that.
And here is the code to replicate the benchmark:
from iteration_utilities import replicate
from itertools import chain, repeat
import numpy as np
def replicate_generator_impl(upper):
    for item in range(1, upper):
        for _ in repeat(None, 4):
            yield item

def replicate_generator(upper):
    return list(replicate_generator_impl(upper))

def iteration_utilities_replicate(upper):
    return list(replicate(range(1, upper), 4))

def double_comprehension(upper):
    return [i for i in range(1, upper) for _ in range(4)]

def itertools_chain(upper):
    return list(chain(*([x] * 4 for x in range(1, upper))))

def itertools_chain_from_iterable(upper):
    return list(chain.from_iterable(repeat(i, 4) for i in range(1, upper)))

def extend(upper):
    a = []
    for i in range(1, upper):
        a.extend([i] * 4)
    return a

def numpy_repeat(upper):
    return np.repeat(np.arange(1, upper), 4)

from simple_benchmark import benchmark
funcs = [replicate_generator, iteration_utilities_replicate, double_comprehension, itertools_chain, itertools_chain_from_iterable, extend, numpy_repeat]
arguments = {2**i: 2**i for i in range(1, 15)}
b = benchmark(funcs, arguments, argument_name='size')
b.plot()
In case you were wondering what it looks like without the NumPy approach:
Disclaimer: I'm the author of iteration_utilities and simple_benchmark.
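Before comparing timings, it can be worth confirming that the candidates return the same sequence. The following is a minimal sketch using only the standard-library variants from the benchmark; `all_agree` is a hypothetical helper introduced here for illustration and is not part of iteration_utilities or simple_benchmark.

```python
from itertools import chain, repeat


def double_comprehension(upper):
    return [i for i in range(1, upper) for _ in range(4)]


def itertools_chain(upper):
    return list(chain(*([x] * 4 for x in range(1, upper))))


def itertools_chain_from_iterable(upper):
    return list(chain.from_iterable(repeat(i, 4) for i in range(1, upper)))


def extend(upper):
    a = []
    for i in range(1, upper):
        a.extend([i] * 4)
    return a


def all_agree(funcs, upper):
    """Hypothetical helper: True if every function returns the same list."""
    results = [func(upper) for func in funcs]
    return all(result == results[0] for result in results)


funcs = [double_comprehension, itertools_chain, itertools_chain_from_iterable, extend]
print(all_agree(funcs, 5))  # True
```

The NumPy variant is left out of this check because it returns an array rather than a list, so comparing it with `==` would need an explicit conversion first.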
Upvotes: 1
Reputation: 164653
There is nothing wrong with your solution, but you can use chain.from_iterable to avoid the unpacking step.
Otherwise, my only other recommendation is NumPy, if you are happy to use a third-party library.
from itertools import chain, repeat
import numpy as np
# list solution
res = list(chain.from_iterable(repeat(i, 4) for i in range(1, 5)))
# NumPy solution
arr = np.repeat(np.arange(1, 5), 4)
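One practical difference between chain(*...) and chain.from_iterable is laziness: unpacking with * has to exhaust the outer generator before chain is even called, while from_iterable pulls one inner iterable at a time. A minimal sketch, where the endless source built with itertools.count is an assumption chosen purely for illustration:

```python
from itertools import chain, count, islice, repeat

# An endless outer generator: chain(*...) would try to unpack it completely
# and never return, whereas chain.from_iterable consumes it lazily.
endless = chain.from_iterable(repeat(i, 4) for i in count(1))

first_ten = list(islice(endless, 10))
print(first_ten)  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
```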
Upvotes: 8
Reputation: 25094
I just wanted to mention that extend might be an option too. It may not be as beautiful as a one-line list comprehension, but it will perform better as the size of the buckets increases:
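def listExtend():
    a = []
    for i in range(1, 5):
        a.extend([i] * 4)
    return a

def listComprehension():
    # Note: this builds [i, x] pairs rather than the flat sequence,
    # so the two functions do not produce the same result.
    return [[i, x] for i in range(1, 5) for x in range(4)]
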
import timeit
print(timeit.timeit(stmt="listComprehension()", setup="from __main__ import listComprehension", number=10**7))
print(timeit.timeit(stmt="listExtend()", setup="from __main__ import listExtend", number=10**7))
14.2532608
8.78004566
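For a like-for-like comparison, both functions should build the same flat list. The following is a minimal sketch under that assumption; the names `list_extend` and `flat_comprehension` and the parameters `start`, `stop` and `times` are introduced here for illustration, and no timing results are claimed since they depend on the machine and Python version.

```python
import timeit


def list_extend(start=1, stop=5, times=4):
    result = []
    for i in range(start, stop):
        result.extend([i] * times)
    return result


def flat_comprehension(start=1, stop=5, times=4):
    return [i for i in range(start, stop) for _ in range(times)]


# Both functions must agree before timing them makes sense
assert list_extend() == flat_comprehension()

for func in (list_extend, flat_comprehension):
    # number is kept small so the sketch finishes quickly; raise it for stabler numbers
    seconds = timeit.timeit(func, number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Increasing `times` is one way to test the claim about larger buckets on your own setup.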
Upvotes: 1
Reputation: 48357
You can just use a list comprehension instead.
l = [i for i in range(1, 5) for _ in range(4)]
Output
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
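The order of the two for clauses is what groups the repetitions together. A short sketch of the difference (the variable names `grouped` and `cycled` are only illustrative):

```python
# Outer loop over the values, inner loop over the repetitions: grouped output
grouped = [i for i in range(1, 5) for _ in range(4)]

# Swapping the clauses cycles through the values instead
cycled = [i for _ in range(4) for i in range(1, 5)]

print(grouped)  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
print(cycled)   # [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
```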
Upvotes: 11
Reputation: 11192
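Try this (written for Python 2, where range returns a list; on Python 3, use list(range(1, 5)) * 4 instead):
range(1,5)*4 # if order doesn't matter
sorted(range(1,5)*4) # for an ordered sequence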
Updated with performance timings.
Mihai Alexandru-Ionut's answer:
%timeit [i for i in range(1, 5) for _ in range(4)]
1000000 loops, best of 3: 1.91 µs per loop
jpp's answer:
%timeit list(chain.from_iterable(repeat(i, 4) for i in range(1, 5)))
100000 loops, best of 3: 2.12 µs per loop
%timeit np.repeat(np.arange(1, 5), 4)
1000000 loops, best of 3: 1.68 µs per loop
Rory Daulton's answer:
%timeit [n for n in range(1,5) for repeat in range(4)]
1000000 loops, best of 3: 1.9 µs per loop
jedwards's answer:
%timeit list(i//4 for i in range(1*4, 5*4))
100000 loops, best of 3: 2.47 µs per loop
RoadRunner's suggestion from the comment section:
%timeit for i in range(1, 5): lst.extend([i] * 4)
1000000 loops, best of 3: 1.46 µs per loop
My answer:
%timeit sorted(range(1,5)*4)
1000000 loops, best of 3: 1.3 µs per loop
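On Python 3, range no longer returns a list, so the multiplication needs an explicit conversion first. A minimal sketch of the same idea, including one caveat of the sorting approach (the sample input [3, 1, 2] is only illustrative):

```python
values = list(range(1, 5))

# Repeating the whole list cycles through it: [1, 2, 3, 4, 1, 2, 3, 4, ...]
cycled = values * 4

# Sorting groups equal values together
grouped = sorted(cycled)
print(grouped)  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]

# Caveat: sorting does not preserve the original order of an unsorted input
unsorted_input = [3, 1, 2]
resorted = sorted(unsorted_input * 2)
print(resorted)  # [1, 1, 2, 2, 3, 3], not [3, 3, 1, 1, 2, 2]
```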
Upvotes: 4
Reputation: 22544
I'm a big fan of code being simple and easy to understand. With that philosophy, I would use
[n for n in range(1,5) for repeat in range(4)]
Upvotes: 2
Reputation: 30210
I think chain + repeat is likely your best bet. That being said,
start = 1
stop = 5
repeat = 4
x = list(i//repeat for i in range(start*repeat, stop*repeat))
print(x)
This should work (for positive arguments, at least).
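The integer-division idea can be wrapped in a small function. This is only a sketch: `repeat_each` and its parameter `times` are hypothetical names chosen here (which also avoids shadowing itertools.repeat), and the guard on `times` is an added assumption rather than part of the original snippet.

```python
def repeat_each(start, stop, times):
    """Hypothetical wrapper around the integer-division approach."""
    if times < 1:
        raise ValueError("times must be a positive integer")
    # Every block of `times` consecutive integers floors to the same quotient
    return [i // times for i in range(start * times, stop * times)]


result = repeat_each(1, 5, 4)
print(result)  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
```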
Upvotes: 3