Reputation: 17
Basically, I have :
And I want to create an array in which each index i is repeated N times, i.e. (1, 2, 2, 2) here, where 1 is repeated once and 2 is repeated three times.
The best solution I've come up with uses the np.repeat and np.concatenate functions:
import numpy as np
list_index = np.arange(2)
list_no_repetition = [1, 3]
result = np.concatenate([np.repeat(index, no_repetition)
                         for index, no_repetition in zip(list_index, list_no_repetition)])
print(result)
I wonder if there is a "prettier"/"more efficient solution".
Thank you for your help.
Upvotes: 1
Views: 297
Reputation: 2821
If by "efficiency" you mean speed, you can use timeit. Here are some results for some arbitrary, larger data.
First, define the functions and data:
import numpy as np

# generate some data (list values/indices and number of reps)
N = 1000
li_2 = np.arange(N)
lnr_2 = np.random.randint(low=0, high=10, size=N)

# the three functions produce the same values (by_range returns floats)
def by_range(items, rep_cts):
    # preallocate the output, then fill one slice per item
    x = np.full(sum(rep_cts), np.nan)
    i = 0
    for val, reps in zip(items, rep_cts):
        x[i:i + reps] = val
        i = i + reps
    return x

def by_comp(items, reps):
    # double-for comprehension: emit each value `rep` times
    return np.array([val for val, rep in zip(items, reps) for i in range(rep)])

def by_cat(list_index, list_no_repetition):
    # original approach: one np.repeat per item, then a single concatenate
    return np.concatenate([np.repeat(index, no_repetition)
                           for index, no_repetition in zip(list_index, list_no_repetition)])
About the same speed: first allocating an array and then filling it in, vs. doing a one-line double-for comprehension.
# 820 µs ± 11.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit by_range(li_2, lnr_2)
# 829 µs ± 4.26 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit by_comp(li_2, lnr_2)
The original concatenation method is noticeably slower, more than twice as slow in this run:
# 2.19 ms ± 98.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit by_cat(li_2, lnr_2)
Note that the results will differ depending on where/how you run this, and the specific data you're dealing with.
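The %timeit lines above are IPython/Jupyter magics. Outside a notebook, a comparable measurement could be sketched with the standard-library timeit module. The helper name time_call, the fixed seed, and the loop counts below are assumptions for illustration, and the numbers it prints will not match the ones quoted above.

```python
import timeit
import numpy as np

# Illustrative data, generated the same way as above but with a fixed seed
rng = np.random.default_rng(0)
N = 1000
items = np.arange(N)
reps = rng.integers(low=0, high=10, size=N)

def by_comp(items, reps):
    # double-for comprehension: emit each value `rep` times
    return np.array([val for val, rep in zip(items, reps) for _ in range(rep)])

def by_cat(items, reps):
    # one np.repeat per item, then a single concatenate
    return np.concatenate([np.repeat(val, rep) for val, rep in zip(items, reps)])

def time_call(func, number=20, repeat=5):
    """Hypothetical helper: best average time per call, in seconds."""
    timings = timeit.repeat(lambda: func(items, reps), number=number, repeat=repeat)
    return min(timings) / number

t_comp = time_call(by_comp)
t_cat = time_call(by_cat)
print(f"by_comp: {t_comp * 1e6:.0f} us per call")
print(f"by_cat:  {t_cat * 1e6:.0f} us per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes, but it is a choice, not the only reasonable one.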
Upvotes: 2
Reputation: 11
You could also use a dictionary with key as the index and the value as the amount of times repeated. I think that Andreas had it right with the list comprehension.
import numpy as np
repeatdict = {
    1: 1,
    2: 3,
    3: 6,
}
result = [x for key, value in repeatdict.items() for x in [key]*value]
print(result)
Upvotes: 1
Reputation: 134
Hello, this is the alternative I propose:
import numpy as np
list_index = np.arange(2)
list_no_repetition = [1, 3]
result = np.array([])
for i in range(len(list_index)):
    tempA = np.empty(list_no_repetition[i])
    tempA.fill(list_index[i])
    result = np.concatenate([result, tempA])
result
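A possible variation on the loop above, offered as a sketch rather than a benchmarked improvement: collecting the pieces in a list and calling np.concatenate once avoids re-copying the growing result on every iteration, and np.full with an explicit dtype keeps the values as integers (np.empty and np.array([]) default to floats). The names pieces and count are assumptions for illustration.

```python
import numpy as np

list_index = np.arange(2)
list_no_repetition = [1, 3]

# Build one small array per index, then join them in a single call
pieces = []
for index, count in zip(list_index, list_no_repetition):
    pieces.append(np.full(count, index, dtype=list_index.dtype))

result = np.concatenate(pieces)
print(result)  # [0 1 1 1]
```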
Upvotes: 1
Reputation: 9207
Not sure about prettier, but you could solve it completely with list comprehension:
[x for i,l in zip(list_index, list_no_repetition) for x in [i]*l]
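If a NumPy array is acceptable as the result, it may also be worth trying np.repeat directly: its repeats argument can be a sequence with one count per element, so no Python-level loop is needed. A minimal sketch, reusing the variable names from the question:

```python
import numpy as np

list_index = np.arange(2)
list_no_repetition = [1, 3]

# One count per element: 0 is repeated once, 1 is repeated three times
result = np.repeat(list_index, list_no_repetition)
print(result)  # [0 1 1 1]

# Same values as the list comprehension above
as_list = [x for i, l in zip(list_index, list_no_repetition) for x in [i] * l]
```

The comprehension returns a plain Python list, while np.repeat returns an array, so which one fits better depends on what is done with the result afterwards.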
Upvotes: 4