neozen

Reputation: 17

Repeating indexes a given number of times

Basically, I have:

And I want to create an array in which each index is repeated a given number of times, e.g. (1, 2, 2, 2) here, where 1 appears once and 2 appears three times.

The best solution I've come up with uses the np.repeat and np.concatenate functions:

import numpy as np

list_index = np.arange(2)
list_no_repetition = [1, 3]

result = np.concatenate([np.repeat(index, no_repetition)
                         for index, no_repetition in zip(list_index, list_no_repetition)])
print(result)

I wonder whether there is a "prettier" or "more efficient" solution.

Thank you for your help.

Upvotes: 1

Views: 297

Answers (4)

ELinda

Reputation: 2821

If by "efficiency" you mean speed, you can measure it with timeit (the %timeit lines below are the IPython/Jupyter magic). Here are some results for arbitrary, larger data.

First, define the functions and data:

import numpy as np

# generate some data (values/indices and the number of repetitions for each)
N = 1000
li_2 = np.arange(N)
lnr_2 = np.random.randint(low=0, high=10, size=N)

# three functions that produce the same values
# (by_range returns a float array because it starts from NaN)
def by_range(items, rep_cts):
    x = np.full(sum(rep_cts), np.nan)
    i = 0
    for val, reps in zip(items, rep_cts):
        x[i:i + reps] = val
        i = i + reps
    return x

def by_comp(items, reps):
    return np.array([val for val, rep in zip(items, reps) for _ in range(rep)])

def by_cat(list_index, list_no_repetition):
    return np.concatenate([np.repeat(index, no_repetition)
                           for index, no_repetition in zip(list_index, list_no_repetition)])

Allocating an array first and then filling it in takes about the same time as the one-line double-for comprehension:

# 820 µs ± 11.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit by_range(li_2, lnr_2)
# 829 µs ± 4.26 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit by_comp(li_2, lnr_2)

The original concatenation method is roughly 2.7 times slower:

# 2.19 ms ± 98.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit by_cat(li_2, lnr_2)

Note that the results will differ depending on where/how you run this, and the specific data you're dealing with.
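Since %timeit only works inside IPython/Jupyter, here is a minimal sketch of the same comparison using the standard-library timeit module, so it can run as a plain script. The seed, the run counts, and the `candidates` tuple are my own assumptions for illustration, and the numbers it prints will not match the ones above.

```python
import timeit

import numpy as np

# Assumed test data: 1000 values, each repeated 0 to 9 times.
rng = np.random.default_rng(seed=0)
items = np.arange(1000)
counts = rng.integers(low=0, high=10, size=items.size)


def by_range(items, rep_cts):
    # Preallocate the output, then fill one slice per value.
    x = np.full(sum(rep_cts), np.nan)
    i = 0
    for val, reps in zip(items, rep_cts):
        x[i:i + reps] = val
        i += reps
    return x


def by_comp(items, reps):
    # Double-for comprehension, converted to an array at the end.
    return np.array([val for val, rep in zip(items, reps) for _ in range(rep)])


def by_cat(items, reps):
    # The question's approach: one np.repeat per value, then concatenate.
    return np.concatenate([np.repeat(val, rep) for val, rep in zip(items, reps)])


candidates = (by_range, by_comp, by_cat)

# Check that all candidates agree before timing them.
expected = by_cat(items, counts)
for fn in candidates:
    assert np.array_equal(fn(items, counts), expected)

# Report the best of 3 rounds of 50 calls for each candidate.
for fn in candidates:
    best = min(timeit.repeat(lambda: fn(items, counts), number=50, repeat=3))
    print(f"{fn.__name__}: {best / 50 * 1e6:.0f} us per call")
```

Taking the minimum over several rounds is a common way to reduce noise from other processes, but it is a choice, not the only reasonable one.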

Upvotes: 2

james-4sythe

Reputation: 11

You could also use a dictionary with the index as the key and the number of repetitions as the value. I think that Andreas had it right with the list comprehension.

import numpy as np

repeatdict = {
    1: 1,
    2: 3,
    3: 6,
}

result = [x for key, value in repeatdict.items() for x in [key]*value]

print(result)

Upvotes: 1

Hello, this is the alternative that I propose:

import numpy as np

list_index = np.arange(2)
list_no_repetition = [1, 3]

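result = np.array([])  # starts as an empty float array, so the result is float too
for i in range(len(list_index)):
    # build a block of the required length and fill it with the current index
    tempA = np.empty(list_no_repetition[i])
    tempA.fill(list_index[i])
    result = np.concatenate([result, tempA])

print(result)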
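One thing to keep in mind with the loop above is that np.concatenate copies the whole accumulated array on every iteration. A possible variation, sketched here under the assumption that the inputs are the ones from the question, is to collect the blocks in a list and concatenate once at the end. The name `pieces` is mine, and np.full is swapped in for the empty/fill pair.

```python
import numpy as np

list_index = np.arange(2)
list_no_repetition = [1, 3]

# Build one block per index; np.full takes its dtype from the fill value,
# so the blocks stay integer here instead of becoming float.
pieces = [np.full(count, index)
          for index, count in zip(list_index, list_no_repetition)]

# Concatenate a single time instead of once per iteration.
result = np.concatenate(pieces)
print(result)
```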

Upvotes: 1

Andreas

Reputation: 9207

Not sure about prettier, but you could solve it completely with a list comprehension:

[x for i,l in zip(list_index, list_no_repetition) for x in [i]*l]
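For completeness, a runnable sketch of the comprehension with the inputs from the question. The np.repeat line is an addition for comparison and not part of the comprehension approach: np.repeat accepts one count per element, so it can produce the same values directly as an array. The names `as_list` and `as_array` are only for illustration.

```python
import numpy as np

list_index = np.arange(2)        # the question's indices: array([0, 1])
list_no_repetition = [1, 3]      # the question's repetition counts

# The comprehension from this answer; it returns a plain Python list.
as_list = [x for i, l in zip(list_index, list_no_repetition) for x in [i] * l]

# Addition for comparison: np.repeat with one count per element.
as_array = np.repeat(list_index, list_no_repetition)

# Both approaches yield the same sequence of values.
assert list(as_array) == as_list
print(as_array)  # [0 1 1 1]
```

The comprehension is handy when a list is what you want; the array form avoids the Python-level loop when the data is already in NumPy.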

Upvotes: 4
