tionichm

Reputation: 163

Duplicating specific elements in lists or NumPy arrays

I work with large data sets in my research.

I need to repeat a specific element of a NumPy array a given number of times, in place. The code below does this with plain Python lists, but is there a NumPy function that performs the operation more efficiently?

"""
Example output
>>> (executing file "example.py")
Choose the number you want to repeat (1-10):
2
Choose number of repetitions:
9
Your output array is:
 [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>>>
"""
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

y = int(input('Choose the number you want to repeat (1-10):\n'))
repetitions = int(input('Choose number of repetitions:\n'))
output = []

for value in x:
    if value != y:
        # Keep every other element exactly once.
        output.append(value)
    else:
        # Emit the chosen element `repetitions` times.
        output.extend([value] * repetitions)

print('Your output array is:\n', output)

Upvotes: 1

Views: 57

Answers (2)

mrk

Reputation: 10366

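NumPy provides the numpy.repeat function, which repeats each element of an array a given number of times (either one count for all elements or one count per element):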

>>> import numpy as np
>>> np.repeat(3, 4)
array([3, 3, 3, 3])

>>> x = np.array([[1,2],[3,4]])

>>> np.repeat(x, 2)
array([1, 1, 2, 2, 3, 3, 4, 4])

>>> np.repeat(x, 3, axis=1)
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])

>>> np.repeat(x, [1, 2], axis=0)
array([[1, 2],
       [3, 4],
       [3, 4]])
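
Applied to the question, the per-element form of np.repeat can do the whole job in one call. The following is only a minimal sketch, assuming x is one-dimensional; the helper name repeat_value is made up for illustration and is not part of NumPy:

```python
import numpy as np

def repeat_value(x, y, repetitions):
    """Return a copy of x where every element equal to y appears `repetitions` times.

    Hypothetical helper for illustration; assumes x is one-dimensional.
    """
    x = np.asarray(x)
    # One repeat count per element: `repetitions` where x == y, otherwise 1.
    counts = np.where(x == y, repetitions, 1)
    return np.repeat(x, counts)

result = repeat_value([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], y=2, repetitions=9)
print(result.tolist())
```

This version does not need x to be sorted, and if y occurs more than once, every occurrence is expanded.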

Upvotes: 0

Divakar

Reputation: 221524

One approach is to find the index of the element to be repeated with np.searchsorted. Note that this assumes x is sorted and actually contains y. Use that index to slice off the left and right sides of the array, then insert the repeated values in between.

Thus, one solution would be:

idx = np.searchsorted(x,y)
out = np.concatenate(( x[:idx], np.repeat(y, repetitions), x[idx+1:] ))
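
Since np.searchsorted returns an insertion point even when y is absent, a defensive version may be useful. This is a sketch under the stated assumptions (x sorted and one-dimensional); the function name repeat_sorted and the error message are illustrative, not from the answer above:

```python
import numpy as np

def repeat_sorted(x, y, repetitions):
    """Replace the single occurrence of y in sorted x with `repetitions` copies.

    Hypothetical helper; assumes x is sorted and one-dimensional.
    """
    x = np.asarray(x)
    # Leftmost index at which y could be inserted while keeping x sorted.
    idx = np.searchsorted(x, y)
    # searchsorted does not check membership, so verify that y is really there.
    if idx == len(x) or x[idx] != y:
        raise ValueError(f"{y} not found in x")
    return np.concatenate((x[:idx], np.repeat(y, repetitions), x[idx + 1:]))

out = repeat_sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], y=2, repetitions=3)
print(out.tolist())
```

Converting x with np.asarray up front also keeps the slices as integer arrays, including the empty slice that arises when y is the first element.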

Let's consider a more general sample case, with x as:

x = [2, 4, 5, 6, 7, 8, 9, 10]

Let the number to be repeated be y = 5, with repetitions = 7.

Now, apply the proposed code:

In [57]: idx = np.searchsorted(x,y)

In [58]: idx
Out[58]: 2

In [59]: np.concatenate(( x[:idx], np.repeat(y, repetitions), x[idx+1:] ))
Out[59]: array([ 2,  4,  5,  5,  5,  5,  5,  5,  5,  6,  7,  8,  9, 10])

For the specific case of x always being [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], the value y sits at index y-1, so no search is needed and there is a more compact solution:

np.r_[x[:y-1], [y]*repetitions, x[y:]]
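
As a runnable illustration of this shortcut, here is a small sketch using the question's sample input. It assumes x[i] == i + 1 for every index, and builds x with np.arange (a choice made here, not in the answer) so that empty slices stay integer-typed:

```python
import numpy as np

# Assumption: x holds 1..10 in order, so the value y lives at index y - 1.
x = np.arange(1, 11)
y, repetitions = 2, 9

# Left part, the repeated value, then everything after the original y.
out = np.r_[x[:y - 1], [y] * repetitions, x[y:]]
print(out.tolist())
```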

Upvotes: 2
