kasperhj

Reputation: 10492

Create a list with items from another list at indices specified in a third list

Consider two lists:

a = [2, 4, 5]
b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

I want a resulting list c where

c = [0, 0, 2, 0, 4, 5, 0, 0, 0, 0]

is a list of length len(b) that holds the values of b at the indices listed in a, and zeros elsewhere.

What is the most elegant way of doing this?

Upvotes: 1

Views: 455

Answers (2)

Ffisegydd

Reputation: 53738

Use a list comprehension with the conditional expression and enumerate.

This list comprehension iterates over each index i and value v of the list b. If i is found in a, the element is set to v; otherwise it is set to 0.

a = [2, 4, 5]
b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

c = [v if i in a else 0 for i, v in enumerate(b)]

print(c)
# [0, 0, 2, 0, 4, 5, 0, 0, 0, 0]

Note: If a is large then you may be better off converting it to a set first, before using in. The time complexity of in on a list is O(n), where n is the length of the list, whilst for a set it is O(1) (in the average case for both).
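As a minimal sketch of the note above, the set can be built once outside the comprehension so that every membership test is a hash lookup. The name a_set is only an illustrative choice and is not part of the original answer.

```python
a = [2, 4, 5]
b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Build the set once; rebuilding it inside the comprehension would
# throw away the benefit of the O(1) average-case lookup.
a_set = set(a)

c = [v if i in a_set else 0 for i, v in enumerate(b)]

print(c)
# [0, 0, 2, 0, 4, 5, 0, 0, 0, 0]
```

The result is identical to the list-based version; only the cost of each in test changes.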

The list comprehension is roughly equivalent to the following code (for explanation):

c = []
for i, v in enumerate(b):
    if i in a:
        c.append(v)
    else:
        c.append(0)

As you have the option of using numpy, I've included a simple method below which initialises an array filled with zeros and then uses integer array indexing to replace the elements.

import numpy as np

a2 = np.array(a)
b2 = np.array(b)

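c = np.zeros_like(b2)  # zeros with the same shape and integer dtype as b2
c[a2] = b2[a2]         # index b2, not b: a plain list cannot be indexed with an array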

When timing the three methods (my list comp, my numpy, and Jon's method) the following results are given for N = 1000, a = list(range(0, N, 10)), and b = list(range(N)).

In [170]: %timeit lc_func(a,b)
100 loops, best of 3: 3.56 ms per loop

In [171]: %timeit numpy_func(a2,b2)
100000 loops, best of 3: 14.8 µs per loop

In [172]: %timeit jon_func(a,b)
10000 loops, best of 3: 22.8 µs per loop
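The bodies of lc_func, numpy_func and jon_func are not shown in the timings above. The sketch below is one plausible way to define them, assuming each simply wraps the corresponding snippet from this thread; the function bodies and the use of the standard timeit module (instead of IPython's %timeit) are assumptions, and the numbers it prints will differ from machine to machine.

```python
import timeit

import numpy as np

N = 1000
a = list(range(0, N, 10))
b = list(range(N))
a2 = np.array(a)
b2 = np.array(b)


def lc_func(a, b):
    # List comprehension with a linear-time membership test on the list a.
    return [v if i in a else 0 for i, v in enumerate(b)]


def numpy_func(a2, b2):
    # Zero-filled array, then copy the selected positions in one step.
    c = np.zeros_like(b2)
    c[a2] = b2[a2]
    return c


def jon_func(a, b):
    # Pre-initialised list of zeros, overwritten at each index in a.
    c = [0] * len(b)
    for el in a:
        c[el] = b[el]
    return c


# Time 100 calls of each function; only the relative sizes are meaningful.
for name, call in [
    ("lc_func", lambda: lc_func(a, b)),
    ("numpy_func", lambda: numpy_func(a2, b2)),
    ("jon_func", lambda: jon_func(a, b)),
]:
    seconds = timeit.timeit(call, number=100)
    print(f"{name}: {seconds / 100 * 1e6:.1f} µs per call")
```

All three functions return the same values, so the comparison is like for like apart from numpy_func returning an array rather than a list.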

This is to be expected. The numpy function is the fastest, but both Jon's function and the numpy function are much faster than the list comprehension. If I increase the number of elements to 100,000 then the gap between numpy and Jon's method gets even larger.

Interestingly enough though, for small N Jon's function is the best! I suspect this is because the fixed overhead of creating numpy arrays outweighs the cost of the plain list operations at that size.

Moral of the story: large N? Go with numpy. Small N? Go with Jon.

Upvotes: 8

Jon Clements

Reputation: 142256

Another option is to pre-initialise the target list with 0s (a fast operation), then overwrite the value at each index listed in a, e.g.:

a = [2, 4, 5]
b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

c = [0] * len(b)
for el in a:
    c[el] = b[el]

print(c)
# [0, 0, 2, 0, 4, 5, 0, 0, 0, 0]

Upvotes: 5
