Gabriel

Reputation: 42329

Find index of string in list

I have two lists with the same number of elements, all of them strings. Both lists contain the same set of strings, with no duplicates, but in a different order.

list_a = ['s1', 's2', 's3', 's4', 's5', ...]
list_b = ['s8', 's5', 's1', 's9', 's3', ...]

I need to go through each element in list_a and find the index of that same element in list_b. I can do this with two nested for loops, but there has to be a more efficient way:

b_indexes = []
for elem_a in list_a:
    for indx_b, elem_b in enumerate(list_b):
        if elem_b == elem_a:
            b_indexes.append(indx_b)
            break

Upvotes: 2

Views: 202

Answers (4)

DSM

Reputation: 352959

An alternative approach to the index method is to build a dictionary of the locations in one pass instead of searching through the list each time. If the list is long enough, this should be faster, because it makes the process linear in the number of elements (on average) instead of quadratic. To be specific, instead of

def index_method(la, lb):
    return [lb.index(i) for i in la]

you could use

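def dict_method(la, lb):
    # map each value in lb to its position, in a single pass
    where = {v: i for i, v in enumerate(lb)}
    # each lookup is now O(1) on average instead of a scan of lb
    return [where[i] for i in la]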

This should be roughly comparable on small lists, albeit maybe a little slower:

>>> import random
>>> list_a = ['s{}'.format(i) for i in range(5)]
>>> list_b = list_a[:]
>>> random.shuffle(list_b)
>>> %timeit index_method(list_a, list_b)
1000000 loops, best of 3: 1.86 µs per loop
>>> %timeit dict_method(list_a, list_b)
1000000 loops, best of 3: 1.93 µs per loop

But it should be much faster on longer ones, and the difference will only grow:

>>> list_a = ['s{}'.format(i) for i in range(100)]
>>> list_b = list_a[:]
>>> random.shuffle(list_b)
>>> %timeit index_method(list_a, list_b)
10000 loops, best of 3: 140 µs per loop
>>> %timeit dict_method(list_a, list_b)
10000 loops, best of 3: 20.9 µs per loop
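For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using the standard timeit module. The make_lists helper and its seed parameter are my own additions for illustration, and the timings will differ from the figures above depending on machine and Python version.

```python
import random
import timeit


def index_method(la, lb):
    # list.index scans lb from the start on every call,
    # so the total work grows with len(la) * len(lb)
    return [lb.index(i) for i in la]


def dict_method(la, lb):
    # one pass to record each value's position, then O(1) average lookups
    where = {v: i for i, v in enumerate(lb)}
    return [where[i] for i in la]


def make_lists(n, seed=0):
    # hypothetical helper: n unique strings plus a shuffled copy
    rng = random.Random(seed)
    list_a = ['s{}'.format(i) for i in range(n)]
    list_b = list_a[:]
    rng.shuffle(list_b)
    return list_a, list_b


list_a, list_b = make_lists(100)

# sanity check: both methods must return the same indexes
assert index_method(list_a, list_b) == dict_method(list_a, list_b)

for name, func in [('index_method', index_method), ('dict_method', dict_method)]:
    seconds = timeit.timeit(lambda: func(list_a, list_b), number=1000)
    print('{}: {:.2f} us per call'.format(name, seconds / 1000 * 1e6))
```

Raising the argument to make_lists should widen the gap, since the index-based version does a fresh scan of list_b for every element of list_a.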

Upvotes: 2

John Zwinck

Reputation: 249093

In functional style:

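list(map(list_b.index, list_a))

This produces a list containing the index in list_b of each element in list_a. The list() call is needed in Python 3, where map returns a lazy iterator; in Python 2, map alone already returns a list.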

Upvotes: 3

Johannes Charra

Reputation: 29913

This should give you a list of the indexes.

[list_b.index(elem) for elem in list_a]

Upvotes: 2

TerryA

Reputation: 59974

If there are no duplicates, you can just use list.index():

list_a = ['s1', 's2', 's3', 's4', 's5']
list_b = ['s4', 's5', 's1', 's2', 's3']  # same strings as list_a, different order
print([list_b.index(i) for i in list_a])  # [2, 3, 4, 0, 1]

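You only need one explicit for loop: since you've said that every string in list_a also appears in list_b, there's no need to iterate through the second list yourself and test if elem_b == elem_a. list.index() performs that search for you, and raises ValueError if the element is not found.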
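If that guarantee might not hold for your data, the following is a rough sketch of one way to cope with missing elements. The indexes_or_none helper and the sample lists are hypothetical, made up purely for illustration; returning None for absent elements is just one possible convention.

```python
list_a = ['s1', 's2', 's3']
list_b = ['s3', 's1', 's9']  # 's2' is deliberately missing


def indexes_or_none(la, lb):
    # hypothetical helper: None marks elements of la that are absent from lb
    where = {v: i for i, v in enumerate(lb)}
    return [where.get(elem) for elem in la]


try:
    [list_b.index(elem) for elem in list_a]
except ValueError as exc:
    # list.index raises as soon as it reaches an element that is not in list_b
    print('list.index failed:', exc)

print(indexes_or_none(list_a, list_b))  # [1, None, 0]
```

When every element is guaranteed to be present, the plain list.index() version above is enough.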

Upvotes: 4
