Optimizing execution-time to check if chars of a word are in a list python

Question

I am writing Python 2.7.15 code that walks over the characters of each word. How can I speed it up, given that I also need to check whether every character is contained in an external list?

I have tried two versions: version (1) spells out what the code has to do, and version (2) is a compact form of the same logic.

chars_array = ['a','b','c']

VERSION (1)
def version1(word):
    chars = [x for x in word]
    count = 0

    for c in chars:
        if c not in chars_array:
            count += 1

    return count

VERSION (2)
def version2(word):
    return sum([1 for c in [x for x in word] if not c in chars_array])

I am analyzing a large corpus: version1 takes 8.56 s and version2 takes 8.12 s.

iz_ · Accepted Answer

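The fastest solution (can be up to 100x faster for an extremely long string). It relies on the Python 2 signature str.translate(table, deletechars), so word must be a byte string (str), not unicode: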

joined = ''.join(chars_array)
def version3(word):
    return len(word.translate(None, joined))
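
If you ever move this to Python 3, the two-argument translate(None, ...) call no longer exists. A rough sketch of the same idea there, assuming the three-argument form of str.maketrans (the names delete_table and version3_py3 are mine, not part of the code above, and I have not timed this variant):

```python
# Sketch for Python 3 only; version3 above targets Python 2.
chars_array = ['a', 'b', 'c']

# The third argument of str.maketrans lists the characters to delete.
delete_table = str.maketrans('', '', ''.join(chars_array))

def version3_py3(word):
    # What is left after deleting the listed characters is exactly
    # the set of characters NOT in chars_array, so its length is the count.
    return len(word.translate(delete_table))

print(version3_py3('abcxyz'))  # prints 3
```

The table is built once at module level so that each call only pays for the translate itself.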

Another solution, slower than version3 and approximately the same speed as your code:

from itertools import ifilterfalse
def version4(word):
    return sum(1 for _ in ifilterfalse(set(chars_array).__contains__, word))
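
Note that version4 rebuilds set(chars_array) on every call. A possible variant that builds the set once and needs no itertools import is sketched below; chars_set and version5 are names I am introducing for illustration, and I have not benchmarked it, so treat any speed difference as something to measure rather than assume:

```python
chars_array = ['a', 'b', 'c']

# Built once at module level: membership in a set is an average O(1)
# hash lookup, whereas "in" on a list scans the list element by element.
chars_set = frozenset(chars_array)

def version5(word):
    # Each "c not in chars_set" is True or False; sum() counts the True values.
    return sum(c not in chars_set for c in word)

print(version5('abcxyz'))  # prints 3
```

This runs unchanged on Python 2.7 and Python 3.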

Timings (s is a random string):

In [17]: %timeit version1(s)
1000 loops, best of 3: 79.9 µs per loop

In [18]: %timeit version2(s)
10000 loops, best of 3: 98.1 µs per loop

In [19]: %timeit version3(s)
100000 loops, best of 3: 4.12 µs per loop # <- fastest

In [20]: %timeit version4(s)
10000 loops, best of 3: 84.3 µs per loop
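
The %timeit lines above are IPython magics. To reproduce this kind of comparison in a plain script, something like the following sketch with the standard timeit module should work. The string length of 1000, the seed, and the helper name version_set are my assumptions (the length of s used above is not stated), so the absolute numbers will differ:

```python
import random
import string
import timeit

chars_array = ['a', 'b', 'c']
chars_set = frozenset(chars_array)

def version1(word):
    # Same logic as the question's version1: count chars not in the list.
    count = 0
    for c in word:
        if c not in chars_array:
            count += 1
    return count

def version_set(word):
    # Hypothetical set-based variant used only for comparison.
    return sum(c not in chars_set for c in word)

# Assumed test input: 1000 random lowercase letters, fixed seed for repeatability.
random.seed(0)
s = ''.join(random.choice(string.ascii_lowercase) for _ in range(1000))

# Check that both functions agree before comparing their speed.
assert version1(s) == version_set(s)

for fn in (version1, version_set):
    # Best of 3 runs, 1000 calls per run, like IPython's "best of 3".
    best = min(timeit.repeat(lambda: fn(s), number=1000, repeat=3))
    print('%s: %.2f us per call' % (fn.__name__, best / 1000 * 1e6))
```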
