Alexandra
Alexandra

Reputation: 35

How to make comparisons between two large lists fast python

I would like to know if there's any way to make this code run more fast. It's taking me 47 seconds and it has to compare everything not just the elements in the same position.

pixels = list(mensagem)
arrayBits = []
for i in pixels:
    for j in tabela:
        if i == j[0]:
            arrayBits.append(j[1])

Here's the hole code but I think the only reason why it's taking so long is the one I asked. Sorry about my english, I'm portuguese.

def codifica(mensagem, tabela, filename):
tamanho = np.shape(mensagem)
largura = tamanho[0]
if len(tamanho)==2:
    altura = tamanho[1]
else:
    altura = 0

pixels = list(mensagem)
arrayBits = []
for i in pixels:
    for j in tabela:
        if i == j[0]:
            arrayBits.append(j[1])

arraySemVirgulas = np.array(arrayBits).ravel() # tirar as virgulas
arrayJunto = ''.join(arraySemVirgulas) # juntar todos os bits
array = list(map(int,arrayJunto)) # coloca-los numa lista
count = 0
while(len(array)%8!=0):
    array.append(0)
    count += 1

array = np.array(array)
arrayNovo = array.reshape(-1,8)

decimais = convBi(arrayNovo)
array_char = ['' for i in range(len(decimais)+5)]
j = 2
for i in decimais:
    a = chr(i)
    array_char[j] = a
    j += 1

array_char[0] = str(count) 
array_char[1] = str(len(str(largura))) 
array_char[2] = str(len(str(altura)))
array_char[3] = str(largura) 
array_char[4] = str(altura)

ficheiro = open(filename,"wb")
for i in array_char:
    ficheiro.write(i)
ficheiro.close()

Upvotes: 0

Views: 201

Answers (2)

m.wasowski
m.wasowski

Reputation: 6386

Using set() and dict() based containers can make it linear in time instead of O(n^2). This should speed it up:

EDIT simpler and probably faster version:

import itertools

# set of 'keys' that exist in both
keys = set(pixels) & set(el[0] for el in tabela)
# and generator comprehension with fast lookup 
elements = (element[1] for element in tabela 
        if element[0] in keys)
# this will flatten inner lists and create a list with result:
result = list(itertools.chain.from_iterable(elements))

Only two runs through tabela, both with time complexity O(n).

If pixels are not unique, and corresponding values from tabela should be multiplicated for every occurence of pixel, this should be used:

import itertools

# set of 'keys' that exist in both
keys = set(pixels) & set(el[0] for el in tabela)
# and generator comprehension with fast lookup 
elements = lambda key: tabela[key][1] if key in keys else []
# this will flatten inner lists and create a list with result:
flatten = itertools.chain.from_iterable
result = list(flatten(elements(pixel) for pixel in pixels))

Upvotes: 0

lenik
lenik

Reputation: 23556

this might be faster if you replace the iteration

for i in pixels:
    for j in tabela:
        if i == j[0]:
            arrayBits.append(j[1])

with a dictionary lookup

tabela_dict = dict(tabela)
for i in pixels:
    if i in tabela_dict :
        arrayBits.append(tabela_dict[i])

Upvotes: 1

Related Questions