Cheshie

Reputation: 2837

Efficient, memory-friendly way to find all possible pairs in a list

I have a dictionary called lemma_all_context_dict, and it has approximately 8000 keys. I need a list of all possible pairs of these keys.

I used:

pairs_of_words_list = list(itertools.combinations(lemma_all_context_dict.keys(), 2)) 

However, this line raises a MemoryError. I have 8 GB of RAM, but perhaps I get the error anyway because the code already holds a few very large dictionaries.

So I tried a different way:

pairs_of_words_list = []
for p_one in range(len(lemma_all_context_dict.keys())):
    for p_two in range(p_one+1, len(lemma_all_context_dict.keys())):
        pairs_of_words_list.append([lemma_all_context_dict.keys()[p_one], lemma_all_context_dict.keys()[p_two]])

But this piece of code takes around 20 minutes to run. Does anyone know of a more efficient way to solve the problem? Thanks

I don't think this question is a duplicate, because what I'm asking (and I don't think this has been asked) is how to implement this without my computer crashing :-P

Upvotes: 2

Views: 116

Answers (1)

Pierre

Reputation: 6237

Don't build a list, since that's the reason you get a memory error. You even create two lists, because in Python 2 .keys() returns a list of its own. Instead, loop directly over the iterator that itertools.combinations returns (that's what iterators are for):

import itertools

for a, b in itertools.combinations(lemma_all_context_dict, 2):
    print a, b
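
To put a number on it: n keys give n * (n - 1) / 2 unordered pairs, so 8000 keys mean 31,996,000 pairs, which is why holding them all in a list at once is expensive. Here is a minimal sketch of handling the pairs one at a time instead. It is written for Python 3, and sample_dict, count_pairs and iter_key_pairs are made-up names for illustration (sample_dict is only a tiny stand-in for lemma_all_context_dict), so treat it as one possible way to do it rather than the definitive one.

```python
import itertools


def count_pairs(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def iter_key_pairs(d):
    """Lazily yield every unordered pair of keys from the mapping d.

    Iterating over a dict yields its keys, so no intermediate key list
    or pair list is built here; pairs are produced one at a time.
    """
    return itertools.combinations(d, 2)


# Hypothetical stand-in for lemma_all_context_dict (assumption, not the real data).
sample_dict = {"cat": 1, "dog": 2, "fish": 3, "bird": 4}

pairs_seen = 0
for a, b in iter_key_pairs(sample_dict):
    # Replace this with the real per-pair work (comparison, scoring, ...).
    pairs_seen += 1

print(pairs_seen, count_pairs(len(sample_dict)))
print(count_pairs(8000))
```

If you only need some of the pairs afterwards (say, the ones passing a filter), keep just those inside the loop rather than storing all of them.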

Upvotes: 2
