daredevil1234

Reputation: 1445

Optimizing something in Python

So, I have a list of dicts, and I am trying to move these values over into a sparse matrix:

matrix = [[0] * large for i in xrange(small)]  # large is ~150k and small is ~10k

To fill it in, I have a collection of all unique keys, whose length is equal to large. Each key's position in that collection determines the column where the values from the individual dicts will go:

for i in range(len(lst)):
    dic = lst[i]
    for key in dic.keys():
        vectors[i][ordering.index(key)] = dic.get(key, 0)

But this is taking a very long time. Like... forever. (ordering is built by merging the keys of all the dicts. I want the data as a sparse matrix, not as a dict; I am only using ordering to know which index each key should map to in the matrix.)

Upvotes: 0

Views: 40

Answers (1)

Matt Jordan

Reputation: 2181

ordering is probably the cause.

Based on the way you use it, ordering is just a list, and ordering.index(key) has to do a linear scan of the list every time it is called.

Change it to a dictionary, with the intended index as the value, like this:

# Map each key to its intended column index, once, up front
ordering_dict = {}
for i in range(len(ordering)):
    ordering_dict[ordering[i]] = i
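The same mapping can be built in one pass with a dict comprehension and enumerate (the keys below are hypothetical placeholders; in the question, ordering holds all the unique keys):

```python
# Build key -> index mapping in a single pass
ordering = ["alpha", "beta", "gamma"]  # placeholder keys for illustration
ordering_dict = {key: i for i, key in enumerate(ordering)}
```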

and then change your assignment to:

vectors[i][ordering_dict[key]] = dic.get(key, 0)

This removes a scan of ordering (on average about len(ordering)/2 comparisons) from every single assignment, dropping the total work from roughly len(lst) * len(dic) * len(ordering) operations to len(lst) * len(dic) dictionary lookups.
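Putting the pieces together, here is a minimal runnable sketch of the fix. The names lst, ordering, and vectors follow the question; the data and sizes are tiny placeholders, not the asker's real inputs:

```python
# Small placeholder inputs (the real ones are ~10k dicts and ~150k keys)
lst = [{"a": 1, "c": 3}, {"b": 2}]   # list of dicts, one per matrix row
ordering = ["a", "b", "c"]           # unique keys; position = column index

# O(1) key -> column lookups instead of list.index's linear scan
ordering_dict = {key: i for i, key in enumerate(ordering)}

vectors = [[0] * len(ordering) for _ in range(len(lst))]
for i, dic in enumerate(lst):
    for key, value in dic.items():
        vectors[i][ordering_dict[key]] = value
```

With the placeholder data above, vectors comes out as [[1, 0, 3], [0, 2, 0]]: each dict's values land in the column assigned to its key.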

Upvotes: 1
