Reputation: 1445
So, I have a list of dicts and I'm trying to move their values into a sparse matrix:
matrix = [[0]*large for i in xrange(small)] #large is like 150k and small is about 10k
To get the values into this, I have a collection of all unique keys, the length of which equals large; the position of a key in it determines which column the values mapped to that key in the individual dicts go into:
for i in range(len(lst)):
    dic = lst[i]
    for key in dic.keys():
        vectors[i][ordering.index(key)] = dic.get(key, 0)
But this is taking a very long time. Like... forever. (ordering is a merged copy of all the dicts' keys; I want the data as a sparse matrix, not as a dict, but I'm using ordering to know which index each key should map to in the matrix.)
Upvotes: 0
Views: 40
Reputation: 2181
ordering is probably the cause. Based on the way you use it, ordering is just a list, and ordering.index(key) has to do a full linear scan of the list to find the index every time it is called.
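For a sense of scale, here is a quick illustration of the gap between the two lookups (the key count here is a made-up stand-in, smaller than the asker's 150k):

```python
import timeit

# Hypothetical key list standing in for the asker's ordering
keys = [str(n) for n in range(10_000)]
lookup = {key: i for i, key in enumerate(keys)}

# list.index scans from the front; a dict lookup is a single hash probe
slow = timeit.timeit(lambda: keys.index("9999"), number=100)
fast = timeit.timeit(lambda: lookup["9999"], number=100)
print(slow > fast)  # True: the linear scan is far slower
```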
Change it to a dictionary, with the intended index as the value, like this:
ordering_dict = {}
for i, key in enumerate(ordering):
    ordering_dict[key] = i
and then change your assignment to:
vectors[i][ordering_dict[key]] = dic.get(key, 0)
This replaces an O(len(ordering)) list scan with an O(1) dict lookup on every assignment, which means roughly len(lst) * len(dic) * len(ordering) fewer total operations.
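Put together, a minimal runnable sketch of the fix (the sample lst and ordering here are tiny hypothetical stand-ins for the asker's data):

```python
# Hypothetical sample data standing in for the asker's real inputs
lst = [{"a": 1, "c": 3}, {"b": 2}]
ordering = ["a", "b", "c"]  # all unique keys, in the intended column order

# Build the key -> column-index lookup once: O(len(ordering))
ordering_dict = {key: i for i, key in enumerate(ordering)}

# One row per dict; each key lookup is now O(1) instead of a list scan
vectors = [[0] * len(ordering) for _ in range(len(lst))]
for i, dic in enumerate(lst):
    for key, value in dic.items():
        vectors[i][ordering_dict[key]] = value

print(vectors)  # [[1, 0, 3], [0, 2, 0]]
```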
Upvotes: 1