Reputation: 77
I have the following code:
from math import sqrt
from collections import Counter

def forSearch():
    words = {'bit':{1:3,2:4,3:19,4:0},'shoe':{1:0,2:0,3:0,4:0},'dog':{1:3,2:0,3:4,4:5}, 'red':{1:0,2:0,3:15,4:0}}
    search = {'bit':1,'dog':3,'shoe':5}
    num_files = 4
    file_relevancy = Counter()
    c = sqrt(sum([x**2 for x in search.values()]))
    for i in range(1, num_files+1):
        words_ith_val = [words[x][i] for x in search.keys() ]
        a = sum([search[key] * words[key][i] for key in search.keys()])
        b = sqrt(sum([x**2 for x in words_ith_val]))
        file_relevancy[i] = (a / (b * c))
    return [x[0] for x in file_relevancy.most_common(num_files)]

print forSearch()
However, this raises a KeyError when a word is contained in search but not in words.
I want to do something like this inside the loop:
for i in range(1, num_files+1):
    if corresponding key in words cannot be found
        insert it and make its value = 0
    words_ith_val = [words[x][i] for x in search.keys() ]
Would that work, or does anyone have a better suggestion?
Upvotes: 0
Views: 134
Reputation: 2786
What about this code:
if key not in words:
    words[key] = {k+1: 0 for k in range(num_files)}
In your code, you could try:
for key in search.keys():
    if key not in words:
        words[key] = {k+1: 0 for k in range(num_files)}
    words_ith_val = [words[key][k + 1] for k in range(num_files)]
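To show where that check could sit in the original function, here is a minimal sketch. It assumes the fill-in runs once before the scoring loop, and it takes words, search and num_files as parameters (the name for_search and that signature are my own, not from the question). The guard for b == 0 is also an added assumption, so that a file containing none of the search terms scores 0 instead of dividing by zero.

```python
from math import sqrt
from collections import Counter

def for_search(words, search, num_files):
    # Give every search term that is missing from words an all-zero row.
    # Note: this modifies the words dict that was passed in.
    for key in search:
        if key not in words:
            words[key] = {k + 1: 0 for k in range(num_files)}

    file_relevancy = Counter()
    c = sqrt(sum(x ** 2 for x in search.values()))
    for i in range(1, num_files + 1):
        words_ith_val = [words[key][i] for key in search]
        a = sum(search[key] * words[key][i] for key in search)
        b = sqrt(sum(x ** 2 for x in words_ith_val))
        # Assumption: a file with none of the search terms gets score 0.
        file_relevancy[i] = a / (b * c) if b else 0.0
    return [x[0] for x in file_relevancy.most_common(num_files)]

words = {'bit': {1: 3, 2: 4, 3: 19, 4: 0}, 'shoe': {1: 0, 2: 0, 3: 0, 4: 0},
         'dog': {1: 3, 2: 0, 3: 4, 4: 5}, 'red': {1: 0, 2: 0, 3: 15, 4: 0}}
search = {'bit': 1, 'dog': 3, 'cat': 5}  # 'cat' is not in words
ranking = for_search(words, search, 4)
print(ranking)
```

With this data the missing word 'cat' no longer raises a KeyError; it simply contributes nothing to any file's score.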
Upvotes: 0
Reputation: 14404
You can use the defaultdict:
from collections import defaultdict
d = defaultdict(int)
This will initialize a dictionary where the keys are created on access and the default value is 0. You can use other types as well:
defaultdict(dict)
defaultdict(list)
They will be initialized with an empty dictionary/list. You can also supply your own default by passing any zero-argument callable as the factory. See https://docs.python.org/2/library/collections.html#collections.defaultdict for details.
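Since words in the question is a dictionary of dictionaries, a plain defaultdict(int) is not quite enough on its own. One possible sketch, assuming the same word -> {file number -> count} layout, nests two defaultdicts so that both a missing word and a missing file number fall back to 0. The names raw, known and missing are only for illustration.

```python
from collections import defaultdict

# Assumption: same layout as in the question, word -> {file number -> count}.
raw = {'bit': {1: 3, 2: 4, 3: 19, 4: 0}, 'dog': {1: 3, 2: 0, 3: 4, 4: 5}}

# The outer factory creates an inner defaultdict(int) for each unknown word.
words = defaultdict(lambda: defaultdict(int))
for word, counts in raw.items():
    words[word].update(counts)

known = words['bit'][3]    # existing entry is returned unchanged
missing = words['cat'][1]  # unknown word: falls back to 0
print(known)
print(missing)
```

Keep in mind that merely looking up a missing key inserts it, so after the lookup above 'cat' is a key in words.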
Upvotes: 2
Reputation: 799210
import collections
D = collections.defaultdict(int)
D['foo'] = 42
print D['foo'], D['bar']  # prints: 42 0
Upvotes: 2