Reputation: 4808
I have two lists, and I need to extract values from them by name and then multiply them.
These are the lists:
query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
['Radiosporten Play', 'play', 0.10769448286014331], ['Radiosporten Play', 'free', 0.0]]
I need to group them by name, for example:
{Aftonbladet: {play: 0.0, free: 0.0}, Radiosporten Play: {play: 0.10769448286014331, free: 0.0}}
Then I need to take the values for each name, multiply them by query_tfidf,
and compute the values below. For example:
for each name:
dot_product = (play_value * query_tfidf[0]) + (free_value * query_tfidf[1])
query = sqrt((query_tfidf[0])^2 + (query_tfidf[1])^2)
document = sqrt((play_value)^2 + (free_value)^2)
I'm a little bit desperate, so I wanted to ask here. I'm using Python 2.7.
Upvotes: 0
Views: 79
Reputation: 7146
Grouping the entries in your documents_query
by name and keyword is straightforward using dictionaries:
indexedValues = {}
for entry in documents_query:
    if entry[0] not in indexedValues:
        indexedValues[entry[0]] = {}
    indexedValues[entry[0]][entry[1]] = entry[2]
This will give you indexedValues
that looks like what you asked for:
{'Aftonbladet': {'play': 0.0, 'free': 0.0}, 'Radiosporten Play': {'play': 0.10769448286014331, 'free': 0.0}}
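From there, the three formulas in the question can be computed per name. This is only a rough sketch: the keywords list is my own addition, and it assumes that query_tfidf[0] belongs to 'play' and query_tfidf[1] to 'free', as your formulas suggest. A plain dict has no guaranteed order in Python 2.7, so the order has to be spelled out somewhere. The scores dict is also just a name I picked to hold the results.

```python
from math import sqrt

query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
                   ['Radiosporten Play', 'play', 0.10769448286014331],
                   ['Radiosporten Play', 'free', 0.0]]

# Assumption: position i in query_tfidf corresponds to keywords[i].
keywords = ['play', 'free']

# Group the flat entries by name, then by keyword.
indexedValues = {}
for entry in documents_query:
    if entry[0] not in indexedValues:
        indexedValues[entry[0]] = {}
    indexedValues[entry[0]][entry[1]] = entry[2]

# The query norm does not depend on the document, so compute it once.
query = sqrt(sum(q ** 2 for q in query_tfidf))

# scores maps each name to (dot_product, document), as defined in the question.
scores = {}
for name, values in indexedValues.items():
    doc_vector = [values[k] for k in keywords]
    dot_product = sum(d * q for d, q in zip(doc_vector, query_tfidf))
    document = sqrt(sum(d ** 2 for d in doc_vector))
    scores[name] = (dot_product, document)
```

Note that ^ in Python is bitwise XOR, not a power operator, which is why the sketch uses ** for squaring.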
Upvotes: 1
Reputation: 6387
Use collections.defaultdict
to aggregate your data:
from collections import defaultdict
results = defaultdict(dict)
for main_key, key, value in documents_query:
    results[main_key][key] = value
# dict(results)
# Out[16]:
# {'Aftonbladet': {'free': 0.0, 'play': 0.0},
# 'Radiosporten Play': {'free': 0.0, 'play': 0.10769448286014331}}
What you are going to do with it later is a bit unclear... but you should figure it out yourself, right?
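If I had to guess: a dot product plus the two norms are the ingredients of cosine similarity, so maybe that is the goal. A possible sketch under that assumption follows. The cosine_similarity helper and the keywords list are hypothetical names of mine, not something from the question, and I assume query_tfidf is ordered as 'play', then 'free'. Returning 0.0 for an all-zero vector is also my own choice, made to avoid dividing by zero for 'Aftonbladet'.

```python
from collections import defaultdict
from math import sqrt

def cosine_similarity(doc_vector, query_vector):
    """Return dot / (|doc| * |query|), or 0.0 if either vector is all zeros."""
    dot_product = sum(d * q for d, q in zip(doc_vector, query_vector))
    document = sqrt(sum(d ** 2 for d in doc_vector))
    query = sqrt(sum(q ** 2 for q in query_vector))
    norms = document * query
    return dot_product / norms if norms else 0.0

query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
                   ['Radiosporten Play', 'play', 0.10769448286014331],
                   ['Radiosporten Play', 'free', 0.0]]

# Assumption: position i in query_tfidf corresponds to keywords[i].
keywords = ['play', 'free']

results = defaultdict(dict)
for main_key, key, value in documents_query:
    results[main_key][key] = value

# One similarity value per name.
similarity = {}
for name, values in results.items():
    similarity[name] = cosine_similarity([values[k] for k in keywords], query_tfidf)
```

With this data, 'Aftonbladet' has only zero weights and falls into the 0.0 branch, while 'Radiosporten Play' matches the query on one of two equally weighted keywords.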
Upvotes: 1