Reputation: 163
I need to write a function that finds the pair of people with the most hobbies in common, that is, the pair with the highest ratio of common hobbies to different hobbies. If several pairs have the same best ratio, it doesn't matter which pair is returned. The only exception is when several pairs share all of their hobbies, in which case the pair with the most shared hobbies should be returned.
def find_two_people_with_most_common_hobbies(data: str) -> tuple:
    new_dict = create_dictionary(data)  # creates a dictionary in the form {name1: [hobby1, hobby2, ...], name2: [...]}
    value_list = []  # list that stores all hobbies, duplicates included
    for value in new_dict.items():
        for ele in value[1]:
            value_list.append(ele)
    filtered_list = set([x for x in value_list if value_list.count(x) > 1])  # list where hobbies appear more than once, no duplicates
    return tuple([k for k, v in new_dict.items() if set(v).intersection(filtered_list)])
So, given the input "John:running\nJohn:walking\nMary:dancing\nMary:running\nNora:running\nNora:singing\nNora:dancing", the output should be ('Mary', 'Nora'). My code returns ('John', 'Mary', 'Nora'), because it looks for an intersection between each person's hobbies in the dictionary and the filtered list, so everyone with at least one shared hobby is included. I don't understand how to make it return only the pair that shares the most hobbies.
Upvotes: 0
Views: 165
Reputation: 147146
I would do this as follows:
import itertools
dd = {'John': ['running', 'walking'], 'Mary': ['dancing', 'running'], 'Nora': ['running', 'singing', 'dancing' ]}
ss = { k : set(v) for k, v in dd.items() }
# {'John': {'walking', 'running'}, 'Mary': {'dancing', 'running'}, 'Nora': {'singing', 'running', 'dancing'}}
pp = [t for t in itertools.combinations(dd.keys(), 2)]
# [('John', 'Mary'), ('John', 'Nora'), ('Mary', 'Nora')]
hh = { (p1, p2) : (len(ss[p1] & ss[p2]), len(ss[p1] ^ ss[p2])) for p1, p2 in pp }
# {('John', 'Mary'): (1, 2), ('John', 'Nora'): (1, 3), ('Mary', 'Nora'): (2, 1)}
def most_shared(key):
    try:
        ratio = hh[key][0] / hh[key][1]
    except ZeroDivisionError:
        ratio = float('inf')
    return (ratio, hh[key][0])
res = max(hh, key=most_shared)
# ('Mary', 'Nora')
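Putting these steps together into one function that takes the raw input string from the question might look like the sketch below. `parse_hobbies` is a hypothetical stand-in for the asker's `create_dictionary`, whose implementation isn't shown, and the sketch assumes the input contains at least two people (otherwise `max` raises `ValueError`).

```python
import itertools


def parse_hobbies(data: str) -> dict:
    """Hypothetical stand-in for create_dictionary: maps each name to a set of hobbies."""
    hobbies = {}
    for line in data.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, hobby = line.split(":", 1)
        hobbies.setdefault(name, set()).add(hobby)
    return hobbies


def find_two_people_with_most_common_hobbies(data: str) -> tuple:
    hobbies = parse_hobbies(data)

    def score(pair):
        a, b = pair
        common = len(hobbies[a] & hobbies[b])      # hobbies both people have
        different = len(hobbies[a] ^ hobbies[b])   # hobbies only one of them has
        # A pair with no differing hobbies gets an infinite ratio;
        # the second tuple element breaks ties by number of shared hobbies.
        ratio = common / different if different else float("inf")
        return (ratio, common)

    return max(itertools.combinations(hobbies, 2), key=score)


data = "John:running\nJohn:walking\nMary:dancing\nMary:running\nNora:running\nNora:singing\nNora:dancing"
print(find_two_people_with_most_common_hobbies(data))  # ('Mary', 'Nora')
```

Returning `(ratio, common)` as the key means that pairs sharing all of their hobbies tie on `inf` and are then ordered by how many hobbies they share, which matches the exception described in the question.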
Upvotes: 1
Reputation: 664
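s = "John:running\nJohn:walking\nMary:dancing\nMary:running\nNora:running\nNora:singing\nNora:dancing"
d={}
# build {name: [hobby, ...]} from the "name:hobby" lines
for v in s.split('\n'):
    k,v=v.split(':')
    if k in d:
        d[k].append(v)
    else:
        d[k]=[v]
for k1,v1 in d.items():
    for k2,v2 in d.items():
        if k1!=k2:
            for v in v1:
                if v not in v2:
                    break
            else:
                # for-else: runs only if the loop did not break,
                # i.e. every hobby of k1 is also a hobby of k2
                print(k1,k2)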
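The nested loop with `for ... else` reports every ordered pair where all of the first person's hobbies also appear in the second person's list. The same containment check can be sketched with set operations; `subset_pairs` is a name made up for this illustration, and it only tests for full containment rather than computing a ratio.

```python
import itertools


def subset_pairs(hobbies: dict) -> list:
    """Return ordered pairs (a, b) where every hobby of a is also a hobby of b."""
    sets = {name: set(items) for name, items in hobbies.items()}
    # permutations yields both (a, b) and (b, a), like the nested loops over d.items()
    return [(a, b) for a, b in itertools.permutations(sets, 2) if sets[a] <= sets[b]]


d = {
    'John': ['running', 'walking'],
    'Mary': ['dancing', 'running'],
    'Nora': ['running', 'singing', 'dancing'],
}
print(subset_pairs(d))  # [('Mary', 'Nora')]
```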
Upvotes: 1