ConstantLearner

Reputation: 15

Grouping similar values in a dictionary

I'm new to programming and would appreciate help with the following in Python/pandas. I have a dictionary whose values are lists, and I'd like to group together the keys that have the same values. I've seen similar questions on here, but the catch in this case is that I want to disregard the order of the values. For example:

classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}

jack and charles have the same values, but in a different order. I'd like output that groups the keys by their values irrespective of order. In this case, the output would be written to a CSV as:

['20','male','soccer']: jack, charles
['26','male','tennis']: brian
['19','basketball','male']: zulu

Upvotes: 0

Views: 135

Answers (5)

Aaditya Ura

Reputation: 12679

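You could do this in one line (note that it re-scans the whole dictionary for every value, so it is quadratic in the number of keys):

print({tuple(sorted(v)): [k for k, vv in classmates.items() if sorted(vv) == sorted(v)] for v in classmates.values()})

Or, here is a more detailed solution: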

dict_1 = {'jack': ['20', 'male', 'soccer'], 'brian': ['26', 'male', 'tennis'], 'charles': ['male', 'soccer', '20'],
     'zulu': ['19', 'basketball', 'male']}

sorted_dict = {}
for key,value in dict_1.items():
    sorted_1 = sorted(value)
    sorted_dict[key] = sorted_1

final_dict = {}
for key1, value1 in sorted_dict.items():
    # Lists are unhashable, so use the sorted values as a tuple key.
    group_key = tuple(value1)
    if group_key not in final_dict:
        final_dict[group_key] = [key1]
    else:
        final_dict[group_key].append(key1)

print(final_dict)

Upvotes: 0

tkhurana96

Reputation: 939

from collections import defaultdict

ans = defaultdict(list)

classmates={'jack':['20','male','soccer'],
            'brian':['26','male','tennis'],
            'charles':['male','soccer','20'],
            'zulu':['19','basketball','male']
           }


for k, v in classmates.items():
    sorted_tuple = tuple(sorted(v))
    ans[sorted_tuple].append(k)

# ans now holds the grouping you asked for:
# defaultdict(<class 'list'>, {('20', 'male', 'soccer'): ['jack','charles'],
# ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']})

for k, v in ans.items():
    print(k, ':', v)

# output: 
# ('20', 'male', 'soccer') : ['jack', 'charles']
# ('26', 'male', 'tennis') : ['brian']
# ('19', 'basketball', 'male') : ['zulu']
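
Since the question also asks for the result to end up in a CSV, here is a minimal sketch of that last step using the standard csv module. The helper names (group_by_values, write_groups_csv), the column headers, and the '|' separator for the value list are my own assumptions, not something from the question. An in-memory buffer stands in for a real file.

```python
import csv
import io
from collections import defaultdict

classmates = {
    'jack': ['20', 'male', 'soccer'],
    'brian': ['26', 'male', 'tennis'],
    'charles': ['male', 'soccer', '20'],
    'zulu': ['19', 'basketball', 'male'],
}


def group_by_values(mapping):
    """Group keys whose value lists match when order is ignored."""
    groups = defaultdict(list)
    for name, values in mapping.items():
        # Sorting makes the order irrelevant; tuple() makes the key hashable.
        groups[tuple(sorted(values))].append(name)
    return dict(groups)


def write_groups_csv(groups, handle):
    """Write one row per group (hypothetical layout: values, names)."""
    writer = csv.writer(handle)
    writer.writerow(['values', 'names'])
    for values, names in groups.items():
        writer.writerow(['|'.join(values), ', '.join(names)])


groups = group_by_values(classmates)

buffer = io.StringIO()  # stand-in for an open file
write_groups_csv(groups, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write an actual file, you could pass a handle from open('groups.csv', 'w', newline='') instead of the StringIO buffer; the file name here is just a placeholder.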

Upvotes: 1

Ashutosh Chapagain

Reputation: 926

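First, convert your dictionary to a pandas DataFrame:

import pandas as pd

df = pd.DataFrame.from_dict(classmates, orient='index')

Then sort it in ascending order by the first column:

df = df.sort_values(by=0, ascending=True)

Here 0 is the default column name; you can rename it. Note that column 0 holds the age only for rows whose list starts with the age (charles's list starts with 'male'), and sorting on its own does not group the keys.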

Upvotes: 0

cs95

Reputation: 403050

Using frozensets, apply, groupby + agg:

import pandas as pd

s = pd.DataFrame(classmates).T.apply(frozenset, axis=1)

s2 = pd.Series(s.index.values, index=s)\
          .groupby(level=0).agg(lambda x: list(x))

s2
(soccer, 20, male)        [charles, jack]
(26, male, tennis)                [brian]
(basketball, male, 19)             [zulu]
dtype: object
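
One caveat worth checking before relying on frozenset as the key: a set discards repeated elements, so two lists that differ only in how often a value appears end up in the same group. A sorted tuple keeps the counts. The two sample lists below are made up for illustration and are not from the question.

```python
# Hypothetical lists that contain the same distinct values
# but with different repetition counts.
a = ['male', 'male', 'soccer']
b = ['male', 'soccer', 'soccer']

# frozenset drops duplicates, so both collapse to {'male', 'soccer'}.
same_as_frozenset = frozenset(a) == frozenset(b)

# A sorted tuple preserves duplicates, so the two stay distinct.
same_as_sorted_tuple = tuple(sorted(a)) == tuple(sorted(b))

print(same_as_frozenset, same_as_sorted_tuple)
```

If your lists never contain repeated values, as in the classmates example, both keys give the same grouping.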

Upvotes: 2

Jeremy McGibbon

Reputation: 3785

You can invert the dictionary in the way you want with the following code:

classmates={'jack':['20','male','soccer'],'brian':['26','male','tennis'],'charles':['male','soccer','20'],'zulu':['19','basketball','male']}

out_dict = {}
for key, value in classmates.items():
    current_list = out_dict.get(tuple(sorted(value)), [])
    current_list.append(key)
    out_dict[tuple(sorted(value))] = current_list

print(out_dict)

This prints (on Python 3.7+, where dictionaries preserve insertion order):

{('20', 'male', 'soccer'): ['jack', 'charles'], ('26', 'male', 'tennis'): ['brian'], ('19', 'basketball', 'male'): ['zulu']}

Upvotes: 1
