Reputation: 297
I need to check for duplicate values that might occur in a dictionary. I have a dictionary in the following layout. Any advice is welcome! Thanks so much.
The original dictionary:
dic = {'ab1': [{'ans': 'Male', 'val': '1'},
               {'ans': 'Female', 'val': '2'},
               {'ans': 'Other', 'val': '3'},
               {'ans': 'Prefer not to answer', 'val': '3'}],
       'bc1': [{'ans': 'Employed', 'val': '1'},
               {'ans': 'Unemployed', 'val': '2'},
               {'ans': 'Student', 'val': '3'},
               {'ans': 'Retired', 'val': '4'},
               {'ans': 'Part-time', 'val': '5'},
               {'ans': 'Prefer not to answer', 'val': '7'}],
       'bc2': [{'ans': 'Mother', 'val': '1'},
               {'ans': 'Father ', 'val': '2'},
               {'ans': 'Brother', 'val': '3'},
               {'ans': 'Sister', 'val': '4'},
               {'ans': 'Grandmother', 'val': '4'},
               {'ans': 'Grandfather', 'val': '6'},
               {'ans': 'Son', 'val': '7'},
               {'ans': 'Daughter', 'val': '8'}]}
The expected output is a list that contains ONLY the items whose values are duplicated within a key, so only this:
ab1: Other 3, Prefer not to answer 3
bc2: Sister 4, Grandmother 4
The code I have tried aims to reverse the dictionary first, but it throws an "unhashable type: 'list'" error. I think this is because it treats the value as a list when in fact it might need to be a tuple, but I don't know how to change it:
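from itertools import chain

rev_dict = {}
for k, v in dic.items():
    # v is a list here, so using it as a dictionary key raises
    # TypeError: unhashable type: 'list'
    rev_dict.setdefault(v, set()).add(k)
res = set(chain.from_iterable(v for k, v in rev_dict.items()
                              if len(v) > 1))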
Upvotes: 4
Views: 734
Reputation: 2283
The pandas answer is certainly nicer, but here is a plain-Python version using collections.Counter:
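from collections import Counter

lst = []
for key, items in dic.items():
    # Count how often each 'val' occurs under this key
    counts = Counter(item['val'] for item in items)
    # Keep only the answers whose 'val' occurs more than once
    new = {item['ans']: item['val'] for item in items if counts[item['val']] > 1}
    if new:
        lst.append(key + ': ' + ', '.join('{} {}'.format(ans, new[ans]) for ans in new))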
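The reversing idea from the question can also be made to work if each list is reversed separately and keyed on the 'val' string, which is hashable, instead of on the whole list. The following is only a minimal sketch under that assumption; the function name find_duplicates and the shortened sample data are made up for illustration.

```python
def find_duplicates(dic):
    """Return {key: [(ans, val), ...]} for answers sharing a 'val' within a key."""
    result = {}
    for key, items in dic.items():
        # Reverse one list at a time: val (a hashable string) -> list of answers
        by_val = {}
        for item in items:
            by_val.setdefault(item['val'], []).append(item['ans'])
        # Keep only the values that more than one answer maps to
        dupes = [(ans, val) for val, answers in by_val.items()
                 if len(answers) > 1 for ans in answers]
        if dupes:
            result[key] = dupes
    return result


# Shortened, illustrative sample in the same layout as the question's dic
sample = {
    'ab1': [{'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'},
            {'ans': 'Male', 'val': '1'}],
    'bc1': [{'ans': 'Employed', 'val': '1'}],
}
for key, pairs in find_duplicates(sample).items():
    print(key + ': ' + ', '.join('{} {}'.format(ans, val) for ans, val in pairs))
# prints: ab1: Other 3, Prefer not to answer 3
```

Keys without any duplicated value (here 'bc1') are simply left out of the result.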
Upvotes: 1
Reputation: 78650
You've not specified an exact output format, but since you tagged pandas, here's a pandas solution.
import pandas as pd
{k: pd.DataFrame(v)[lambda df: df['val'].duplicated(keep=False)].to_dict(orient='records') for k, v in dic.items()}
Output:
{
'ab1': [{'ans': 'Other', 'val': '3'},
{'ans': 'Prefer not to answer', 'val': '3'}],
'bc1': [],
'bc2': [{'ans': 'Sister', 'val': '4'}, {'ans': 'Grandmother', 'val': '4'}]
}
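If the one-line-per-key format from the question is wanted, a dictionary of this shape can be formatted afterwards in plain Python. This is just a sketch: the helper name format_duplicates is hypothetical, and result is assumed to hold the output shown above.

```python
def format_duplicates(result):
    """Turn {key: [{'ans': ..., 'val': ...}, ...]} into 'key: ans val, ans val' lines."""
    lines = []
    for key, records in result.items():
        if not records:  # skip keys without duplicates, e.g. 'bc1'
            continue
        pairs = ', '.join('{} {}'.format(r['ans'], r['val']) for r in records)
        lines.append('{}: {}'.format(key, pairs))
    return lines


# Assumed to be the dictionary produced by the pandas expression above
result = {
    'ab1': [{'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'}],
    'bc1': [],
    'bc2': [{'ans': 'Sister', 'val': '4'}, {'ans': 'Grandmother', 'val': '4'}],
}
print('\n'.join(format_duplicates(result)))
# prints:
# ab1: Other 3, Prefer not to answer 3
# bc2: Sister 4, Grandmother 4
```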
Upvotes: 2
Reputation: 825
Import itertools and try this:
list(itertools.chain(*[[(k, i['ans'],i['val']) for i in v] for k, v in dic.items()]))
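Long version:
lst = []
for k, v in dic.items():
    for i in v:
        tup = (k, i['ans'], i['val'])
        lst.append(tup)
# The nested loops already build a flat list of (key, ans, val) tuples,
# so no itertools.chain is needed here (chaining would split the tuples apart)
lst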
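A flat list of (key, ans, val) tuples like the one the one-liner produces does not yet pick out the duplicates. One possible follow-up step, sketched here under the assumption that the rows have exactly that three-field shape, is to group them by (key, val) and keep the groups with more than one answer. The name duplicates_from_tuples and the sample rows are illustrative only.

```python
from collections import defaultdict


def duplicates_from_tuples(rows):
    """Group (key, ans, val) tuples by (key, val) and keep groups with 2+ answers."""
    groups = defaultdict(list)
    for key, ans, val in rows:
        # A (key, val) tuple is hashable, so it can serve as a dictionary key
        groups[(key, val)].append(ans)
    return {k: answers for k, answers in groups.items() if len(answers) > 1}


# Illustrative subset of the flattened rows
rows = [('bc2', 'Brother', '3'), ('bc2', 'Sister', '4'), ('bc2', 'Grandmother', '4')]
print(duplicates_from_tuples(rows))
# prints: {('bc2', '4'): ['Sister', 'Grandmother']}
```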
Upvotes: 0