b101

Reputation: 297

List duplicate values in a nested dictionary

I need to check for duplicate values that might occur in a dictionary. I have a dictionary in the following layout. Any advice is welcome! Thanks so much.

The original dictionary:

dic = {'ab1': [{'ans': 'Male', 'val': '1'},
  {'ans': 'Female', 'val': '2'},
  {'ans': 'Other', 'val': '3'},
  {'ans': 'Prefer not to answer', 'val': '3'}],
 'bc1': [{'ans': 'Employed', 'val': '1'},
  {'ans': 'Unemployed', 'val': '2'},
  {'ans': 'Student', 'val': '3'},
  {'ans': 'Retired', 'val': '4'},
  {'ans': 'Part-time', 'val': '5'},
  {'ans': 'Prefer not to answer', 'val': '7'}],
 'bc2': [{'ans': 'Mother',
   'val': '1'},
  {'ans': 'Father ', 'val': '2'},
  {'ans': 'Brother', 'val': '3'},
  {'ans': 'Sister', 'val': '4'},
  {'ans': 'Grandmother', 'val': '4'},
  {'ans': 'Grandfather', 'val': '6'},
  {'ans': 'Son', 'val': '7'},
  {'ans': 'Daughter', 'val': '8'}]}

The expected output is a list that contains ONLY the items that share a value within the same key, so only this:

ab1: Other 3, Prefer not to answer 3
bc2: Sister 4, Grandmother 4

The code I have tried aims to reverse the dictionary first, but it throws a "TypeError: unhashable type: 'list'" error. I think that is because it treats the value as a list when in fact the dict might be a tuple, but I don't know how to change it.

from itertools import chain

rev_dict = {}

for k, v in dic.items():
    # this is the line that raises the error
    rev_dict.setdefault(v, set()).add(k)

res = set(chain.from_iterable(v for k, v in rev_dict.items()
                              if len(v) > 1))

Upvotes: 4

Views: 734

Answers (3)

fzzylogic

Reputation: 2283

The pandas answer is certainly nicer, but here is a version using collections.Counter:

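from collections import Counter

lst = []
for key, items in dic.items():
    # Count how often each 'val' occurs under this key
    counts = Counter(item['val'] for item in items)
    # Keep only the answers whose 'val' occurs more than once
    new = {item['ans']: item['val'] for item in items if counts[item['val']] > 1}
    if new:
        lst.append(key + ': ' + ', '.join('{} {}'.format(ans, val) for ans, val in new.items()))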
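
For reference, the "unhashable type: 'list'" error in the question comes from rev_dict.setdefault(v, set()): each v is a list of dicts, and lists cannot be used as dictionary keys. Below is a minimal sketch of the same reverse-lookup idea, keyed on each 'val' string instead. The function name duplicates_by_key is made up for illustration, and dic is shortened here.

```python
from collections import defaultdict

# Shortened sample in the same layout as the question's `dic`
dic = {
    'ab1': [{'ans': 'Male', 'val': '1'},
            {'ans': 'Female', 'val': '2'},
            {'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'}],
    'bc1': [{'ans': 'Employed', 'val': '1'},
            {'ans': 'Unemployed', 'val': '2'}],
}


def duplicates_by_key(data):
    """Return {key: {val: [ans, ...]}} for every val used more than once under a key."""
    result = {}
    for key, items in data.items():
        # Reverse lookup: val (a hashable string) -> answers that use it
        by_val = defaultdict(list)
        for item in items:
            by_val[item['val']].append(item['ans'])
        dupes = {val: answers for val, answers in by_val.items() if len(answers) > 1}
        if dupes:
            result[key] = dupes
    return result


print(duplicates_by_key(dic))
# {'ab1': {'3': ['Other', 'Prefer not to answer']}}
```

Keys without any repeated value are left out of the result entirely, which matches the expected output in the question.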

Upvotes: 1

timgeb

Reputation: 78650

You've not specified an exact output format, but since you tagged pandas, here's a pandas solution.

import pandas as pd
# For each key, keep only the rows whose 'val' occurs more than once
{k: pd.DataFrame(v)[lambda df: df['val'].duplicated(keep=False)].to_dict(orient='records')
 for k, v in dic.items()}

Output:

{
    'ab1': [{'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'}],
    'bc1': [],
    'bc2': [{'ans': 'Sister', 'val': '4'}, {'ans': 'Grandmother', 'val': '4'}]
}
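
If the one-line-per-key text from the question is wanted, the dict above can be formatted afterwards. This is only a small sketch in plain Python, assuming result holds the output shown; the names result and lines are illustrative.

```python
# `result` mirrors the output shown above
result = {
    'ab1': [{'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'}],
    'bc1': [],
    'bc2': [{'ans': 'Sister', 'val': '4'}, {'ans': 'Grandmother', 'val': '4'}],
}

# Skip keys with no duplicates and join each remaining record as "ans val"
lines = [
    '{}: {}'.format(key, ', '.join('{} {}'.format(r['ans'], r['val']) for r in records))
    for key, records in result.items()
    if records
]

for line in lines:
    print(line)
# ab1: Other 3, Prefer not to answer 3
# bc2: Sister 4, Grandmother 4
```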

Upvotes: 2

Pawan Jain

Reputation: 825

Import itertools and try this:

list(itertools.chain(*[[(k, i['ans'],i['val']) for i in v] for k, v in dic.items()]))

Long version:

lst = []
for k, v in dic.items():
    for i in v:
        # One (key, answer, value) tuple per entry
        tup = (k, i['ans'], i['val'])
        lst.append(tup)

# lst is already a flat list of tuples; chaining it again would split the tuples apart
lst
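
The list built above still contains every entry, duplicates or not. One possible follow-up, sketched under the assumption that lst holds the (key, ans, val) tuples (shortened here), is to count each (key, val) pair and keep only the tuples whose pair occurs more than once. The names pair_counts and dupes are illustrative.

```python
from collections import Counter

# Shortened stand-in for the (key, ans, val) tuples
lst = [
    ('ab1', 'Female', '2'),
    ('ab1', 'Other', '3'),
    ('ab1', 'Prefer not to answer', '3'),
    ('bc1', 'Student', '3'),
]

# Count how often each val occurs within the same key
pair_counts = Counter((k, val) for k, _, val in lst)

# Keep only the tuples whose (key, val) pair is shared
dupes = [t for t in lst if pair_counts[(t[0], t[2])] > 1]
print(dupes)
# [('ab1', 'Other', '3'), ('ab1', 'Prefer not to answer', '3')]
```

Counting on the (key, val) pair rather than on val alone keeps a '3' under bc1 from being treated as a duplicate of a '3' under ab1.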

Upvotes: 0
