vando

Reputation: 37

Pandas boolean condition from nested list of dictionaries

 [{'id': 123,
  'type': 'salary', #Parent node
  'tx': 'house',
  'sector': 'EU',
  'transition': [{'id': 'hash', #Child node
    'id': 123,
    'type': 'salary',
    'tx': 'house' }]},
 {'userid': 123,
  'type': 'salary', #Parent node
  'tx': 'office',
  'transition': [{'id': 'hash', # Child node
    'id': 123,
    'type': 'salary',
    'tx': 'office'}]}]

I have a pandas column ('info') in which each row holds a nested list of dictionaries like the example above.

What I'm trying to build is a boolean condition that tells me whether this list has both of the following attributes:

  1. More than one parent node has 'type' == 'salary'
  2. The 'tx' field differs between at least two parent nodes with 'type' == 'salary'

So far I've tried flattening the list and filtering it, but that does not work for the first and second nodes:

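a = df.iloc[0]['info']  # use ['info']: attribute access can resolve to the Series.info method instead of the column
values = [item for d in a for item in d.values()]  # flatten the values of every parent dictionary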

Upvotes: 0

Views: 87

Answers (1)

Bushmaster

Reputation: 4608

If you want a one-line solution, you can use:

df['check'] = df['info'].apply(lambda x: sum(i['type'] == 'salary' for i in x) > 1 and len({i['tx'] for i in x if i['type'] == 'salary'}) > 1)

or (expanded):

def check(x):
    tx_list = [i['tx'] for i in x if i['type'] == 'salary']  # 'tx' values of parent nodes with type == 'salary'
    total_salary = len(tx_list)  # number of parent nodes with type == 'salary'
    if total_salary <= 1:
        return False  # also avoids an IndexError on tx_list[0] when there is no salary node
    # True when the salary nodes do not all share the same 'tx' value
    return tx_list.count(tx_list[0]) != total_salary

df['check'] = df['info'].apply(check)
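
To see the check on concrete rows, here is a minimal sketch of the same idea written with a set. The function name `has_conflicting_salary_tx` and the three sample rows are hypothetical, made up for illustration and only modelled on the structure in the question. It also uses `.get()` so that a parent node missing 'type' or 'tx' does not raise a KeyError, which is an assumption about the data rather than something the question states.

```python
import pandas as pd

def has_conflicting_salary_tx(nodes):
    """Return True if more than one parent node has type == 'salary'
    and those nodes do not all share the same 'tx' value."""
    # .get() tolerates parent nodes without 'type' or 'tx' (an assumption).
    salary_tx = [n.get('tx') for n in nodes if n.get('type') == 'salary']
    # More than one distinct 'tx' means the salary nodes disagree.
    return len(salary_tx) > 1 and len(set(salary_tx)) > 1

# Hypothetical sample data, one list of parent nodes per row.
df = pd.DataFrame({
    'info': [
        # two salary nodes with different 'tx'
        [{'id': 123, 'type': 'salary', 'tx': 'house'},
         {'id': 123, 'type': 'salary', 'tx': 'office'}],
        # two salary nodes with the same 'tx'
        [{'id': 1, 'type': 'salary', 'tx': 'house'},
         {'id': 1, 'type': 'salary', 'tx': 'house'}],
        # no salary node at all
        [{'id': 2, 'type': 'bonus', 'tx': 'house'}],
    ]
})

df['check'] = df['info'].apply(has_conflicting_salary_tx)
print(df['check'].tolist())
```

Only the parent nodes are inspected here; the nested 'transition' lists are ignored, since both conditions in the question refer to parent nodes.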

Upvotes: 1
