kavya8

Reputation: 149

Iterate through list of dictionary and identify similar values in dictionary in Python

Suppose I have a list of dictionaries as below:

[{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]

While iterating over this list, I want to check whether the value of the 'expressions' key (a sub-list) in one dictionary is the same as the 'expressions' value in another dictionary. In the example above, the dictionary whose 'users' value is User_2 has the same expressions as the one for User_3. In that case I want to delete the entire User_3 dictionary and append the value User_3 to the User_2 list (giving 'users': ['User_2', 'User_3']).

Expected output:

[{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2','User_3']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}]

Upvotes: 1

Views: 807

Answers (4)

Yuri Khristich

Reputation: 14502

orders = [{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
},{
    'name': 'User_ORDERS1236',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_3']
}]

for i, order in enumerate(orders):                # loop through the orders
    exp1 = order['expressions']                   # 'expressions' list of this order

    for next_order in orders[i+1:]:               # loop through the later orders
        exp2 = next_order['expressions']          # 'expressions' list of a later order

        if exp1 == exp2:                          # if the expressions are the same:
            order['users'] += next_order['users'] # add its users to this order's users
            next_order['users'] = []              # empty the later order's users

orders = [o for o in orders if o['users']]        # leave only the orders that have 'users'

print(orders)

Output

[{
    'name': 'User_ORDERS1234',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
    'users': ['User_2', 'User_3']
},{
    'name': 'User_ORDERS1235',
    'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
    'users': ['User_1']
}]

Upvotes: 1

gmdev

Reputation: 3155

You can use enumerate to get the index and value of each order in the list of orders. scanned_exp is a dictionary with the unique expression as the key and the value is the index in the list of orders in which the first occurrence of the unique expression was found. When iterating, we check if the current expression has already been scanned, i.e., in scanned_exp. If it has been found already, we extend the list of users at the index position of the first occurrence of that expression with the list of users from the current expression. We then delete the current order from the list using remove.

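scanned_exp = {}  # expression string -> index of the first order that has it
for idx, order in enumerate(d):  # d is the list of orders from the question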
    exp = order["expressions"][0]["exp"]
    if exp in scanned_exp:
        d[scanned_exp[exp]]["users"].extend(order["users"])
        d.remove(order)
    else:
        scanned_exp[exp] = idx

Your output then becomes:

[
    {
        'name': 'User_ORDERS1234', 
        'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 
        'users': ['User_2', 'User_3']
    }, 
    {
        'name': 'User_ORDERS1235', 
        'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 
        'users': ['User_1']
    }
]

Edit

Okay, let's make this dynamic. Firstly, the keys of a dictionary cannot be lists (unhashable type), so this breaks our original implementation above. A collection that can be used as a key is a tuple (unless the tuple contains unhashable types, e.g., list or dict). What we can do is build a tuple that contains all of the string values stored under the exp key.

So, you can replace this:

exp = order["expressions"][0]["exp"]

with this:

exp = tuple(e["exp"] for e in order["expressions"])
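
Putting the tuple key together with the scanning idea, here is a minimal sketch that builds a new list instead of removing items from the list being iterated, which avoids index shifts when more than one duplicate is present. The function name merge_by_expressions and the shortened expression strings in the sample are assumptions for illustration, not part of the original answer.

```python
def merge_by_expressions(orders):
    """Merge orders whose 'expressions' lists hold the same 'exp' strings."""
    first_seen = {}  # tuple of exp strings -> the order kept for that key
    merged = []
    for order in orders:
        key = tuple(e["exp"] for e in order["expressions"])
        if key in first_seen:
            # Same expressions seen before: only carry the users over.
            first_seen[key]["users"].extend(order["users"])
        else:
            # Shallow copy with its own users list, so the input stays unchanged.
            kept = {**order, "users": list(order["users"])}
            first_seen[key] = kept
            merged.append(kept)
    return merged


# Sample data with shortened (hypothetical) expression strings.
sample = [
    {"name": "User_ORDERS1234", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_2"]},
    {"name": "User_ORDERS1235", "expressions": [{"exp": "STATUS = 'Shipped'"}], "users": ["User_1"]},
    {"name": "User_ORDERS1236", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_3"]},
]
result = merge_by_expressions(sample)
print(result)
```

Because the result is a separate list, the original d is left as it was; assign the return value back to d if you want to replace it.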

Upvotes: 1

coderoftheday

Reputation: 2075

dictt = [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]



def sorting_it(d):
    for n,c in enumerate([x['expressions'] for x in dictt]):
        if c == d['expressions'] and dictt[n] != d and d['users']:
            d['users'] = d['users'] + dictt[n]['users']
            del dictt[n]
f = list(map(sorting_it,dictt))

print(dictt)

>>> [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2', 'User_3']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}]

Explanation:

f = list(map(sorting_it,dictt))

Using the map function, every dictionary in dictt is passed to the function sorting_it one at a time as the variable d, so the first is:

{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}

Now I loop through the values of the key 'expressions'; [x['expressions'] for x in dictt] is the list of those values.

If the value of key 'expressions' in d is equal to a value in [x['expressions'] for x in dictt], I take its index n, use it to find the corresponding dictionary in dictt, and add the values for key 'users' together.

I then do del dictt[n], since the user of that dictionary has already been added to another dictionary. In this case the dictionary for 'User_3' is deleted because its user was added to the dictionary for 'User_2'.

Also, dictt[n] != d makes sure I'm not comparing a dictionary with itself, and d['users'] skips dictionaries whose user list is empty.
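
The merge-then-delete step described here can also be written as an explicit loop. This is only a sketch of that idea, assuming the same list-of-dictionaries shape; merge_in_place and the shortened sample expressions are hypothetical names and values, not part of the code above. The inner index is advanced only when nothing was deleted, so an item that slides into the freed slot is still checked.

```python
def merge_in_place(orders):
    """Merge users of later duplicates into the first match, deleting the duplicates."""
    i = 0
    while i < len(orders):
        current = orders[i]
        n = i + 1
        while n < len(orders):
            if orders[n]["expressions"] == current["expressions"]:
                current["users"] = current["users"] + orders[n]["users"]
                del orders[n]  # do not advance n: the next item moved into slot n
            else:
                n += 1
        i += 1
    return orders


# Sample data with shortened (hypothetical) expression strings.
sample_orders = [
    {"name": "User_ORDERS1234", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_2"]},
    {"name": "User_ORDERS1235", "expressions": [{"exp": "STATUS = 'Shipped'"}], "users": ["User_1"]},
    {"name": "User_ORDERS1236", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_3"]},
    {"name": "User_ORDERS1237", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_4"]},
]
merge_in_place(sample_orders)
print(sample_orders)
```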

Upvotes: 0

gañañufla

Reputation: 562

def function_1(values):
  for j in range(len(values)):
    for k in range(j + 1, len(values)):
      if values[j]['expressions'] == values[k]['expressions']:
        values[j]['users'] = values[j]['users'] + values[k]['users'] 
  return values

# Example input

list_values = [{'name': 'User_ORDERS1234', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_2']}, {'name': 'User_ORDERS1235', 'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}], 'users': ['User_1']}, {'name': 'User_ORDERS1236', 'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}], 'users': ['User_3']}]

#call the function

function_1(list_values)

[{'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
  'name': 'User_ORDERS1234',
  'users': ['User_2', 'User_3']},
 {'expressions': [{'exp': '"table"."ORDERS"."STATUS"  = \'Shipped\''}],
  'name': 'User_ORDERS1235',
  'users': ['User_1']},
 {'expressions': [{'exp': '"table"."ORDERS"."STATUS" IN (\'Canceled\',\'Pending\')'}],
  'name': 'User_ORDERS1236',
  'users': ['User_3']}]
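
The output above still contains the 'User_ORDERS1236' entry, while the question asks for it to be dropped. One possible way to extend the same nested-loop idea is to remember which indexes were merged and filter them out at the end. This is a sketch under that assumption; merge_and_drop and the shortened sample expressions are hypothetical, not part of the answer's code.

```python
def merge_and_drop(values):
    """Nested-loop merge that also drops the entries whose users were merged."""
    merged_away = set()  # indexes of entries already merged into an earlier one
    for j in range(len(values)):
        if j in merged_away:
            continue
        for k in range(j + 1, len(values)):
            if k not in merged_away and values[j]["expressions"] == values[k]["expressions"]:
                values[j]["users"] = values[j]["users"] + values[k]["users"]
                merged_away.add(k)
    # Keep only the entries that were not merged into another one.
    return [v for idx, v in enumerate(values) if idx not in merged_away]


# Sample data with shortened (hypothetical) expression strings.
sample_values = [
    {"name": "User_ORDERS1234", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_2"]},
    {"name": "User_ORDERS1235", "expressions": [{"exp": "STATUS = 'Shipped'"}], "users": ["User_1"]},
    {"name": "User_ORDERS1236", "expressions": [{"exp": "STATUS IN ('Canceled','Pending')"}], "users": ["User_3"]},
]
deduplicated = merge_and_drop(sample_values)
print(deduplicated)
```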

Upvotes: 0
