Reputation: 3262
I have 2 lists of dictionaries.
list1 = [{'user_id':23, 'user_name':'John', 'age':30},
{'user_id':24, 'user_name':'Shaun', 'age':31},
{'user_id':25, 'user_name':'Johny', 'age':32}]
list2 =[{'user_id':23},
{'user_id':25}]
Now I want the output
list3 = [{'user_id':23, 'user_name':'John', 'age':30},
{'user_id':25, 'user_name':'Johny','age':32}]
I want the most efficient way because my list1 might contain millions of rows.
Upvotes: 4
Views: 1847
Reputation: 6121
You can use pandas to merge the two DataFrames:
1. Convert each list of dicts to a DataFrame.
2. Merge the two DataFrames on "user_id".
import pandas as pd
list1 = [{'user_id':23, 'user_name':'John', 'age':30},
{'user_id':24, 'user_name':'Shaun', 'age':31},
{'user_id':25, 'user_name':'Johny', 'age':32}]
list2 =[{'user_id':23},
{'user_id':25}]
df1 = pd.DataFrame(list1)
df1
age user_id user_name
0 30 23 John
1 31 24 Shaun
2 32 25 Johny
df2 = pd.DataFrame(list2)
df2
user_id
0 23
1 25
pd.merge(df2,df1,on='user_id')
user_id age user_name
0 23 30 John
1 25 32 Johny
Upvotes: 0
Reputation: 7432
As the other answers said, you need to build a set of the IDs from list2:
list2_ids = {d['user_id'] for d in list2}
After you've done this, you can also use the filter function:
list3 = list(filter(lambda x: x['user_id'] in list2_ids, list1))
This, while not optimized, has the benefit of having multiple implementations for parallel computation, which you might need if you're dealing with a large amount of data.
That being said, the best solution performance-wise is probably set intersection:
unique_ids = {d['user_id'] for d in list1} & {d['user_id'] for d in list2}
list3 = [x for x in list1 if x['user_id'] in unique_ids]
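If you are sure the lists contain no duplicate IDs, the sets are not needed for deduplication, but keep them anyway: the & operator and the fast membership test both depend on them.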
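One point worth knowing when list1 is very large: in Python 3, filter returns a lazy iterator rather than a list, so matches can be consumed one at a time instead of being held in memory all at once. A minimal sketch using the sample data from the question (the helper name iter_matches is hypothetical, not part of any library):

```python
list1 = [{'user_id': 23, 'user_name': 'John', 'age': 30},
         {'user_id': 24, 'user_name': 'Shaun', 'age': 31},
         {'user_id': 25, 'user_name': 'Johny', 'age': 32}]
list2 = [{'user_id': 23},
         {'user_id': 25}]


def iter_matches(records, wanted):
    """Lazily yield the records whose user_id appears in wanted."""
    # A set gives O(1) average-case membership tests.
    wanted_ids = {d['user_id'] for d in wanted}
    return filter(lambda rec: rec['user_id'] in wanted_ids, records)


matches = iter_matches(list1, list2)  # nothing has been evaluated yet
list3 = list(matches)                 # materialize only if a real list is needed
print(list3)
```

If the results are only going to be looped over once, iterating over the filter object directly avoids building list3 at all.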
Upvotes: 0
Reputation: 10631
I would transform your list1 into a dictionary where the key is the user_id and the value is the name and age.
A lookup in this dict is O(1) on average, even if the dict has a lot of elements.
In that case, the overall complexity of finding all user IDs is O(len(list2)):
dict1 = {23 : {'user_name':'John', 'age':30},
24 : {'user_name':'Shaun', 'age':31},
25 : {'user_name':'Johny', 'age':32}}
list2 =[{'user_id':23},
{'user_id':25}]
res = [dict1.get(user['user_id']) for user in list2 if user['user_id'] in dict1]
print(res)
# [{'user_name': 'John', 'age': 30}, {'user_name': 'Johny', 'age': 32}]
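The dictionary above is written out by hand; in practice it would be built from list1. A minimal sketch, assuming user_id is unique within list1 (with duplicates, the last record for an ID would win). The name by_id is illustrative, and here each value keeps the whole record so that user_id survives in the output, as the question asks:

```python
list1 = [{'user_id': 23, 'user_name': 'John', 'age': 30},
         {'user_id': 24, 'user_name': 'Shaun', 'age': 31},
         {'user_id': 25, 'user_name': 'Johny', 'age': 32}]
list2 = [{'user_id': 23},
         {'user_id': 25}]

# One pass over list1 to build the index: O(len(list1)).
by_id = {rec['user_id']: rec for rec in list1}

# One average-case O(1) lookup per entry of list2; unknown IDs are skipped.
list3 = [by_id[u['user_id']] for u in list2 if u['user_id'] in by_id]
print(list3)
```

Two things to keep in mind with this variant: the results come out in the order of list2 rather than list1, and the dicts in list3 are the same objects as in list1, not copies.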
Upvotes: 1
Reputation: 140188
You'll have to transform list2 a little bit to get a fast lookup. I'd make a set out of it:
list1 = [{'user_id':23, 'user_name':'John','age':30},
{'user_id':24, 'user_name':'Shaun','age':31},
{'user_id':25, 'user_name':'Johny','age':32}]
list2 =[{'user_id':23},
{'user_id':25}]
list2_ids = {d['user_id'] for d in list2}
Then build list3 using a filtered list comprehension. In that case, the test x['user_id'] in list2_ids is very fast because it uses the set's hash lookup and not a linear search:
list3 = [x for x in list1 if x['user_id'] in list2_ids]
print(list3)
result:
[{'user_id': 23, 'user_name': 'John', 'age': 30}, {'user_id': 25, 'user_name': 'Johny', 'age': 32}]
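To see why the set matters, here is a rough timing sketch on synthetic data. The sizes and the names records, wanted, ids_list and ids_set are arbitrary choices for illustration, and timings vary by machine, so no figures are quoted:

```python
import timeit

# Synthetic stand-ins for list1 and list2 (sizes chosen arbitrarily).
records = [{'user_id': i, 'user_name': 'user%d' % i, 'age': 30} for i in range(2000)]
wanted = [{'user_id': i} for i in range(0, 2000, 2)]

ids_list = [d['user_id'] for d in wanted]  # 'in' scans the list linearly
ids_set = set(ids_list)                    # 'in' uses a hash lookup


def with_list():
    return [x for x in records if x['user_id'] in ids_list]


def with_set():
    return [x for x in records if x['user_id'] in ids_set]


# Both approaches select exactly the same records.
assert with_list() == with_set()

print('list lookup:', timeit.timeit(with_list, number=3))
print('set lookup: ', timeit.timeit(with_set, number=3))
```

The gap should widen as list2 grows, since each list membership test costs time proportional to len(list2) while the set test stays roughly constant.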
Upvotes: 6