Reputation: 581
I have two lists:
a = [0, 0, 0, 1, 1, 1, 1, 1, .... 99999]
b = [24, 53, 88, 32, 45, 24, 88, 53, ...... 1]
I want to merge those two lists into a dictionary like:
{
0: [24, 53, 88],
1: [32, 45, 24, 88, 53],
......
99999: [1]
}
One solution is a for loop, which does not look very elegant:
d = {}
for i in range(len(a)):
    if a[i] in d:  # test membership on the dict itself
        d[a[i]].append(b[i])
    else:
        d[a[i]] = [b[i]]  # first value seen for this key
This works, but it is inefficient and takes too long when the lists are extremely large. Is there a more elegant way to construct such a dictionary?
Thanks in advance!
Upvotes: 19
Views: 7861
Reputation: 71610
Alternatively, build the dictionary with a comprehension first, so that every key already maps to an empty list. Then iterate through the zip of the two lists and append each value from the second list to the entry for the matching key from the first. Because all keys exist beforehand, no try-except clause or if statement is needed to check whether a key is present:
# l holds the keys, l2 the values
d = {k: [] for k in l}
for x, y in zip(l, l2):
    d[x].append(y)
Now:
print(d)
Is:
{0: [24, 53, 88], 1: [32, 45, 24, 88, 53], 9999: [1]}
Upvotes: 0
Reputation: 402844
No fancy structures, just a plain ol' dictionary.
d = {}
for x, y in zip(a, b):
    d.setdefault(x, []).append(y)
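For readers unfamiliar with dict.setdefault, here is a minimal sketch of what the loop above does, run on the short sample lists used in the other answers (the 9999 entry is just sample data):

```python
a = [0, 0, 0, 1, 1, 1, 1, 1, 9999]
b = [24, 53, 88, 32, 45, 24, 88, 53, 1]

d = {}
for x, y in zip(a, b):
    # setdefault returns d[x] if the key exists; otherwise it stores
    # the given default ([]) under x and returns that new list.
    d.setdefault(x, []).append(y)

print(d)
```

One small trade-off: the [] argument is evaluated on every iteration, so a throwaway empty list is created even when the key already exists.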
Upvotes: 6
Reputation: 71461
You can use a defaultdict:
from collections import defaultdict
d = defaultdict(list)
list_a = [0, 0, 0, 1, 1, 1, 1, 1, 9999]
list_b = [24, 53, 88, 32, 45, 24, 88, 53, 1]
for a, b in zip(list_a, list_b):
    d[a].append(b)
print(dict(d))
Output:
{0: [24, 53, 88], 1: [32, 45, 24, 88, 53], 9999: [1]}
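One thing worth knowing about the defaultdict approach: merely reading a missing key inserts it. That is a reason the dict(d) conversion above can be a sensible habit. A small sketch with the same sample data:

```python
from collections import defaultdict

list_a = [0, 0, 0, 1, 1, 1, 1, 1, 9999]
list_b = [24, 53, 88, 32, 45, 24, 88, 53, 1]

d = defaultdict(list)
for a, b in zip(list_a, list_b):
    d[a].append(b)

# Snapshot as a plain dict before any stray lookups happen.
plain = dict(d)

# Looking up a key that was never added silently creates it in the defaultdict.
_ = d[42]
created_by_lookup = 42 in d       # True

# The plain dict is unaffected, and .get() never inserts anything.
still_missing = 42 not in plain   # True
lookup_result = plain.get(42)     # None
```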
Upvotes: 34
Reputation: 92874
An alternative itertools.groupby() solution:
import itertools
a = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3]
b = [24, 53, 88, 32, 45, 24, 88, 53, 11, 22, 33, 44, 55, 66, 77]
result = {k: [i[1] for i in g]
          for k, g in itertools.groupby(sorted(zip(a, b)), key=lambda x: x[0])}
print(result)
The output:
{0: [24, 53, 88], 1: [24, 32, 45, 53, 88], 2: [11, 22, 33, 44, 55, 66], 3: [77]}
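Note that sorting the (key, value) pairs also sorts the values inside each group, which is why the group for 1 above comes out as [24, 32, 45, 53, 88] rather than in input order. If the original order matters, one possible variant (a sketch, not the answerer's code) sorts on the key only. Python's sort is stable, so values with equal keys keep their relative order:

```python
from itertools import groupby
from operator import itemgetter

a = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3]
b = [24, 53, 88, 32, 45, 24, 88, 53, 11, 22, 33, 44, 55, 66, 77]

# Sort by the key alone so the values are never compared.
pairs = sorted(zip(a, b), key=itemgetter(0))

# groupby only merges consecutive items with equal keys, hence the sort.
result = {k: [v for _, v in g] for k, g in groupby(pairs, key=itemgetter(0))}
print(result)
```

If a is already sorted, as in the question, the sorted() call can be dropped and zip(a, b) passed to groupby directly.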
Upvotes: 14
Reputation: 3148
A pandas solution:
import numpy as np
import pandas as pd
a = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 4, 4]
b = np.random.randint(0, 100, len(a)).tolist()  # pd.np is deprecated; use numpy directly
>>> b
Out[]: [28, 68, 71, 25, 25, 79, 30, 50, 17, 1, 35, 23, 52, 87, 21]
df = pd.DataFrame(columns=['Group', 'Value'], data=list(zip(a, b))) # Create a dataframe
>>> df
Out[]:
    Group  Value
0       0     28
1       0     68
2       0     71
3       1     25
4       1     25
5       1     79
6       1     30
7       1     50
8       2     17
9       2      1
10      2     35
11      3     23
12      4     52
13      4     87
14      4     21
>>> df.groupby('Group').Value.apply(list).to_dict()
Out[]:
{0: [28, 68, 71],
1: [25, 25, 79, 30, 50],
2: [17, 1, 35],
3: [23],
4: [52, 87, 21]}
pd.DataFrame builds a DataFrame from the input lists; a is called Group and b is called Value
df.groupby('Group') creates groups based on a
.Value.apply(list) gets the values for each group and casts them to a list
.to_dict() converts the resulting Series to a dict
To get an idea of timings for a test set of 1,000,000 values in 100,000 groups:
import numpy as np
a = sorted(np.random.randint(0, 100000, 1000000).tolist())
b = np.random.randint(0, 100, len(a)).tolist()
df = pd.DataFrame(columns=['Group', 'Value'], data=list(zip(a, b)))
>>> df.shape
Out[]: (1000000, 2)
%timeit df.groupby('Group').Value.apply(list).to_dict()
4.13 s ± 9.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
But to be honest, it is likely less efficient than itertools.groupby suggested by @RomanPerekhrest, or defaultdict suggested by @Ajax1234.
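To compare the two standard-library approaches on your own machine, a rough sketch could look like this. The data sizes are scaled down from the test above, and the helper names with_defaultdict and with_groupby are made up for illustration. No timings are quoted here because they vary by machine and Python version.

```python
import random
import timeit
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

random.seed(0)
a = sorted(random.randrange(10_000) for _ in range(100_000))
b = [random.randrange(100) for _ in a]

def with_defaultdict(keys, values):
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    return dict(d)

def with_groupby(keys, values):
    # Assumes keys are already sorted, so no extra sort is needed.
    return {k: [v for _, v in g]
            for k, g in groupby(zip(keys, values), key=itemgetter(0))}

# Sanity check: both helpers must build the same dictionary.
assert with_defaultdict(a, b) == with_groupby(a, b)

for fn in (with_defaultdict, with_groupby):
    seconds = timeit.timeit(lambda: fn(a, b), number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per run")
```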
Upvotes: 3
Reputation: 12938
You can do this with a dict comprehension:
list_a = [0, 0, 0, 1, 1, 1, 1, 1]
list_b = [24, 53, 88, 32, 45, 24, 88, 53]
my_dict = {key: [] for key in set(list_a)} # my_dict = {0: [], 1: []}
for a, b in zip(list_a, list_b):
    my_dict[a].append(b)
# {0: [24, 53, 88], 1: [32, 45, 24, 88, 53]}
Note that this does not work with dict.fromkeys(set(list_a), []), because every key then refers to the same empty list:
my_dict = dict.fromkeys(set(list_a), []) # my_dict = {0: [], 1: []}
my_dict[0].append(1) # my_dict = {0: [1], 1: [1]}
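The reason is that dict.fromkeys evaluates its second argument once and stores that single object under every key. A small sketch that makes the sharing visible and contrasts it with a comprehension:

```python
list_a = [0, 0, 0, 1, 1, 1, 1, 1]

# fromkeys: one list object, referenced by every key.
shared = dict.fromkeys(set(list_a), [])
same_object = shared[0] is shared[1]

# Comprehension: the [] expression runs once per key, giving separate lists.
separate = {key: [] for key in set(list_a)}
distinct_objects = separate[0] is not separate[1]

shared[0].append(1)
separate[0].append(1)
print(shared)    # both keys show the appended value
print(separate)  # only key 0 changed
```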
Upvotes: 3
Reputation: 743
Maybe I am missing the point, but I will at least try to help. If you have two lists and want to put them in a dict, do the following:
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
lists = [a, b] # or directly -> lists = [ [1, 2, 3, 4], [5, 6, 7, 8] ]
new_dict = {}
for idx, sublist in enumerate([a, b]): # or enumerate(lists)
    new_dict[idx] = sublist
Hope it helps.
Upvotes: 2