BigD

Reputation: 581

How to merge two lists into dictionary without using nested for loop

I have two lists:

a = [0, 0, 0, 1, 1, 1, 1, 1, .... 99999]
b = [24, 53, 88, 32, 45, 24, 88, 53, ...... 1]

I want to merge those two lists into a dictionary like:

{
    0: [24, 53, 88], 
    1: [32, 45, 24, 88, 53], 
    ...... 
    99999: [1]
}

One solution is a plain for loop, which does not look very elegant:

d = {}
for i in range(len(a)):
    if a[i] in d:
        d[a[i]].append(b[i])
    else:
        d[a[i]] = [b[i]]

Though this does work, it is inefficient and would take too much time when the lists are extremely large. Is there a more elegant way to construct such a dictionary?

Thanks in advance!

Upvotes: 19

Views: 7861

Answers (7)

U13-Forward

Reputation: 71610

Or build the dictionary with a comprehension first, so that every key already exists with an empty list as its value. Then iterate through the zip of the two lists and append each value from the second list under the matching key from the first. Because the keys are created up front, there is no need for a try-except clause or an if statement to check whether a key exists:

d = {k: [] for k in l}
for x, y in zip(l, l2):
    d[x].append(y)

Now:

print(d)

Is:

{0: [24, 53, 88], 1: [32, 45, 24, 88, 53], 9999: [1]}

Upvotes: 0

cs95

Reputation: 402844

No fancy structures, just a plain ol' dictionary.

d = {}
for x, y in zip(a, b):
    d.setdefault(x, []).append(y)
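For reference, here is a minimal sketch that wraps the same loop in a helper and runs it on the sample data from the question. The name group_pairs is only illustrative and not part of the original answer.

```python
def group_pairs(keys, values):
    """Group values by their matching key, preserving input order."""
    grouped = {}
    for key, value in zip(keys, values):
        # setdefault returns the existing list for key,
        # or stores a new empty list and returns that
        grouped.setdefault(key, []).append(value)
    return grouped


a = [0, 0, 0, 1, 1, 1, 1, 1]
b = [24, 53, 88, 32, 45, 24, 88, 53]
result = group_pairs(a, b)
print(result)  # {0: [24, 53, 88], 1: [32, 45, 24, 88, 53]}
```

This makes a single pass over the data, so it should scale linearly with the length of the lists.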

Upvotes: 6

Ajax1234

Reputation: 71461

You can use a defaultdict:

from collections import defaultdict
d = defaultdict(list)
list_a = [0, 0, 0, 1, 1, 1, 1, 1, 9999]
list_b = [24, 53, 88, 32, 45, 24, 88, 53, 1]
for a, b in zip(list_a, list_b):
   d[a].append(b)

print(dict(d))

Output:

{0: [24, 53, 88], 1: [32, 45, 24, 88, 53], 9999: [1]}

Upvotes: 34

RomanPerekhrest

Reputation: 92874

Alternative itertools.groupby() solution:

import itertools

a = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3]
b = [24, 53, 88, 32, 45, 24, 88, 53, 11, 22, 33, 44, 55, 66, 77]

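# sort by key only: the sort is stable, so values keep their original order within each group
pairs = sorted(zip(a, b), key=lambda x: x[0])
result = {k: [v for _, v in g]
          for k, g in itertools.groupby(pairs, key=lambda x: x[0])}
print(result)

The output:

{0: [24, 53, 88], 1: [32, 45, 24, 88, 53], 2: [11, 22, 33, 44, 55, 66], 3: [77]}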
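A note on why sorting matters here: itertools.groupby only merges consecutive items that share a key. The sketch below, on small made-up data, shows what happens when the keys are not sorted, and how a stable sort on the key alone fixes it.

```python
from itertools import groupby
from operator import itemgetter

a = [1, 0, 1, 0]
b = [10, 20, 30, 40]

# Without sorting, every run of equal keys becomes its own group,
# and later groups overwrite earlier ones in the dict comprehension.
unsorted_result = {k: [v for _, v in g]
                   for k, g in groupby(zip(a, b), key=itemgetter(0))}
print(unsorted_result)  # {1: [30], 0: [40]}

# Sorting on the key alone brings equal keys together; because the sort
# is stable, values keep their original relative order.
pairs = sorted(zip(a, b), key=itemgetter(0))
sorted_result = {k: [v for _, v in g]
                 for k, g in groupby(pairs, key=itemgetter(0))}
print(sorted_result)  # {0: [20, 40], 1: [10, 30]}
```

If the first list is already sorted, as in the question, the sort can be skipped.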

Upvotes: 14

FabienP

Reputation: 3148

A pandas solution:

Setup:

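import numpy as np
import pandas as pd

a = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 4, 4]

# pd.np is deprecated and missing from recent pandas versions, so use numpy directly
b = np.random.randint(0, 100, len(a)).tolist()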

>>> b
Out[]: [28, 68, 71, 25, 25, 79, 30, 50, 17, 1, 35, 23, 52, 87, 21]


df = pd.DataFrame(columns=['Group', 'Value'], data=list(zip(a, b)))  # Create a dataframe

>>> df
Out[]:
    Group  Value
0       0     28
1       0     68
2       0     71
3       1     25
4       1     25
5       1     79
6       1     30
7       1     50
8       2     17
9       2      1
10      2     35
11      3     23
12      4     52
13      4     87
14      4     21

Solution:

>>> df.groupby('Group').Value.apply(list).to_dict()
Out[]:
{0: [28, 68, 71],
 1: [25, 25, 79, 30, 50],
 2: [17, 1, 35],
 3: [23],
 4: [52, 87, 21]}

Walkthrough:

  1. pd.DataFrame(...) creates a DataFrame from the input lists, with a as the Group column and b as the Value column
  2. df.groupby('Group') creates groups based on a
  3. .Value.apply(list) takes the values of each group and casts them to a list
  4. .to_dict() converts the resulting Series to a dict

Timing:

To get an idea of timings for a test set of 1,000,000 values in 100,000 groups:

import numpy as np

a = sorted(np.random.randint(0, 100000, 1000000).tolist())
b = np.random.randint(0, 100, len(a)).tolist()
df = pd.DataFrame(columns=['Group', 'Value'], data=list(zip(a, b)))

>>> df.shape
Out[]: (1000000, 2)

%timeit df.groupby('Group').Value.apply(list).to_dict()
4.13 s ± 9.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

But to be honest, it is likely less efficient than the itertools.groupby solution suggested by @RomanPerekhrest or the defaultdict solution suggested by @Ajax1234.
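To check that guess on your own machine, a rough timing sketch for the two pure-Python approaches could look like the one below. The data size, the seed, and the function names with_defaultdict and with_groupby are my own assumptions for illustration. No timings are quoted because they depend on the hardware and the Python version.

```python
import random
import timeit
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

random.seed(0)
n = 100_000  # smaller than the 1,000,000 rows above, to keep the run short
a = sorted(random.randrange(10_000) for _ in range(n))
b = [random.randrange(100) for _ in range(n)]


def with_defaultdict(keys, values):
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    return dict(d)


def with_groupby(keys, values):
    # keys are already sorted here, so no extra sort is needed
    return {k: [v for _, v in g]
            for k, g in groupby(zip(keys, values), key=itemgetter(0))}


for fn in (with_defaultdict, with_groupby):
    seconds = timeit.timeit(lambda: fn(a, b), number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per call")
```

Both functions return the same dictionary for sorted keys, so the comparison is like for like.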

Upvotes: 3

Engineero

Reputation: 12938

You can do this with a dict comprehension:

list_a = [0, 0, 0, 1, 1, 1, 1, 1]
list_b = [24, 53, 88, 32, 45, 24, 88, 53]
my_dict = {key: [] for key in set(list_a)}  # my_dict = {0: [], 1: []}
for a, b in zip(list_a, list_b):
    my_dict[a].append(b)
# {0: [24, 53, 88], 1: [32, 45, 24, 88, 53]}

Note that this does not work with dict.fromkeys(set(list_a), []), because fromkeys assigns the same list object to every key, so appending through one key changes the value seen under all of them:

my_dict = dict.fromkeys(set(list_a), [])  # my_dict = {0: [], 1: []}
my_dict[0].append(1)  # my_dict = {0: [1], 1: [1]}
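To see the sharing directly, here is a small check using the is operator. The variable names shared and separate are just for illustration.

```python
keys = [0, 1]

# fromkeys evaluates the default once and reuses that one object
shared = dict.fromkeys(keys, [])
print(shared[0] is shared[1])  # True

# a comprehension evaluates [] once per key, giving independent lists
separate = {key: [] for key in keys}
print(separate[0] is separate[1])  # False

separate[0].append(1)
print(separate)  # {0: [1], 1: []}
```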

Upvotes: 3

Giorgi Jambazishvili

Reputation: 743

Maybe I am missing the point, but at least I will try to help. If you have two lists and want to put them in a dict, do the following:

a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
lists = [a, b] # or directly -> lists = [ [1, 2, 3, 4], [5, 6, 7, 8] ]
new_dict = {}
for idx, sublist in enumerate([a, b]): # or enumerate(lists)
    new_dict[idx] = sublist

Hope it helps.

Upvotes: 2
