user2308900

Reputation: 21

How to make a dictionary from multiple lists?

I have a number of lists that correspond to each other like this:

ID_number = [1, 2, 3, 4, 5, 6, ...]
x_pos = [43.2, 53.21, 34.2, ...]
y_pos = [32.1, 42.1, 8.2, ...]
z_pos = [1.3, 67.1, 24.3, ...]

etc.

I want to be able to sort, pull, and perform operations on the data based on the ID_number, so I want to make a dictionary from these lists like this:

dictionary = {'id1':[x_pos1, y_pos1, z_pos1], 'id2':[x_pos2, y_pos2, z_pos2], ...}

where the key is the ID number and the value is a list containing the corresponding data for that ID number. How would I go about doing this efficiently in Python?

Upvotes: 2

Views: 151

Answers (3)

Bakuriu

Reputation: 101989

Use zip twice:

>>> ids = [1,2,3,4]
>>> x_pos = [1.32, 2.34, 5.56, 8.79]
>>> y_pos = [1.2, 2.3, 3.4, 4.5]
>>> z_pos = [3.33, 2.22, 10.98, 10.1]
>>> dict(zip(ids, zip(x_pos, y_pos, z_pos)))
{1: (1.32, 1.2, 3.33), 2: (2.34, 2.3, 2.22), 3: (5.56, 3.4, 10.98), 4: (8.79, 4.5, 10.1)}

Timing comparison with the genexp:

>>> import timeit
>>> timeit.timeit('dict(zip(ids, zip(x_pos, y_pos, z_pos)))', 'from __main__ import ids, x_pos, y_pos, z_pos')
1.6184730529785156
>>> timeit.timeit('dict((x[0], x[1:]) for x in zip(ids, x_pos, y_pos, z_pos))', 'from __main__ import ids, x_pos, y_pos, z_pos')
2.5186140537261963

So, using zip twice is about 1.5 times faster than using the generator expression. The exact results depend on the size of the iterables, but I'm quite confident that the double zip, at least on CPython 2, will always be faster than explicit loops. Generator expressions and for loops require much more work from the interpreter than a single call to zip, which removes some overhead from the iteration process.

Using itertools.izip instead of zip doesn't change the timings much, but it is a lot more memory-efficient for big data sets.
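
For readers on Python 3, here is a rough sketch of how the same comparison could be rerun. It assumes Python 3, where passing a callable to timeit.timeit avoids the `from __main__ import` setup string; the function names double_zip and genexp are made up for illustration, and no timings are quoted because they vary by machine and interpreter version.

```python
import timeit

ids = [1, 2, 3, 4]
x_pos = [1.32, 2.34, 5.56, 8.79]
y_pos = [1.2, 2.3, 3.4, 4.5]
z_pos = [3.33, 2.22, 10.98, 10.1]

def double_zip():
    # Inner zip groups the coordinates; outer zip pairs each group with its id.
    return dict(zip(ids, zip(x_pos, y_pos, z_pos)))

def genexp():
    # One zip over everything, then split each row into key and value.
    return dict((x[0], x[1:]) for x in zip(ids, x_pos, y_pos, z_pos))

# Both approaches should build the same mapping.
assert double_zip() == genexp()

# number is kept small here; raise it for steadier measurements.
t_zip = timeit.timeit(double_zip, number=100000)
t_gen = timeit.timeit(genexp, number=100000)
print("double zip: %.3f s, genexp: %.3f s" % (t_zip, t_gen))
```

The equality check matters more than the numbers: it confirms the two forms are interchangeable before their speed is compared.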

Upvotes: 4

Elmar Peise

Reputation: 15463

dictionary = {'id' + str(i): [x, y, z]
              for i, x, y, z in zip(ID_number, x_pos, y_pos, z_pos)}

For large data sets this is probably faster with itertools.izip().
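
A minimal sketch of what the resulting dictionary looks like in use. The three-element sample lists are an assumption (the question's lists are elided with "..."), and the name point is only illustrative.

```python
ID_number = [1, 2, 3]
x_pos = [43.2, 53.21, 34.2]
y_pos = [32.1, 42.1, 8.2]
z_pos = [1.3, 67.1, 24.3]

dictionary = {'id' + str(i): [x, y, z]
              for i, x, y, z in zip(ID_number, x_pos, y_pos, z_pos)}

# Pull one record by its string key.
point = dictionary['id2']

# The values are lists, so a single coordinate can be updated in place.
dictionary['id3'][2] = 0.0
```

If sorting by ID matters, keeping the integer IDs as keys (as the other answers do) avoids lexicographic ordering, where 'id10' sorts before 'id2'.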

Upvotes: 0

cdhowie

Reputation: 169143

zip() is quite useful to accomplish this. For example:

>>> ID_number = [1,2,3]
>>> x_pos = [43.2, 53.21, 34.2]
>>> y_pos = [32.1, 42.1, 8.2]
>>> z_pos = [1.3, 67.1, 24.3]
>>> dict((x[0], x[1:]) for x in zip(ID_number, x_pos, y_pos, z_pos))
{1: (43.200000000000003, 32.100000000000001, 1.3), 2: (53.210000000000001, 42.100000000000001, 67.099999999999994), 3: (34.200000000000003, 8.1999999999999993, 24.300000000000001)}

If the data set is quite large, you can avoid zip()'s creation of an entirely new copy of the whole data set by using itertools.izip() instead. This function will return an iterator that will provide each zipped element when requested instead of holding the whole new structure in memory. (The result will be the same, but it should be faster on larger data sets.)

>>> import itertools
>>> dict((x[0], x[1:]) for x in itertools.izip(ID_number, x_pos, y_pos, z_pos))
{1: (43.200000000000003, 32.100000000000001, 1.3), 2: (53.210000000000001, 42.100000000000001, 67.099999999999994), 3: (34.200000000000003, 8.1999999999999993, 24.300000000000001)}
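
Note that itertools.izip() exists only in Python 2. In Python 3 the built-in zip() already returns a lazy iterator, so no import is needed. The following is a small sketch under that assumption; the names rows, is_lazy, by_id and leftover are illustrative only.

```python
from collections.abc import Iterator

ID_number = [1, 2, 3]
x_pos = [43.2, 53.21, 34.2]
y_pos = [32.1, 42.1, 8.2]
z_pos = [1.3, 67.1, 24.3]

rows = zip(ID_number, x_pos, y_pos, z_pos)

# In Python 3 this is a lazy iterator, not a list held in memory.
is_lazy = isinstance(rows, Iterator)

# Consume the iterator once to build the mapping.
by_id = {row[0]: row[1:] for row in rows}

# A consumed iterator yields nothing more, so it cannot be reused.
leftover = list(rows)
```

Because the iterator is exhausted after one pass, call zip() again if the rows are needed a second time.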

Upvotes: 2
