Reputation: 152
Suppose the following toyset (from a CSV file where column names are the "keys" and I'm only interested in some rows that I put in "data"):
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
I want to get a dictionary with a list for each column, like this:
{'k1': [1, 5, 9, 13], 'k2': [2, 6, 10, 14], 'k3': [3, 7, 11, 15], 'k4': [4, 8, 12, 16]}
In my code I first initialize the dictionary with empty lists and then iterate (in the order of the keys) to append each item in their list.
my_dict = dict.fromkeys(keys, [])
for row in data:
    for i, k in zip(row, keys):
        my_dict[k].append(i)
But it doesn't work. It builds this dictionary:
{'k3': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k2': [1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], 'k1': [1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16], 'k4': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16]}
You can see that all the elements are in all lists instead of just four elements in each list. If I print i, k in the loop, I get the correct pairs of items and keys, so I guess the problem is where I append item i to the list for key k.
Does anyone know why all elements are added to all lists and what would be the right way of building my dictionary?
Thanks in advance
Upvotes: 5
Views: 7696
Reputation: 309861
I think that dict(zip(keys, map(list, zip(*data)))) should do the trick.
First, I transpose your data (zip(*data)), but that returns tuples. Since you want lists, I use map to construct lists from the tuples. Then we use zip again to match keys with the lists, e.g. (key1, list1), (key2, list2), .... This is exactly what the dictionary constructor expects, so you're golden.
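To make the steps above concrete, here is a minimal sketch that splits the one-liner into its parts. The intermediate names (columns, columns_as_lists, result) are only illustrative and are not part of the original answer.

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# zip(*data) groups the i-th element of every row together,
# so it yields one tuple per column (a transpose).
columns = list(zip(*data))

# Each column is a tuple; convert to lists to match the requested output.
columns_as_lists = [list(col) for col in columns]

# Pair each key with its column and hand the pairs to dict().
result = dict(zip(keys, columns_as_lists))
```

Here result['k1'] holds the first column, result['k2'] the second, and so on.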
An alternate solution would be to use a collections.defaultdict:
import collections
d = collections.defaultdict(list)
tdata = zip(*data)  # transpose your data
for k, v in zip(keys, tdata):
    d[k].extend(v)
Of course, this leaves you with a defaultdict instead of a regular one, though it can be changed to a regular one trivially: d = dict(d) (this also works when the keys are not strings, unlike dict(**d)).
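A defaultdict can also be combined with the row-by-row loop from the question, since it creates a fresh list the first time each key is touched. The following is a sketch of that variant, not the answer's exact code; the name plain is made up for the example.

```python
import collections

keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# Every missing key gets its own new list on first access.
d = collections.defaultdict(list)
for row in data:
    for k, item in zip(keys, row):
        d[k].append(item)

# Convert back to a regular dict once the data is collected.
plain = dict(d)
```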
Upvotes: 4
Reputation: 64563
zip it but transpose it first:
>>> keys = ['k1', 'k2', 'k3', 'k4']
>>> data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
>>> print dict(zip(keys, zip(*data)))
{'k3': (3, 7, 11, 15), 'k2': (2, 6, 10, 14), 'k1': (1, 5, 9, 13), 'k4': (4, 8, 12, 16)}
If you want lists rather than tuples as the values:
>>> print dict(zip(keys, [list(i) for i in zip(*data)]))
And if you want to use your version, just use a dictionary comprehension instead of fromkeys:
my_dict = { k : [] for k in keys }
The problem in your case is that fromkeys initializes every key of my_dict with the same list object:
>>> my_dict = dict.fromkeys(keys, [])
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [1], 'k1': [1], 'k4': [1]}
When you do it right (here with a generator expression passed to dict; a dictionary comprehension behaves the same way), each key gets its own list:
>>> my_dict = dict((k, []) for k in keys )
>>> my_dict
{'k3': [], 'k2': [], 'k1': [], 'k4': []}
>>> my_dict['k3'].append(1)
>>> my_dict
{'k3': [1], 'k2': [], 'k1': [], 'k4': []}
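The difference between the two initializations can be checked directly with the is operator, which tests object identity. This is a small sketch; the names shared and separate are chosen for the example only.

```python
keys = ['k1', 'k2', 'k3', 'k4']

# fromkeys evaluates [] once and stores that single list under every key.
shared = dict.fromkeys(keys, [])

# The comprehension evaluates [] once per key, giving four distinct lists.
separate = {k: [] for k in keys}

shared_is_same = shared['k1'] is shared['k2']        # True: one list, four names
separate_is_same = separate['k1'] is separate['k2']  # False: independent lists
```

Because all values in shared are the same object, appending through any key changes what every other key shows.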
Upvotes: 9
Reputation: 133524
>>> keys = ['k1', 'k2', 'k3', 'k4']
>>> data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
>>> dict(zip(keys, zip(*data)))
{'k3': (3, 7, 11, 15), 'k2': (2, 6, 10, 14), 'k1': (1, 5, 9, 13), 'k4': (4, 8, 12, 16)}
If you really need lists:
>>> dict(zip(keys, map(list, zip(*data))))
{'k3': [3, 7, 11, 15], 'k2': [2, 6, 10, 14], 'k1': [1, 5, 9, 13], 'k4': [4, 8, 12, 16]}
If you are using Python 2, zip and map return lists. If you are working with a large data set you can use itertools.izip and itertools.imap to be more efficient and avoid creating the intermediary lists.
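For readers on Python 3, the built-in zip and map already return lazy iterators, so the itertools variants are not needed there (and itertools.izip and itertools.imap no longer exist). A short sketch, assuming Python 3:

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# In Python 3 this is a lazy zip object, not a list.
columns = zip(*data)
columns_is_list = isinstance(columns, list)

# map is lazy too; dict() consumes the pairs one at a time.
result = dict(zip(keys, map(list, columns)))
```

Note that zip(*data) still unpacks every row as an argument, so the rows themselves must already be in memory.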
Upvotes: 0
Reputation: 8958
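That should work:
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mydict = {}
i = 0  # column index of the current key
for k in keys:
    mydict[k] = []  # a fresh list for every key
    for row in data:
        mydict[k].append(row[i])
    i += 1
Note that list.index() is an expensive function because it scans the list on every call. Do not use it to look up the column position when you have a huge data set; increment a variable instead, as above.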
edit: no it doesn't! sorry, just a moment
edit: now it works!
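The same column-by-column idea can be written more compactly with enumerate, which supplies the running index without a manual counter. This is one possible sketch, not the answerer's code:

```python
keys = ['k1', 'k2', 'k3', 'k4']
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

# enumerate yields (index, key) pairs; the inner comprehension
# collects element i of every row into a new list for that key.
mydict = {k: [row[i] for row in data] for i, k in enumerate(keys)}
```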
Upvotes: 0
Reputation: 601519
You are running into the issue explained in this answer: Your dictionary is initialised with the same list object reused for all values. Simply use
dict(zip(keys, zip(*data)))
instead. This will transpose the list of rows into a list of columns (as tuples), and then zip the keys and columns together.
Upvotes: 7