Amelio Vazquez-Reina

Reputation: 96398

Turn dictionary of lists into a 1D numpy array

I have a Python 3 dictionary holding very long lists (30 million integers each). I would like to stitch all these lists into a single numpy array. How can I do this efficiently?

The following

np.array(my_dict.values())

doesn't seem to work (I get array(dict_values([[...], [....])) as opposed to a flat 1D numpy array).

Upvotes: 5

Views: 23285

Answers (5)

plhn

Reputation: 5273

Try this:

list(my_dict.values())


(commentary added)

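In Python 3, dict.values() returns a view object, not a list, so np.array() wraps the whole view as a single object instead of iterating over it. Converting the view with list() first gives NumPy a real sequence to work with.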
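To make the difference concrete, here is a small sketch. The dictionary contents are made up for illustration. Note that for equal-length lists the list() version produces a 2D array, so a final ravel() is still needed to get the flat 1D result the question asks for.

```python
import numpy as np

my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}  # hypothetical sample data

# Passing the view directly: NumPy stores the view itself as one object
wrapped = np.array(my_dict.values())

# Converting to a list first: NumPy sees a list of equal-length lists
as_2d = np.array(list(my_dict.values()))

# Flatten the 2D result into a 1D array
flat = as_2d.ravel()
```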

Upvotes: 6

Padraic Cunningham

Reputation: 180482

from itertools import chain
import numpy as np
chn = chain.from_iterable(d.values())
np.array(list(chn))
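With lists of 30 million integers each, list(chn) builds one more very large Python list before NumPy copies it. A possible variation, not part of the original answer, is to feed the same chain to np.fromiter, which fills the array straight from the iterator. The sample dictionary and the int64 dtype are assumptions here; passing count is optional but lets NumPy allocate the output once.

```python
from itertools import chain
import numpy as np

d = {'a': [1, 2, 3], 'b': [4, 5], 'c': [6]}  # hypothetical sample data

# Total number of elements, so the output can be allocated up front
total = sum(len(v) for v in d.values())

# Consume the chained iterator directly, without an intermediate list
flat = np.fromiter(chain.from_iterable(d.values()), dtype=np.int64, count=total)
```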

Upvotes: 4

Saullo G. P. Castro

Reputation: 58985

In order to get an ordered concatenation based on the keys:

np.array([d[k] for k in sorted(d.keys())]).flatten()

If you don't need any order based on the keys, @Padraic Cunningham's approach was the fastest in my timings.
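The np.array(...).flatten() form relies on every list having the same length. If the lists may differ in length, a key-ordered variant can be sketched with np.concatenate instead (the sample data below is invented for the example):

```python
import numpy as np

d = {'b': [4, 5], 'a': [1, 2, 3], 'c': [6]}  # hypothetical sample data

# Concatenate the lists in sorted key order; lengths may differ
flat = np.concatenate([d[k] for k in sorted(d)])
```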

Upvotes: 1

Alex Riley

Reputation: 176978

If you're looking for a flat 1d array, you could just use np.concatenate:

>>> d = {'a': [1, 2, 3, 4, 5], 'b': [1, 2, 3, 4, 5], 'c': [1, 2, 3, 4, 5]}
>>> np.concatenate(list(d.values()))
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5])

Upvotes: 9

Mike

Reputation: 7203

Allocate numpy arrays ahead of time:

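import numpy
my_dict = {0:[0,3,2,1], 1:[4,2,1,3], 2:[3,4,2,1]}
# dict views are not indexable in Python 3, so take the first list via an iterator
arr = numpy.ndarray((len(my_dict), len(next(iter(my_dict.values())))))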

then you can insert them into the array like so:

for index, val in enumerate(my_dict.values()):
    arr[index] = val
>>> arr
array([[ 0.,  3.,  2.,  1.],
       [ 4.,  2.,  1.,  3.],
       [ 3.,  4.,  2.,  1.]])
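The result above is 2D and assumes equal-length lists. The same preallocation idea can be adapted to a flat 1D array by filling consecutive slices. This is a sketch rather than part of the original answer; the names flat and pos and the int64 dtype are assumptions.

```python
import numpy as np

my_dict = {0: [0, 3, 2, 1], 1: [4, 2, 1, 3], 2: [3, 4, 2, 1]}

# Allocate the full 1D output once
total = sum(len(v) for v in my_dict.values())
flat = np.empty(total, dtype=np.int64)

# Copy each list into its slice of the output
pos = 0
for values in my_dict.values():
    n = len(values)
    flat[pos:pos + n] = values
    pos += n
```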

Upvotes: 1
