datatech

Reputation: 147

Converting multidimensional array into arrays in list for every row

I have a data set like this:

import numpy as np

data = np.array([[ 5, 31, 61],
                 [ 10, 31, 67],
                 [ 15, 31, 69],
                 [ 4, 31, 72],
                 [ 14, 31, 73],
                 [ 21, 31, 77],
                 [ 19, 31, 78]])

I want to convert it into a list of arrays, one array per row. I tried:

np.split(data,len(data))

#[array([[ 5, 31, 61]]),
# array([[10, 31, 67]]),
# array([[15, 31, 69]]),
# array([[ 4, 31, 72]]),
# array([[14, 31, 73]]),
# array([[21, 31, 77]]),
# array([[19, 31, 78]])]

But as you can see, each element is still two-dimensional (note the double [[). What I want is simply:

[np.array([5, 31, 61]),
np.array([10, 31, 67]),
np.array([15, 31, 69]),
np.array([4, 31, 72]),
np.array([14, 31, 73]),
np.array([21, 31, 77]),
np.array([19, 31, 78])]

Upvotes: 2

Views: 81

Answers (2)

mathfux

Reputation: 5949

np.split can be used too, but it has to be applied to one-dimensional data. So you might want to create a one-dimensional view of your data first:

%%timeit
data_ravel = data.ravel()
out = np.split(data_ravel, len(data))
>>> 14.5 µs ± 337 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Note that creating the view is practically free (it took 0.13 µs on my computer).
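As a quick sanity check of the two points above, here is a minimal sketch, assuming `data` is the C-contiguous array from the question (for a non-contiguous array `ravel` would have to copy). It uses `np.shares_memory` to confirm that no copy was made and inspects the shape of the pieces returned by `np.split`:

```python
import numpy as np

data = np.array([[5, 31, 61],
                 [10, 31, 67],
                 [15, 31, 69],
                 [4, 31, 72],
                 [14, 31, 73],
                 [21, 31, 77],
                 [19, 31, 78]])

# ravel() returns a view when the array is contiguous, a copy otherwise
data_ravel = data.ravel()
is_view = np.shares_memory(data, data_ravel)

# Splitting the flat view into len(data) equal parts yields one 1-D array per row
out = np.split(data_ravel, len(data))
shapes = {piece.shape for piece in out}

print(is_view, shapes)
```

Each piece has a single pair of brackets, which is what the question asks for.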

Under the hood, this amounts to roughly the following slicing:

# data_ravel is the one-dimensional view created above
div_points = range(0, data.size + 1, data.shape[1])  # row boundaries: 0, 3, 6, ...
start = div_points[:-1]
end = div_points[1:]
out = list(data_ravel[i:j] for i, j in zip(start, end))
>>> 2.31 µs ± 44.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Note that it's faster because I'm doing a couple of optimisations here:

  • using range instead of np.array for the division points
  • building the list from a generator expression instead of calling list.append in a loop
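To check that the hand-written slicing really produces the same pieces as `np.split`, something like the following can be run. This is only a sketch: the stand-in array built with `np.arange` and the names `manual`, `via_split` and `same` are my own, chosen to match the shape of the data in the question.

```python
import numpy as np

# Stand-in array with the same shape as the data in the question (7 rows, 3 columns)
data = np.arange(21).reshape(7, 3)
data_ravel = data.ravel()

# Row boundaries in the flat view: 0, 3, 6, ..., 21
n_cols = data.shape[1]
div_points = range(0, data.size + 1, n_cols)

# Manual slicing between consecutive boundaries
manual = [data_ravel[i:j] for i, j in zip(div_points[:-1], div_points[1:])]

# Reference result from np.split
via_split = np.split(data_ravel, len(data))

same = all(np.array_equal(a, b) for a, b in zip(manual, via_split))
print(same)
```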

However, it can't compete with the classical methods from @mozway's answer. They are the fastest of the options I timed:

%%timeit
out = [*data]
>>> 902 ns ± 8.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%%timeit
out = list(data)
>>> 979 ns ± 12.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%%timeit
out = [n for n in data]
>>> 1.04 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%%timeit
out = list(n for n in data)
>>> 1.37 µs ± 80.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Upvotes: 1

mozway

Reputation: 262284

What about taking advantage of unpacking?

lst = [*data]

or:

lst = list(data)

output:

[array([ 5, 31, 61]),
 array([10, 31, 67]),
 array([15, 31, 69]),
 array([ 4, 31, 72]),
 array([14, 31, 73]),
 array([21, 31, 77]),
 array([19, 31, 78])]
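One thing that may be worth knowing: as far as I can tell, the rows obtained this way are views into `data`, not copies, so writing to one of them also changes the original array. A small sketch with a shortened version of the data (the name `independent` is just for illustration) shows the effect and how `copy()` avoids it:

```python
import numpy as np

data = np.array([[5, 31, 61],
                 [10, 31, 67]])

lst = list(data)
row_shape = lst[0].shape                    # each element is 1-D
shares = np.shares_memory(lst[0], data)     # the row is a view into data

lst[0][0] = -1                              # writing to the row ...
changed = data[0, 0]                        # ... also changes data

# If independent rows are needed, copy each one explicitly
independent = [row.copy() for row in data]
is_independent = not np.shares_memory(independent[0], data)

print(row_shape, shares, changed, is_independent)
```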

Upvotes: 1
