Kali_89

Reputation: 617

Nested list comprehension in Python

I've got a list comprehension I'm trying to get my head around, but I just can't seem to get what I'm after, so I thought I'd see if anybody else knew how!

My basic data structure is this:

structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]

So I've got an overall list containing sublists of numpy arrays and my desired output is some sort of grouping (don't care if it's a list or an array) with the following elements paired:

[1, 13]
[4, 16]
[2, 14]
[5, 17]
[3, 15]
[6, 18]

I thought I'd got it with the following style construct:

output = [structure[i][0][j] for j in range(9) for i in range(len(structure))]

but alas, no joy.

I don't really mind if it needs more than one stage - just want to get those elements grouped together!

As a bit of background: I've got lists of probabilities output by various models, and within each model I've got a training list, a validation list and a test list:

[[model_1], [model_2], ..., [model_n]]

where [model_1] is [[training_set], [validation_set], [test_set]]

and [training_set] is np.array([[p_1, p_2, ..., p_n], [p_1, p_2, ..., p_n], ...])

I'd like to group together the prediction for item 1 for each of the models and create a training vector out of it of length equal to the number of models I've got. I'd then like to do the same but for the second row of [training_set].

If that doesn't make sense let me know!

Upvotes: 4

Views: 1107

Answers (4)

hpaulj

Reputation: 231385

Since all the arrays (and sublists) in structure are the same size you can turn it into one higher dimensional array:

In [189]: A=np.array(structure); A
Out[189]: 
array([[[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]],


       [[[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]]])

In [190]: A.shape
Out[190]: (2, 2, 2, 3)

Reshaping and swapaxes can give you all kinds of combinations.

For example, the values in your sample sublist can be selected with:

In [194]: A[:,0,:,:]
Out[194]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[13, 14, 15],
        [16, 17, 18]]])

and reshape to get

In [197]: A[:,0,:,:].reshape(2,6)
Out[197]: 
array([[ 1,  2,  3,  4,  5,  6],
       [13, 14, 15, 16, 17, 18]])

and transpose to get the 6 rows of pairs:

In [198]: A[:,0,:,:].reshape(2,6).T
Out[198]: 
array([[ 1, 13],
       [ 2, 14],
       [ 3, 15],
       [ 4, 16],
       [ 5, 17],
       [ 6, 18]])

To get them in the 1,4,2,5.. order I can transpose first

In [208]: A[:,0,:,:].T.reshape(6,2)
Out[208]: 
array([[ 1, 13],
       [ 4, 16],
       [ 2, 14],
       [ 5, 17],
       [ 3, 15],
       [ 6, 18]])
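
As a rough sketch, the steps above could be wrapped up so they work for any number of models. The helper name pair_across_models and its set_index and column_major parameters are my own invention for illustration, and it assumes every array has the same shape so that np.array can stack them:

```python
import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]


def pair_across_models(structure, set_index=0, column_major=True):
    """Hypothetical helper: one row per element, one column per model."""
    A = np.array(structure)        # shape (models, sets, rows, cols)
    block = A[:, set_index]        # shape (models, rows, cols)
    n_models = block.shape[0]
    if column_major:
        # .T reverses the axes to (cols, rows, models): order 1, 4, 2, 5, ...
        return block.T.reshape(-1, n_models)
    # Move the model axis last, giving (rows, cols, models): order 1, 2, 3, ...
    return np.moveaxis(block, 0, -1).reshape(-1, n_models)


pairs = pair_across_models(structure)
print(pairs.tolist())
# [[1, 13], [4, 16], [2, 14], [5, 17], [3, 15], [6, 18]]
```

Passing set_index=1 would pair up the second array of each model in the same way.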

Upvotes: 3

Kelvin17

Reputation: 69

I think this is what you want, in the format you have; it uses a generator expression:

import numpy as np
structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]
struc = structure

my_gen = ([struc[i][j][k][l], struc[i+1][j][k][l]] for i in range(len(struc)-1)
                                     for j in range(len(struc[i]))
                                     for k in range(len(struc[i][j]))
                                     for l in range(len(struc[i][j][k])))

# iterate over the generator directly; the loop stops by itself when it is
# exhausted, so no next() calls or bare except are needed
for val in my_gen:
    print(val)

Upvotes: 0

Padraic Cunningham

Reputation: 180441

Not sure exactly what full output you want but this may help:

import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

from itertools import chain

zipped = (zip(*ele) for ele in zip(*next(zip(*structure))))

print (list(chain.from_iterable(zip(*zipped))))
[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

OK, a breakdown of the witchcraft:

# transpose so that column 0 holds the first sub-array from each model
# (list() is needed on Python 3, where zip returns a lazy iterator)
In [4]: start = list(zip(*structure))

In [5]: start
Out[5]: 
[(array([[1, 2, 3],
         [4, 5, 6]]), array([[13, 14, 15],
         [16, 17, 18]])), (array([[ 7,  8,  9],
         [10, 11, 12]]), array([[19, 20, 21],
         [22, 23, 24]]))]

# our interesting sub-arrays, i.e. column 0
In [6]: first_col = start[0]

In [7]: first_col
Out[7]: 
(array([[1, 2, 3],
        [4, 5, 6]]), array([[13, 14, 15],
        [16, 17, 18]]))

# pair up corresponding rows of the sub-arrays
In [8]: interesting_pairs = list(zip(*first_col))

In [9]: interesting_pairs
Out[9]: 
[(array([1, 2, 3]), array([13, 14, 15])),
 (array([4, 5, 6]), array([16, 17, 18]))]

# pair them up (1, 13), (2, 14) ...
In [10]: create_final_pairings = [list(zip(*ele)) for ele in interesting_pairs]

In [11]: create_final_pairings
Out[11]: [[(1, 13), (2, 14), (3, 15)], [(4, 16), (5, 17), (6, 18)]]

Finally chain all into a single flat list and get the order correct:

In [13]: from itertools import chain
# create flat list
In [14]: flat_list = list(chain.from_iterable(zip(*create_final_pairings)))

In [15]: flat_list
Out[15]: [(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

A simple example of transposing with zip may help:

In [17]: l = [[1,2,3],[4,5,6]]

In [18]: list(zip(*l))
Out[18]: [(1, 4), (2, 5), (3, 6)]

In [19]: list(zip(*l))[0]
Out[19]: (1, 4)

In [20]: list(zip(*l))[1]
Out[20]: (2, 5)

In [21]: list(zip(*l))[2]
Out[21]: (3, 6)

For Python 2 you can use itertools.izip to keep things lazy:

from itertools import chain, izip


zipped = (izip(*ele) for ele in izip(*next(izip(*structure))))
print (list(chain.from_iterable(izip(*zipped))))

[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

Upvotes: 2

Brendan Long

Reputation: 54242

I had to write the non-list-comprehension version first to get my head around this:

new_training_vector = []
for m1, m2 in zip(structure[0], structure[1]):
    for t1, t2 in zip(m1, m2):
        for d1, d2 in zip(t1, t2):
            new_training_vector.append([d1, d2])

The way it works is by creating two parallel iterators (using zip), one for each model, then creating two parallel iterators for each of the training sets and so on until we get to the actual data and can just stick it together.

Once we have that, it's not hard to fold it into a list comprehension:

new_training_vector = [[d1, d2]
                       for m1, m2 in zip(structure[0], structure[1])
                       for t1, t2 in zip(m1, m2)
                       for d1, d2 in zip(t1, t2)]
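
If you end up with more than two models, the same idea should extend by unpacking with zip(*...) at each level instead of naming structure[0] and structure[1]. This is only a sketch: the loop variable names are mine, and the int() call is just there to turn NumPy scalars into plain Python ints:

```python
import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

new_training_vector = [
    [int(p) for p in points]
    for sets in zip(*structure)   # the same set taken from every model
    for rows in zip(*sets)        # the same row taken from every model
    for points in zip(*rows)      # the same element taken from every model
]

print(new_training_vector[:3])
# [[1, 13], [2, 14], [3, 15]]
```

Each inner list has one entry per model, so its length grows with the number of models without changing the comprehension.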

You can also do this with a dictionary, if that works better for some reason. On Python versions before 3.7 you would lose the order though, since dicts only guarantee insertion order from 3.7 onwards:

import collections
d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                d[(i, j, k)].append(point)

The trick to this one is that we just keep track of where we saw each point (except for at the model level), so they automatically go into the same dict item.
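
For instance, here is a minimal sketch of building that dictionary on the sample data and looking a couple of points up. The int() conversion and the particular keys printed are my own additions, just to keep the output plain:

```python
import collections

import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                # key = (set index, row index, column index); the model
                # index is deliberately left out so models share a key
                d[(i, j, k)].append(int(point))

print(d[(0, 0, 0)])  # [1, 13]
print(d[(0, 1, 0)])  # [4, 16]
```

Each value holds one prediction per model, in the order the models appear in structure.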

Upvotes: 2
