Reputation: 617
I've got a list comprehension I'm trying to get my head around, but I can't get the result I'm after, so I thought I'd see if anybody else knows how!
My basic data structure is this:
structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]
So I've got an overall list containing sublists of numpy arrays and my desired output is some sort of grouping (don't care if it's a list or an array) with the following elements paired:
[1, 13]
[4, 16]
[2, 14]
[5, 17]
[3, 15]
[6, 18]
I thought I'd got it with the following style construct:
output = [structure[i][0][j] for j in range(9) for i in range(len(structure))]
but alas, no joy.
I don't really mind if it needs more than one stage - just want to get those elements grouped together!
As a bit of background: I've got lists of probabilities output by various models, and within each of those models I've got a training list and a validation list:
[[model_1], [model_2], ..., [model_n]]
where [model_1] is [[training_set], [validation_set], [test_set]]
and [training_set] is np.array([[p_1, p_2, ..., p_n], [p_1, p_2, ..., p_n], ...])
I'd like to group together the prediction for item 1 for each of the models and create a training vector out of it of length equal to the number of models I've got. I'd then like to do the same but for the second row of [training_set].
If that doesn't make sense let me know!
Upvotes: 4
Views: 1107
Reputation: 231385
Since all the arrays (and sublists) in structure are the same size, you can turn it into one higher-dimensional array:
In [189]: A = np.array(structure); A
Out[189]:
array([[[[ 1, 2, 3],
[ 4, 5, 6]],
[[ 7, 8, 9],
[10, 11, 12]]],
[[[13, 14, 15],
[16, 17, 18]],
[[19, 20, 21],
[22, 23, 24]]]])
In [190]: A.shape
Out[190]: (2, 2, 2, 3)
Reshaping and swapaxes can give you all kinds of combinations.
For example, the values in your sample sublist can be selected with:
In [194]: A[:,0,:,:]
Out[194]:
array([[[ 1, 2, 3],
[ 4, 5, 6]],
[[13, 14, 15],
[16, 17, 18]]])
and reshape to get
In [197]: A[:,0,:,:].reshape(2,6)
Out[197]:
array([[ 1, 2, 3, 4, 5, 6],
[13, 14, 15, 16, 17, 18]])
and transpose to get the 6 rows of pairs:
In [198]: A[:,0,:,:].reshape(2,6).T
Out[198]:
array([[ 1, 13],
[ 2, 14],
[ 3, 15],
[ 4, 16],
[ 5, 17],
[ 6, 18]])
To get them in the 1,4,2,5.. order I can transpose first:
In [208]: A[:,0,:,:].T.reshape(6,2)
Out[208]:
array([[ 1, 13],
[ 4, 16],
[ 2, 14],
[ 5, 17],
[ 3, 15],
[ 6, 18]])
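The steps above can be wrapped into one small function. This is only a sketch: pair_across_models and its set_index parameter are names made up for illustration, and it assumes every array in structure has the same shape so that np.array can stack them.

```python
import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

def pair_across_models(structure, set_index=0):
    """Hypothetical helper: one row per item, one column per model."""
    A = np.array(structure)        # shape: (models, sets, rows, cols)
    sub = A[:, set_index]          # shape: (models, rows, cols)
    n_models = sub.shape[0]
    # .T reverses the axes to (cols, rows, models); flattening the first two
    # axes then walks down each column before moving to the next one.
    return sub.T.reshape(-1, n_models)

pairs = pair_across_models(structure)
print(pairs.tolist())
# [[1, 13], [4, 16], [2, 14], [5, 17], [3, 15], [6, 18]]
```

Passing set_index=1 would do the same for the second array of each model.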
Upvotes: 3
Reputation: 69
I think this is what you want, in the format you have. It uses a generator:
import numpy as np
structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]
struc = structure
my_gen = ([struc[i][j][k][l], struc[i+1][j][k][l]]
          for i in range(len(struc)-1)
          for j in range(len(struc[i]))
          for k in range(len(struc[i][j]))
          for l in range(len(struc[i][j][k])))
# Iterate the generator directly; the loop ends when it is exhausted.
for val in my_gen:
    print(val)
Upvotes: 0
Reputation: 180441
Not sure exactly what full output you want but this may help:
import numpy as np
structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
[np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]
from itertools import chain
zipped = (zip(*ele) for ele in zip(*next(zip(*structure))))
print (list(chain.from_iterable(zip(*zipped))))
[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]
Ok a breakdown of the witchcraft:
# transpose sub arrays so column 0 is the first two sub elements from
# each sub array
In [4]: start = list(zip(*structure))
In [5]: start
Out[5]:
[(array([[1, 2, 3],
[4, 5, 6]]), array([[13, 14, 15],
[16, 17, 18]])), (array([[ 7, 8, 9],
[10, 11, 12]]), array([[19, 20, 21],
[22, 23, 24]]))]
# our interesting sub-arrays, i.e. column 0
In [6]: first_col = start[0]
In [7]: first_col
Out[7]:
(array([[1, 2, 3],
[4, 5, 6]]), array([[13, 14, 15],
[16, 17, 18]]))
# pair up corresponding sub-arrays
In [8]: interesting_pairs = list(zip(*first_col))
In [9]: interesting_pairs
Out[9]:
[(array([1, 2, 3]), array([13, 14, 15])),
(array([4, 5, 6]), array([16, 17, 18]))]
# pair them up (1, 13), (2, 14) ...
In [10]: create_final_pairings = [list(zip(*ele)) for ele in interesting_pairs]
In [11]: create_final_pairings
Out[11]: [[(1, 13), (2, 14), (3, 15)], [(4, 16), (5, 17), (6, 18)]]
Finally chain all into a single flat list and get the order correct:
In [13]: from itertools import chain
# create flat list
In [14]: flat_list = list(chain.from_iterable(zip(*create_final_pairings)))
In [15]: flat_list
Out[15]: [(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]
A simple example of transposing with zip may help:
In [17]: l = [[1,2,3],[4,5,6]]
In [18]: list(zip(*l))
Out[18]: [(1, 4), (2, 5), (3, 6)]
In [19]: list(zip(*l))[0]
Out[19]: (1, 4)
In [20]: list(zip(*l))[1]
Out[20]: (2, 5)
In [21]: list(zip(*l))[2]
Out[21]: (3, 6)
For python2 you can use itertools.izip:
from itertools import chain, izip
zipped = (izip(*ele) for ele in izip(*next(izip(*structure))))
print (list(chain.from_iterable(izip(*zipped))))
[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]
Upvotes: 2
Reputation: 54242
I had to write the non-list-comprehension version first to get my head around this:
new_training_vector = []
for m1, m2 in zip(structure[0], structure[1]):
    for t1, t2 in zip(m1, m2):
        for d1, d2 in zip(t1, t2):
            new_training_vector.append([d1, d2])
The way it works is by creating two parallel iterators (using zip), one for each model, then creating two parallel iterators for each of the training sets, and so on until we get to the actual data and can just stick it together.
Once we have that, it's not hard to fold it into a list comprehension:
new_training_vector = [[d1, d2]
for m1, m2 in zip(structure[0], structure[1])
for t1, t2 in zip(m1, m2)
for d1, d2 in zip(t1, t2)]
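The comprehension above is written for exactly two models. If there are more, the same idea should carry over by unpacking with zip(*...) at each level. Here is a rough sketch of that; the third model's numbers are made up, and plain nested lists stand in for the NumPy arrays (zip iterates both the same way) so that it runs on its own:

```python
# Plain nested lists stand in for the NumPy arrays; zip treats both alike.
structure = [
    [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]],
    [[[13, 14, 15], [16, 17, 18]], [[19, 20, 21], [22, 23, 24]]],
    [[[25, 26, 27], [28, 29, 30]], [[31, 32, 33], [34, 35, 36]]],  # made-up third model
]

new_training_vector = [
    list(points)
    for sets in zip(*structure)   # the same data set from every model
    for rows in zip(*sets)        # the same row from every model
    for points in zip(*rows)      # the same position from every model
]

print(new_training_vector[:3])
# [[1, 13, 25], [2, 14, 26], [3, 15, 27]]
```

Each inner list then has one entry per model, however many models there are.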
You can also do this with a dictionary, if that works better for some reason. You would lose the order though:
import collections
d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                d[(i, j, k)].append(point)
The trick to this one is that we just keep track of where we saw each point (except for at the model level), so they automatically go into the same dict item.
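As a quick illustration of how the dictionary might be read back, the sketch below rebuilds it and then looks groups up by their (set, row, column) key. Plain nested lists are used in place of the arrays so it runs without NumPy, and the name wanted is only for illustration. Choosing the key order explicitly is one way to get a specific ordering, such as the one in the question.

```python
import collections

# Plain nested lists stand in for the NumPy arrays.
structure = [
    [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]],
    [[[13, 14, 15], [16, 17, 18]], [[19, 20, 21], [22, 23, 24]]],
]

d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                d[(i, j, k)].append(point)

print(d[(0, 0, 0)])  # [1, 13]
print(d[(0, 1, 2)])  # [6, 18]

# Walk set 0 column by column (k outer, j inner) to get the 1, 4, 2, 5, ... order.
wanted = [d[(0, j, k)] for k in range(3) for j in range(2)]
print(wanted)
# [[1, 13], [4, 16], [2, 14], [5, 17], [3, 15], [6, 18]]
```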
Upvotes: 2