eric lardon

Reputation: 359

concatenate arrays on axis=3 while first dimension is different

I have about 20 arrays of data and would like to concatenate them on axis=3.

data_1=dim(1000,150,10)
data_2=dim(1000,150,10)
data_3=dim(1000,150,10)
data_4=dim(1000,150,10)

and

features_1=dim(1000,150,10)
features_2=dim(1000,150,10)
features_3=dim(1000,150,10)
features_4=dim(1000,150,10)

I combine them into two variables, data and features,

hence

data.shape= (4,1000,150,10)

and

features.shape=(4,1000,150,10)

What do I want to do?

Concatenate data and features on axis=3 into a variable called data_concat

so that data_concat.shape=(4,1000,150,20)

To do so, I did the following:

data_concat = np.concatenate((data,features),axis=3)

However, this doesn't work when the first dimension is not the same. For instance:

data_1=dim(1000,150,10)
data_2=dim(1200,150,10)
data_3=dim(800,150,10)
data_4=dim(400,150,10)

and

features_1=dim(1000,150,10)
features_2=dim(1200,150,10)
features_3=dim(800,150,10)
features_4=dim(400,150,10)

hence

data.shape= (4,)

and

features.shape=(4,)

Doing:

data_concat = np.concatenate((data,features),axis=3)

doesn't work, because concatenate can't find axis=3, since

data.shape= (4,)

and

features.shape=(4,)

Thank you

Upvotes: 1

Views: 1346

Answers (2)

Gianluca Micchi

Reputation: 1653

Because of the mathematical assumptions underlying them, numpy arrays must have a well-defined, rectangular shape. When the inputs don't fit one, numpy falls back to a one-dimensional array of dtype object whose elements are your individual arrays, as in your second example. As you noticed, you cannot use np.concatenate on axis=3 there, because the outer array is treated as one-dimensional.
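To make this concrete, here is a minimal sketch using scaled-down stand-ins for your arrays (the sizes 10, 12, 8, 4 and the name `arrays` are my own assumptions, chosen only to keep the example small). It builds the object array explicitly, since recent numpy versions refuse to create it implicitly from ragged input, and shows that axis=3 does not exist on it.

```python
import numpy as np

# Scaled-down stand-ins for data_1 .. data_4: only the first dimension differs.
lengths = [10, 12, 8, 4]
arrays = [np.zeros((n, 15, 10)) for n in lengths]

# Fill a 1-D object array element by element; each slot holds one ndarray.
data = np.empty(len(arrays), dtype=object)
for i, a in enumerate(arrays):
    data[i] = a

print(data.shape)  # (4,)
print(data.dtype)  # object

# The outer array has a single axis, so asking for axis=3 raises an error.
try:
    np.concatenate((data, data), axis=3)
    failed = False
except ValueError as exc:  # numpy's AxisError derives from ValueError
    failed = True
    print("concatenate failed:", exc)
```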

Maybe you could get closer to what you intend by concatenating each data variable separately with its corresponding features variable, like this:

df_1 = np.concatenate((data_1, features_1), axis=2)
df_2 = np.concatenate((data_2, features_2), axis=2)
df_3 = np.concatenate((data_3, features_3), axis=2)
df_4 = np.concatenate((data_4, features_4), axis=2)

data = [df_1, df_2, df_3, df_4]

From your data, however, I notice that the second and third dimensions are always the same. This looks to me like you are trying to put together several batches of different lengths that contain the same kind of data. If that's the case, why not concatenate data_1, data_2, etc. on the 0-th axis? That would cause no problem for numpy.
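A rough sketch of that idea, again with scaled-down shapes that I made up for illustration (`data_parts`, `features_parts` and `offsets` are hypothetical names, not from your code): first join each batch with its features on the last axis, then stack all batches along axis 0. Keeping the batch lengths lets you split the result back apart later if you need to.

```python
import numpy as np

# Scaled-down stand-ins for data_1..data_4 and features_1..features_4.
lengths = [10, 12, 8, 4]
data_parts = [np.zeros((n, 15, 10)) for n in lengths]
features_parts = [np.ones((n, 15, 10)) for n in lengths]

# Join each data batch with its features along the last axis (10 + 10 = 20).
pairs = [np.concatenate((d, f), axis=2)
         for d, f in zip(data_parts, features_parts)]

# Stack all batches along axis 0; the result is a regular 3-D array.
data_concat = np.concatenate(pairs, axis=0)
print(data_concat.shape)  # (34, 15, 20)

# Batch boundaries, in case the individual batches are needed again.
offsets = np.cumsum(lengths)[:-1]
batches = np.split(data_concat, offsets, axis=0)
print([b.shape[0] for b in batches])  # [10, 12, 8, 4]
```

The price is that the result has three dimensions rather than four, so whatever consumes it has to work per sample rather than per batch.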

Upvotes: 1

Paul Panzer

Reputation: 53029

This can be done either with a list comprehension or, if the result should be an array, with np.frompyfunc:

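# create example (NumPy >= 1.24 requires dtype=object for ragged input)
>>> import numpy as np
>>> data = np.array([np.arange(n*12).reshape(n, 2, 6) for n in range(2, 5)], dtype=object)
>>> features = np.array([np.ones((n, 2, 6), int) for n in range(2, 5)], dtype=object)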
>>> data.shape, features.shape
((3,), (3,))
>>> 
# list comprehension
>>> [np.concatenate(xy, 2) for xy in zip(data, features)]
[array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]]]), array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]],

       [[24, 25, 26, 27, 28, 29,  1,  1,  1,  1,  1,  1],
        [30, 31, 32, 33, 34, 35,  1,  1,  1,  1,  1,  1]]]), array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]],

       [[24, 25, 26, 27, 28, 29,  1,  1,  1,  1,  1,  1],
        [30, 31, 32, 33, 34, 35,  1,  1,  1,  1,  1,  1]],

       [[36, 37, 38, 39, 40, 41,  1,  1,  1,  1,  1,  1],
        [42, 43, 44, 45, 46, 47,  1,  1,  1,  1,  1,  1]]])]

# frompyfunc
>>> np.frompyfunc(lambda *xy: np.concatenate(xy, 2), 2, 1)(data, features)
array([array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]]]),
       array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]],

       [[24, 25, 26, 27, 28, 29,  1,  1,  1,  1,  1,  1],
        [30, 31, 32, 33, 34, 35,  1,  1,  1,  1,  1,  1]]]),
       array([[[ 0,  1,  2,  3,  4,  5,  1,  1,  1,  1,  1,  1],
        [ 6,  7,  8,  9, 10, 11,  1,  1,  1,  1,  1,  1]],

       [[12, 13, 14, 15, 16, 17,  1,  1,  1,  1,  1,  1],
        [18, 19, 20, 21, 22, 23,  1,  1,  1,  1,  1,  1]],

       [[24, 25, 26, 27, 28, 29,  1,  1,  1,  1,  1,  1],
        [30, 31, 32, 33, 34, 35,  1,  1,  1,  1,  1,  1]],

       [[36, 37, 38, 39, 40, 41,  1,  1,  1,  1,  1,  1],
        [42, 43, 44, 45, 46, 47,  1,  1,  1,  1,  1,  1]]])], dtype=object)
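
For larger inputs it is easier to compare shapes than to read the printed arrays. The following is a small sketch along those lines; the names `as_list` and `as_array` are mine, and the object arrays are filled element by element so that it does not depend on how a given numpy version handles ragged input.

```python
import numpy as np

# Same example data as above, stored in explicitly created object arrays.
sizes = range(2, 5)
data = np.empty(len(sizes), dtype=object)
features = np.empty(len(sizes), dtype=object)
for i, n in enumerate(sizes):
    data[i] = np.arange(n * 12).reshape(n, 2, 6)
    features[i] = np.ones((n, 2, 6), int)

# Variant 1: plain Python list of concatenated arrays.
as_list = [np.concatenate(xy, 2) for xy in zip(data, features)]

# Variant 2: object array produced by applying the function elementwise.
as_array = np.frompyfunc(lambda *xy: np.concatenate(xy, 2), 2, 1)(data, features)

print([a.shape for a in as_list])      # [(2, 2, 12), (3, 2, 12), (4, 2, 12)]
print(as_array.shape, as_array.dtype)  # (3,) object
```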

Upvotes: 1
