Curunir

Reputation: 1288

Python append 2D numpy array to 3D

I am having trouble understanding how numpy arrays behave. I have a dataset which looks like this when read:

[
   [ F0, F1, F2, F3 ... F22],
   [ G0, G1, G2, G3 ... G22],
   [ H0, H1, H2, H3 ... H22],
   [ I0, I1, I2, I3 ... I22],
   [ J0, J1, J2, J3 ... J22]
]

I want to transform it into overlapping "packs of three" rows:

[
   [
      [ F0, F1, F2, F3 ... F22],
      [ G0, G1, G2, G3 ... G22],
      [ H0, H1, H2, H3 ... H22]
   ],
   [
      [ G0, G1, G2, G3 ... G22],
      [ H0, H1, H2, H3 ... H22],
      [ I0, I1, I2, I3 ... I22]
   ]
   ...
]

So far I have written this code:

import numpy as np
from numpy import loadtxt

data = loadtxt('./training_data/set_0.csv', delimiter=';')
batch_size=3
features=17
labels=6

trainX = np.empty((0,batch_size, features), float)
for i in range(0, len(data)-batch_size):
    row_X = data[i:i+batch_size,0:features]
    trainX = np.append(trainX, row_X)

print(trainX[0])

Logging the shape of row_X gives me (3, 17), just as I wanted. But trainX seems to contain the flattened combination of those arrays, whereas I would have expected the shape of trainX[0] to be (batch_size, features).

Upvotes: 0

Views: 195

Answers (2)

MaxPowers

Reputation: 5486

Say you have a two-dimensional numpy array like the following:

import numpy as np

arr = np.array([[7, 9, 4, 1, 0],
                [9, 5, 1, 8, 5],
                [6, 1, 9, 7, 1],
                [2, 8, 4, 8, 7],
                [1, 2, 6, 1, 8],
                [0, 2, 7, 0, 2]])

In order to transform this to a three-dimensional array you would use the reshape function:

arr.reshape(3, 2, -1)

which gives you three "batches" of two rows each.

array([[[7, 9, 4, 1, 0],
        [9, 5, 1, 8, 5]],

       [[6, 1, 9, 7, 1],
        [2, 8, 4, 8, 7]],

       [[1, 2, 6, 1, 8],
        [0, 2, 7, 0, 2]]])

The last argument, -1, tells reshape to calculate the size of the third dimension from the total number of elements in the array and the sizes of the other dimensions.
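
As a quick check of how the -1 is resolved, here is a minimal sketch using the same example array. The name `batches` is only for illustration. Note that reshape splits the rows into consecutive, non-overlapping groups, so each row ends up in exactly one batch.

```python
import numpy as np

arr = np.array([[7, 9, 4, 1, 0],
                [9, 5, 1, 8, 5],
                [6, 1, 9, 7, 1],
                [2, 8, 4, 8, 7],
                [1, 2, 6, 1, 8],
                [0, 2, 7, 0, 2]])

# 30 elements in total, so the -1 is resolved to 30 / (3 * 2) = 5
batches = arr.reshape(3, 2, -1)
print(batches.shape)  # (3, 2, 5)

# The second batch starts at the third row of the original array
print(batches[1][0])  # [6 1 9 7 1]
```

If the windows have to overlap, as in the question, reshape alone will not produce them, because it never repeats a row.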

Upvotes: 1

Chris

Reputation: 16147

import numpy as np

batch_size = 3

l = np.array([
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9],
    [6, 7, 8, 9, 0],
])

# Slide a window of batch_size rows over the array, one row at a time
output = [l[n:n+batch_size] for n in range(len(l)-batch_size+1)]

Output

[array([[1, 2, 3, 4, 5],
        [2, 3, 4, 5, 6],
        [3, 4, 5, 6, 7]]),
 array([[2, 3, 4, 5, 6],
        [3, 4, 5, 6, 7],
        [4, 5, 6, 7, 8]]),
 array([[3, 4, 5, 6, 7],
        [4, 5, 6, 7, 8],
        [5, 6, 7, 8, 9]]),
 array([[4, 5, 6, 7, 8],
        [5, 6, 7, 8, 9],
        [6, 7, 8, 9, 0]])]
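
The result above is a plain Python list of 2D arrays. If a single 3D array is wanted, as in the question, one possible sketch is to stack the list with np.stack. The sketch also shows why the original np.append call loses the shape: without an axis argument, np.append flattens both inputs. The names `windows`, `stacked`, `flat`, `kept` and `view` are illustrative, and the last part assumes NumPy 1.20 or newer, where sliding_window_view is available.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

batch_size = 3
l = np.array([
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9],
    [6, 7, 8, 9, 0],
])

# Same overlapping windows as above, then stacked along a new first axis
windows = [l[n:n + batch_size] for n in range(len(l) - batch_size + 1)]
stacked = np.stack(windows)
print(stacked.shape)  # (4, 3, 5)

# np.append without axis flattens both arguments before joining them
flat = np.append(np.empty((0, batch_size, 5)), l[0:batch_size])
print(flat.shape)  # (15,)

# With axis=0 and a leading length-1 dimension, the 3D shape is kept
kept = np.append(np.empty((0, batch_size, 5)), l[0:batch_size][np.newaxis], axis=0)
print(kept.shape)  # (1, 3, 5)

# Alternative without a Python loop: the window axis is appended last,
# so it is moved back into the middle with transpose
view = sliding_window_view(l, batch_size, axis=0).transpose(0, 2, 1)
print(view.shape)  # (4, 3, 5)
```

Calling np.append inside a loop copies the whole array on every iteration, so building a list first and stacking once is usually the cheaper option for larger datasets.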

Upvotes: 1
