Reputation: 5923
I'm trying to understand why the accuracy of my algorithm has suddenly changed quite dramatically. One small change I made was adding a fourth : when I discovered that I was using only 3 indexes to standardize my 4-dimensional train/test set. Now I'm curious: would the old and new code below do the same thing? If not, how does indexing into a 4-dimensional array with only 3 indexes work?
Old:
# standardize all non-binary variables
channels = 14  # int(X.shape[1])
mu_f = np.zeros(shape=channels)
sigma_f = np.zeros(shape=channels)
for i in range(channels):
    mu_f[i] = np.mean(X_train[:, i, :])
    sigma_f[i] = np.std(X_train[:, i, :])
for i in range(channels):
    X_train[:, i, :] -= mu_f[i]
    X_test[:, i, :] -= mu_f[i]
    if sigma_f[i] != 0:
        X_train[:, i, :] /= sigma_f[i]
        X_test[:, i, :] /= sigma_f[i]
New:
# standardize all non-binary variables
channels = 14
mu_f = np.zeros(shape=channels)
sigma_f = np.zeros(shape=channels)
for i in range(channels):
    mu_f[i] = np.mean(X_train[:, i, :, :])
    sigma_f[i] = np.std(X_train[:, i, :, :])
for i in range(channels):
    X_train[:, i, :, :] -= mu_f[i]
    X_test[:, i, :, :] -= mu_f[i]
    if sigma_f[i] != 0:
        X_train[:, i, :, :] /= sigma_f[i]
        X_test[:, i, :, :] /= sigma_f[i]
Upvotes: 1
Views: 50
Reputation: 231385
I don't see why the extra : makes a difference. It doesn't when I do time tests on a simple np.mean(X[:,1]) vs np.mean(X[:,1,:,:]), etc.
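To make that concrete, here is a quick sketch (not from the original post; the array and its shape are made up for illustration) showing that omitting trailing indexes is the same as writing : for the remaining axes:

```python
import numpy as np

# A small 4-D array; any shape works, this one is arbitrary.
X = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)

# Omitted trailing axes are implicitly full slices, so X[:, 1]
# selects exactly the same sub-array as X[:, 1, :, :].
print(np.array_equal(X[:, 1], X[:, 1, :, :]))      # True
print(np.mean(X[:, 1]) == np.mean(X[:, 1, :, :]))  # True
```

So the old and new code compute identical values; any accuracy change must come from somewhere else.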
As for plonser's suggestion that you can vectorize the whole thing, the key is realizing that mean and std take some added parameters. Check their docs and play around with sample arrays.
Xmean = np.mean(X, axis=(0,2,3), keepdims=True)
Xstd = np.std(X, axis=(0,2,3), keepdims=True)
X -= Xmean
X /= Xstd
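As a sanity check, here is a sketch (array shape and variable names are illustrative, not from the post) verifying that the vectorized axis/keepdims version matches the per-channel loop from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(8, 14, 6, 6))

# Loop version, per channel, as in the question's code.
X_loop = X.copy()
for i in range(X.shape[1]):
    mu = np.mean(X_loop[:, i, :, :])
    sigma = np.std(X_loop[:, i, :, :])
    X_loop[:, i, :, :] -= mu
    if sigma != 0:
        X_loop[:, i, :, :] /= sigma

# Vectorized version: reduce over every axis except the channel
# axis; keepdims=True keeps the result shaped (1, 14, 1, 1) so
# broadcasting lines it up against X.
Xmean = np.mean(X, axis=(0, 2, 3), keepdims=True)
Xstd = np.std(X, axis=(0, 2, 3), keepdims=True)
X_vec = (X - Xmean) / Xstd

print(np.allclose(X_loop, X_vec))  # True
```

Note the division should use the per-channel std, not the mean, to standardize.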
Upvotes: 2