haoran xi

Reputation: 1

The DNN requires the input data to be 2D, but my training data is RGB (3D)

I'm new to deep learning. I'm trying to train an online-setting deep neural network on the UCF101 dataset. Before training, I read the usage example given by the author. The DNN requires the training data to be 2D, but the frames extracted from the video data are 3D RGB images. To meet that requirement, I reshape each image into a tensor of dimension (1, W×H×C), but the training accuracy is too low. What should I do?

The first code cell is the usage example, the second is mine, and the third is the training output. The training data in the usage example has shape (15000, 10), meaning each sample has dimension (1, 10). Each of my training samples has dimension (160, 160, 3).

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score
# ONN is the online neural network class from the repository the question uses

onn_network = ONN(features_size=10, max_num_hidden_layers=5,
                  qtd_neuron_per_hidden_layer=40, n_classes=10)
X, Y = make_classification(n_samples=50000, n_features=10, n_informative=4, n_redundant=0, n_classes=10,
                           n_clusters_per_class=1, class_sep=3)
X_train, X_test, y_train, y_test = train_test_split(X,Y, test_size=0.3, 
                                                    random_state=42, shuffle=True)

for i in range(len(X_train)):
  onn_network.partial_fit(np.asarray([X_train[i, :]]), np.asarray([y_train[i]]))
  
  if i % 1 == 0:
    predictions = onn_network.predict(X_test)
    print(" Online Accuracy: {}".format(balanced_accuracy_score(y_test, predictions)))
X_train=np.load('ucf101_X_train_3channels.npy')
X_test=np.load('ucf101_X_test_3channels.npy')
y_train=np.load('ucf101_y_train_3channels.npy')
y_test=np.load('ucf101_y_test_3channels.npy')
X_test=X_test.reshape(7385,160*160*3)
onn_network = ONN(features_size=160*160*3, max_num_hidden_layers=20, qtd_neuron_per_hidden_layer=100, 
                  n_classes=101)
for i in range(len(X_train)):
    
    onn_network.partial_fit(np.asarray([X_train[i]]).reshape(1,160*160*3), 
                            np.asarray(y_train[i]).reshape(1,))
  
    if i % 100 == 0:
        predictions = onn_network.predict(X_test)
        print("Online Accuracy: {}".format(balanced_accuracy_score(y_test, predictions)))
Online Accuracy: 0.0244914025343677
Online Accuracy: 0.02107729631350257
Online Accuracy: 0.02098160986186016
Online Accuracy: 0.025212566301405722
Online Accuracy: 0.02635510928009669
Online Accuracy: 0.025764109717447806
Online Accuracy: 0.026325141135435725
Online Accuracy: 0.018179492570127884
Online Accuracy: 0.025331778179893114
Online Accuracy: 0.02743639553656709
WARNING: Set 'show_loss' to 'False' when not debugging. It will deteriorate the fitting performance.
Alpha:[0.18638033 0.07175495 0.04563371 ... 0.04181847 0.04256647 0.03721857]
Training Loss: 4.5786605
Online Accuracy: 0.028941511549095487
Online Accuracy: 0.031018418116356354
Online Accuracy: 0.01979826579435767
Online Accuracy: 0.03229507089186072
Online Accuracy: 0.024989337301466186
Online Accuracy: 0.03374924564377245
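For reference, the per-frame flattening described above can be sketched in isolation (shapes taken from the post; a minimal sketch, not the full training loop):

```python
import numpy as np

# one RGB frame as described in the question: 160 x 160 x 3
frame = np.zeros((160, 160, 3), dtype=np.float32)

# flatten it into the (1, W*H*C) row vector that partial_fit expects
flat = frame.reshape(1, 160 * 160 * 3)
print(flat.shape)  # (1, 76800)
```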

Upvotes: 0

Views: 271

Answers (2)

Theodor Peifer

Reputation: 3506

So it seems like the input images are RGB but you want to use 2D images (i.e. grayscale; color is not that important for this task). You can't just reshape a 3D image to 2D; you have to remove two of the three channels, for example drop G and B from RGB and keep only the red channel. You want to go from (3, W, H) to (1, W, H). Here is an example of how to remove channels:


# let's say image is a 3D numpy array of shape (W, H, 3)
image = image[:, :, 0]
# image is now 2D (red channel only)
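If you want to keep information from all three channels while still ending up with a single 2D plane, a luminance-weighted average is a common alternative to dropping channels (a sketch, not part of the answer above):

```python
import numpy as np

img = np.random.rand(160, 160, 3)  # H x W x 3 RGB image, values in [0, 1]

# ITU-R BT.601 luminance weights; collapses the channel axis
gray = img @ np.array([0.299, 0.587, 0.114])
print(gray.shape)  # (160, 160)
```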

Upvotes: 0

Mughees

Reputation: 953

You can use torch.flatten() instead of reshaping. Secondly, please share the link to the repo you are using; your question is not clear.
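For what it's worth, torch.flatten produces the same (1, W*H*C) row layout as the reshape in the question once you add back a batch dimension; a minimal sketch:

```python
import torch

x = torch.rand(160, 160, 3)   # one RGB frame as a tensor

flat = torch.flatten(x)       # 1-D tensor of length 160*160*3
row = flat.unsqueeze(0)       # shape (1, 76800), equivalent to x.reshape(1, -1)
print(row.shape)  # torch.Size([1, 76800])
```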

Upvotes: -1
