Reputation: 311
I've been following a convolutional neural network tutorial using TensorFlow from YouTube videos and the TensorFlow CNN tutorial on the MNIST dataset. I used these tutorials to create my own CNN on audio data. The goal is to identify which of 33 speakers a voice belongs to using a CNN. The data has already been wrangled so that the shape of the test set is (8404, 1, 500, 1), ready for convolution to be applied. Each audio segment is 500 values long, and there are 8404 examples in the test set. My problem is at the training step, where I get the following error:
ValueError: Cannot feed value of shape (128, 1, 500, 1) for Tensor 'Placeholder:0', which has shape '(?, 500)'
I googled this ValueError, and others have solved it by reshaping batch_x to the expected dimensions, so I tried the following line of code:
batch_x = np.reshape(batch_x, [-1, 500])
I had no luck with this reshape. Has anyone dealt with this problem? Below is the code.
import numpy as np
import tensorflow as tf
npzfile = np.load('saved_data_file_33.npz')
train_segs = npzfile['train_segs'] # Seg train data
train_labels = npzfile['train_labels'] # train labels
train_labels_1h = npzfile['train_labels_1h'] # One hot encoding for training data
epochs = 1
batch_size = 128
learning_rate = 0.01
classes = len(train_labels_1h[0,:]) # 33 classes
seg_size = len(train_segs[0,0,:,0]) # 500 long
x = tf.placeholder(tf.float32, [None, seg_size])
y = tf.placeholder(tf.float32)
# This section initializes the weights and biases of each hidden layer and output layer with random values.
# These values are stored in a dict for easy access.
weights = {"conv1" : tf.Variable(tf.random_normal([5, 5, 1, 32])),
"conv2": tf.Variable(tf.random_normal([5, 5, 32, 64])),
"fc_layer": tf.Variable(tf.random_normal([1*125*64, 1024])),
"output": tf.Variable(tf.random_normal([1024, classes]))
}
biases = { "b_c1" : tf.Variable(tf.random_normal([32])),
"b_c2" : tf.Variable(tf.random_normal([64])),
"b_fc" : tf.Variable(tf.random_normal([1024])),
"output": tf.Variable(tf.random_normal([classes]))
}
reshapedX = tf.reshape(x, [-1, 1, 500, 1])
conv1 = tf.nn.conv2d(reshapedX, weights["conv1"], strides = [1, 1, 1, 1], padding = "SAME")
conv1 = tf.nn.relu(conv1 + biases["b_c1"])
conv1 = tf.nn.max_pool(conv1, ksize = [1, 1, 2, 1], strides = [1, 1, 2, 1], padding = "SAME")
conv2 = tf.nn.conv2d(conv1, weights["conv2"], strides = [1, 1, 1, 1], padding = "SAME")
conv2 = tf.nn.relu(conv2 + biases["b_c2"])
conv2 = tf.nn.max_pool(conv2, ksize = [1, 1, 2, 1], strides = [1, 1, 2, 1], padding = "SAME")
fc = tf.reshape(conv2, [-1, 1*125*64])
fc = tf.nn.relu(tf.matmul(fc, weights["fc_layer"]) + biases["b_fc"])
output_layer = tf.matmul(fc, weights["output"]) + biases["output"]
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=output_layer))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(output_layer, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(epochs):
        j = 0
        while j < len(train_segs):
            start = i
            end = i + batch_size
            batch_x = np.array(train_segs[start:end])
            batch_y = np.array(train_labels[start:end])
            #batch_x = np.reshape(batch_x, [-1, 500]) # reshape for x input
            train_accuracy = accuracy.eval(feed_dict={x: batch_x, y: batch_y})
            print('step %d, training accuracy %g' % (i, train_accuracy))
            train_step.run(feed_dict={x: batch_x, y: batch_y})
    print('test accuracy %g' % accuracy.eval(feed_dict={
        x: train_segs, y: train_labels}))
Upvotes: 0
Views: 43
Reputation: 824
It looks like you want to remove the dimensions of size 1 from train_segs. You can use train_segs = np.squeeze(train_segs) for this.
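Roughly like this, as a minimal sketch with a dummy array in place of the real data (the shape is the one given in the question):

import numpy as np

# Dummy stand-in for train_segs with the shape from the question: (8404, 1, 500, 1)
train_segs = np.zeros((8404, 1, 500, 1), dtype=np.float32)

# np.squeeze drops every axis of length 1, leaving shape (8404, 500),
# which matches the placeholder shape (?, 500)
train_segs = np.squeeze(train_segs)
print(train_segs.shape)  # (8404, 500)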
Also, I think you're using the wrong brackets for np.reshape, so np.reshape(batch_x, (-1, 500)) might have worked. In general you need to be careful with reshape functions, because the order of your elements might not end up the way you expect.
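As a quick sanity check (again with a dummy array, shaped like the batch in the error message), both approaches produce the same (?, 500) layout here, because only length-1 axes are removed:

import numpy as np

# Dummy stand-in for one batch, with the shape from the error message: (128, 1, 500, 1)
batch_x = np.zeros((128, 1, 500, 1), dtype=np.float32)

reshaped = np.reshape(batch_x, (-1, 500))  # shape (128, 500), matches the placeholder (?, 500)
squeezed = np.squeeze(batch_x)             # also shape (128, 500)

# Only axes of length 1 were removed, so the element order is unchanged
print(reshaped.shape, squeezed.shape)      # (128, 500) (128, 500)
print(np.array_equal(reshaped, squeezed))  # True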
Upvotes: 1