Reputation: 363
I am working with TensorFlow and trying to build a deep network model. I will use ReLU activation with an SGD/Adam optimizer (minimizing cross-entropy) and an L2 regularizer (penalizing large weights to avoid over-fitting).
My dataset has 115599 rows and 13 columns; columns 1-12 are the inputs (X) and the 13th column is the binary response. I have standardized the input X.
For the weights and biases, we are supposed to sample from a Gaussian distribution with zero mean and unit variance. Previously, when working with the MNIST dataset, we initialized the weights and biases to zero with W = tf.Variable(tf.zeros([784, 10]))
and b = tf.Variable(tf.zeros([10])),
since the response had 10 different levels (0-9).
My question is: how do I specify the weights for a binary response that has only two levels? Should I put b = tf.Variable(tf.zeros([2]))?
The code I have tried so far is below:
import tensorflow as tf
import numpy
import pandas as pd
df_X=pd.read_csv('/home/prm/use_validation.csv',usecols = [0,1,2,3,4,5,6,7,8,9,10,11],skiprows=[0],header=None)
df_scale = (df_X - df_X.min()) / (df_X.max() - df_X.min())
d = df_scale.values
Response = pd.read_csv('/home/prm/use_validation.csv',usecols = [12],skiprows=[0],header=None)
labels = Response.values
data_use = numpy.float32(d)
labels = numpy.array(Response,'str')
#tensorflow
x = tf.placeholder(tf.float32,shape=(115599, 12))
x = data_use
w = tf.random_normal([100,115599],mean=0.0, stddev=1.0, dtype=tf.float32)
b = tf.random_normal([100,2],mean=0.0, stddev=1.0, dtype=tf.float32) ##[NOT SURE, PLEASE ASSIST]##
y = tf.nn.softmax(tf.matmul(w,x)+b)
Thanks in advance!!
Upvotes: 0
Views: 617
Reputation: 363
I have figured it out:
import tensorflow as tf
import numpy
import pandas as pd

# load and min-max scale the 12 input columns
df_X = pd.read_csv('/home/prm/use_validation.csv', usecols=[0,1,2,3,4,5,6,7,8,9,10,11], skiprows=[0], header=None)
df_scale = (df_X - df_X.min()) / (df_X.max() - df_X.min())
data_use = numpy.float32(df_scale.values)

# load the binary response column
l = pd.read_csv('/home/prm/use_validation.csv', usecols=[12], skiprows=[0], header=None)
labels = l.values

# tensorflow graph
x = tf.placeholder(tf.float32, shape=(115599, 12))
# wrap w in a Variable so it is trainable; initialized from N(0, 1)
w = tf.Variable(tf.random_normal([12, 1], mean=0.0, stddev=1.0, dtype=tf.float32))
b = tf.Variable(tf.zeros([1]))
# a single output unit calls for a sigmoid; softmax over one column is always 1
y = tf.nn.sigmoid(tf.matmul(x, w) + b)
This is working fine for me!
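For reference, here is a minimal sketch of how the cross-entropy loss, L2 penalty, and Adam optimizer mentioned in the question could be attached to this graph; the y_true placeholder name, the 0.01 regularization strength, and the learning rate are illustrative assumptions, not part of the original code:

# placeholder for the binary labels (name is an assumption)
y_true = tf.placeholder(tf.float32, shape=(115599, 1))

logits = tf.matmul(x, w) + b  # raw scores, before the sigmoid

# binary cross-entropy computed from logits, plus an L2 penalty on the weights
cross_entropy = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=logits))
loss = cross_entropy + 0.01 * tf.nn.l2_loss(w)

train_step = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_step, feed_dict={x: data_use,
                                    y_true: labels.astype(numpy.float32)})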
Upvotes: 0
Reputation: 1114
The shapes in your model do not match up. Keep in mind that if you have tensors A and B with shapes

shape(A) = [a1, a2]
shape(B) = [b1, b2]

then to perform

C = tf.matmul(A, B)

you MUST have b1 = a2, and the resulting tensor C has shape

shape(C) = [a1, b2]

In your example, A corresponds to x, which has shape [115599, 12]; B corresponds to w, whose shape you want to determine; and C corresponds to y, which must have the same shape as the target, namely [115599, 1].

It follows that w must have shape [12, 1], while b must have rank 1 and match the second dimension of w, so b must have shape [1,].
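To check the rule concretely, here is a small sketch (the zero-filled tensors stand in for real data):

import tensorflow as tf

A = tf.zeros([115599, 12])  # plays the role of x
B = tf.zeros([12, 1])       # plays the role of w
C = tf.matmul(A, B)         # inner dimensions agree: 12 == 12
print(C.shape)              # (115599, 1), matching the target shape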
Upvotes: 1