junfanbl

Reputation: 461

How to Input my Training Data into this Neural Network

I'm trying to solve a classification problem with a specific piece of code, and I'm having trouble understanding exactly how my data should be fed into the neural network.

I started encoding the data with 1-of-C dummy encoding so that I preserve the categorical context. I haven't finished encoding all of it, because I don't fully understand how the code at hand expects its input.

Here is a sample of my encoded data thus far:

'In Raw format, for predicting Political Party Affiliation
   'Age Sex     Income    Area      Party
[0] 30  male    38000.00  urban     democrat
[1] 36  female  42000.00  suburban  republican
[2] 52  male    40000.00  rural     independent
[3] 42  female  44000.00  suburban  other

'Encoded Format
[0] -1.23  -1.0  -1.34  ( 0.0   1.0)  (0.0  0.0  0.0  1.0)
[1] -0.49   1.0   0.45  ( 1.0   0.0)  (0.0  0.0  1.0  0.0)
[2]  1.48  -1.0  -0.45  (-1.0  -1.0)  (0.0  1.0  0.0  0.0)
[3]  0.25   1.0   1.34  ( 1.0   0.0)  (1.0  0.0  0.0  0.0)

I used Gaussian normalization for the numeric data, and 1-of-C dummy encoding and 1-of-(C-1) encoding for the string data. The last column is the category (Party).
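For reference, the encoding described above can be reproduced with a short NumPy sketch. It assumes the population standard deviation for the Gaussian normalization (this matches the sample values shown), and the lookup tables `SEX`, `AREA`, and `PARTY` are hypothetical names read off the encoded sample, not part of the original code.

```python
import numpy as np

# Raw rows from the sample above: age, sex, income, area, party
raw = [
    (30, "male", 38000.00, "urban", "democrat"),
    (36, "female", 42000.00, "suburban", "republican"),
    (52, "male", 40000.00, "rural", "independent"),
    (42, "female", 44000.00, "suburban", "other"),
]

# Hypothetical lookup tables, read off the encoded sample
SEX = {"male": [-1.0], "female": [1.0]}
AREA = {"urban": [0.0, 1.0], "suburban": [1.0, 0.0], "rural": [-1.0, -1.0]}
PARTY = {
    "democrat": [0.0, 0.0, 0.0, 1.0],
    "republican": [0.0, 0.0, 1.0, 0.0],
    "independent": [0.0, 1.0, 0.0, 0.0],
    "other": [1.0, 0.0, 0.0, 0.0],
}

def gaussian_normalize(values):
    """Return (value - mean) / standard deviation for each value."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()  # population std (ddof=0)

age = gaussian_normalize([r[0] for r in raw])
income = gaussian_normalize([r[2] for r in raw])

# One flat row of 5 feature values per sample
features = np.array(
    [[age[i], *SEX[r[1]], income[i], *AREA[r[3]]] for i, r in enumerate(raw)]
)
# One row of 4 target values per sample
targets = np.array([PARTY[r[4]] for r in raw])

print(np.round(features, 2))
print(features.shape, targets.shape)
```

Each sample becomes one flat row of plain numbers; the parentheses in the encoded listing are only visual grouping and do not appear in the array.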

In the code below, the input variable X accepts data in the following format:

X=np.array([[1,0,1,0],[1,0,1,1],[0,1,0,1]])

Do I input my data like so until I've looped through all of it?

X=np.array([[-1.23,-1,-1.34,0010],[00000010,-.49,1,.45],[1000,00001000,1.48,-1]])

I've read the SO question "How is input dataset fed into neural network?", which helped clarify the process: features are fed in row by row, with the target feature/label (political party in this case) as the last feature per row. This makes sense to me. In the code posted, I'm assuming that the variable y is the target.

With that in mind, should my input be this:

X=np.array([[-1.23,-1,-1.34,0010],[00000010,0,0,0],[0,0,0,0]])

That is, should I capture only the first row, with the target feature as the last input?

I'm not sure which one it should be. Thank you for any help in advance.

import numpy as np

# Input array: one row per sample, one column per feature
X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]])

# Output: one target value per sample
y = np.array([[1], [1], [0]])

# Sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, expressed in terms of the sigmoid's output
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 5000  # number of training iterations
lr = 0.1  # learning rate
inputlayer_neurons = X.shape[1]  # number of features in the data set
hiddenlayer_neurons = 3  # number of hidden-layer neurons
output_neurons = 1  # number of neurons in the output layer

# Weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):

    # Forward propagation
    hidden_layer_input1 = np.dot(X, wh)
    hidden_layer_input = hidden_layer_input1 + bh
    hiddenlayer_activations = sigmoid(hidden_layer_input)
    output_layer_input1 = np.dot(hiddenlayer_activations, wout)
    output_layer_input = output_layer_input1 + bout
    output = sigmoid(output_layer_input)

    # Backpropagation
    E = y - output
    slope_output_layer = derivatives_sigmoid(output)
    slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)
    d_output = E * slope_output_layer
    Error_at_hidden_layer = d_output.dot(wout.T)
    d_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer
    wout += hiddenlayer_activations.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print(output)

Upvotes: 1

Views: 306

Answers (1)

ege

Reputation: 774

Great question.

First off, use a pre-built neural network implementation from scikit-learn.

Next, split your data into features and labels (flatten your input vector first):

X_features=X[:,:-1]
X_labels=X[:,-1]
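As a rough sketch of what "flatten, then split" could look like for the sample data in the question: each row holds the five encoded feature values followed by the label. Storing the party as a single class index (rather than as a 1-of-C group) is an assumption made here so that the label fits in the last column; the index mapping is illustrative only.

```python
import numpy as np

# Each row is one sample: 5 encoded feature values followed by the label.
# Assumption: the party is stored as one class index
# (0=democrat, 1=republican, 2=independent, 3=other).
X = np.array([
    [-1.23, -1.0, -1.34,  0.0,  1.0, 0],
    [-0.49,  1.0,  0.45,  1.0,  0.0, 1],
    [ 1.48, -1.0, -0.45, -1.0, -1.0, 2],
    [ 0.25,  1.0,  1.34,  1.0,  0.0, 3],
])

X_features = X[:, :-1]           # every column except the last
X_labels = X[:, -1].astype(int)  # last column only

print(X_features.shape)  # (4, 5)
print(X_labels)          # [0 1 2 3]
```

Note that the grouped values such as `( 0.0  1.0)` simply become two adjacent columns in the row.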

Then set up the scikit-learn MLP:

model=MLPClassifier(args...)

Fit your data

model.fit(X_features,X_labels)

Voila...

Now you can predict a new input with

Y=model.predict(input_vector)
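Putting the steps together, a minimal end-to-end sketch might look like the following. The hyperparameters (`hidden_layer_sizes=(3,)`, `max_iter=5000`, `random_state=0`) are assumptions chosen to loosely mirror the hand-written network in the question, not recommended settings, and four samples are far too few to learn anything meaningful. `MLPClassifier` accepts class labels directly, so the party is passed as plain strings here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Encoded features from the question: age, sex, income, area (2 columns)
X_features = np.array([
    [-1.23, -1.0, -1.34,  0.0,  1.0],
    [-0.49,  1.0,  0.45,  1.0,  0.0],
    [ 1.48, -1.0, -0.45, -1.0, -1.0],
    [ 0.25,  1.0,  1.34,  1.0,  0.0],
])
# Class labels, one per row
X_labels = np.array(["democrat", "republican", "independent", "other"])

# Illustrative hyperparameters (assumptions, not tuned values)
model = MLPClassifier(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
model.fit(X_features, X_labels)

# predict expects a 2D array of shape (n_samples, n_features)
input_vector = X_features[:1]
Y = model.predict(input_vector)
print(Y)
```

The result is an array with one predicted label per input row.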

NB: in the name of true data science, remember to split your data into a training set and a verification set (e.g. 90/10).
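One way to do a 90/10 split is scikit-learn's `train_test_split`. The sketch below uses randomly generated placeholder data (20 rows, 5 features, 4 classes), since the four sample rows in the question are too few to split; the variable names `X_val` and `y_val` are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for a fully encoded data set
rng = np.random.default_rng(0)
X_features = rng.normal(size=(20, 5))   # 20 samples, 5 features
X_labels = rng.integers(0, 4, size=20)  # class indices 0..3

# Hold out 10% of the rows for verification
X_train, X_val, y_train, y_val = train_test_split(
    X_features, X_labels, test_size=0.1, random_state=0
)

print(X_train.shape, X_val.shape)  # (18, 5) (2, 5)
```

The model would then be fitted on `X_train`/`y_train` only and checked against `X_val`/`y_val`.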

Upvotes: 1
