Reputation: 749
I am currently working on a churn management problem with binary classification.
I am getting an error when fitting the model. Something in the input/output seems to be throwing it off, and I haven't been able to catch it.
Here is the code (df contains my data frame which I am creating vectors from):
#Delete unimportant columns.
del df['RowNumber']
del df['CustomerId']
del df['Surname']
Then, I have to convert two categorical variables:
#Converting and creating dummy variables for categorical variables
df["Gender"] = df["Gender"].astype('category')
df["Geography"] = df["Geography"].astype('category')
df['Gender'] = pd.get_dummies(df['Gender'])
df['Geography'] = pd.get_dummies(df['Geography'])
y = df.iloc[:, -1] #Label variable
X = df.iloc[:, :10] #Features
Splitting dataset into test and training:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
Next, I am scaling the variables:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
print(X_train.shape) #(8000, 10)
print(X_test.shape) #(2000, 10)
print(y_train.shape)#(8000,)
print(y_test.shape)#(2000,)
Building Network:
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(8000,)))
model.add(layers.Dense(1, activation='sigmoid'))
# Compiling Neural Network
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
#Fitting our model
model.fit(X_train, y_train, batch_size = 10, epochs = 10)
# Predicting the Test set results
y_pred = model.predict(X_test)
y_pred = (y_pred > 0.5)
# Creating the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
The specific error message that I am getting is:
ValueError: Error when checking input: expected dense_47_input to have shape (None, 8) but got array with shape (8000, 10)
Any help to combat this issue would be great!
Edit: model summaries before and after compiling were attached as screenshots.
Upvotes: 2
Views: 1412
Reputation: 1580
I think you need to correct this:
model.add(layers.Dense(32, activation='relu', input_shape=(10,)))
10 is the number of features you use; input_shape describes a single sample, and Keras automatically handles the number of rows in the batch/dataset (the batch dimension).
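Putting that together with the rest of your code, the corrected model would look like this (a minimal sketch, reusing the scaled X_train and y_train from your question):

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(10,)))  # one sample has 10 features
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=10, epochs=10)  # X_train has shape (8000, 10)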
Edit Explanation:
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(32, input_shape=(10,)))
model.add(layers.Dense(1))
Here, the first layer is created so that it only accepts 2D tensors where the last dimension is 10 (the zeroth dimension, the batch dimension, is left unspecified, so any batch size is accepted).
This layer returns a tensor whose last dimension is 32, so it can only be connected to a downstream layer that expects 32-dimensional vectors as its input. When using Keras you don't have to worry about compatibility, because the layers you add to your models are dynamically built to match the shape of the incoming layer.
The second layer did not receive an input shape argument; instead, it automatically inferred its input shape as the output shape of the layer that came before it.
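You can see this inference directly (a small sketch; the input_shape/output_shape layer attributes are as in Keras 2):

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32, input_shape=(10,)))
model.add(layers.Dense(1))

# The second Dense layer never received an input_shape, yet Keras inferred it
print(model.layers[0].output_shape)  # (None, 32)
print(model.layers[1].input_shape)   # (None, 32)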
Decoding parameter values in the model:
Suppose this is my model:
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(32, input_shape=(2,)))
model.add(layers.Dense(1))
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
model.summary()
And this is the model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 32) 96
_________________________________________________________________
dense_2 (Dense) (None, 1) 33
=================================================================
Total params: 129
Trainable params: 129
Non-trainable params: 0
_________________________________________________________________
For a Dense layer, the output is computed as:
output = dot(W, input) + b
or
output = relu(dot(W, input) + b) #relu here is the activation function
In this expression, W and b are tensors which are attributes of the layer. They are called the "weights", or "trainable parameters" of the layer (the kernel and bias attributes, respectively). These weights contain the information learned by the network from exposure to training data.
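In plain NumPy, the same computation looks like this (an illustrative sketch, not Keras internals; the shapes match the first layer of the model above):

import numpy as np

def dense_forward(x, W, b):
    # x: (batch, in_dim), W: (in_dim, units), b: (units,)
    return np.maximum(0, np.dot(x, W) + b)  # relu(dot(W, input) + b)

x = np.random.rand(5, 2)    # batch of 5 samples with 2 features each
W = np.random.rand(2, 32)   # kernel: maps 2 inputs to 32 units
b = np.zeros(32)            # one bias per unit
print(dense_forward(x, W, b).shape)  # (5, 32)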
For Layer 1 (Parameters = 96): hidden_units * input_dimension + bias_units
96 = 32 (hidden units) * 2 (input dimension) + 32 (one bias per hidden unit)
For Layer 2 (Parameters = 33): hidden_units * input_dimension + bias_units
33 = 1 (hidden unit) * 32 (input dimension) + 1 (one bias per unit)
Total Params = 96+33 = 129
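If you want to verify these counts from the weights themselves, the standard Keras accessors work (a quick sketch):

for layer in model.layers:
    kernel, bias = layer.get_weights()  # W and b of the Dense layer
    print(layer.name, kernel.shape, bias.shape, layer.count_params())
    # dense_1 (2, 32) (32,) 96
    # dense_2 (32, 1) (1,) 33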
Hope this helps :)
Source of Explanation: Keras Documentation
Upvotes: 3