Jovan

Reputation: 815

Kernel Size Defining and Activation Function in Keras

I am still learning Keras (sorry if the question sounds stupid). As the title says, I sometimes come across similar-looking code like this when building Conv2D or Conv3D models in Keras:

x = Conv2D(16, 3, activation='relu', padding='same')(input_img)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = Conv3D(16, (3, 3, 3), activation='relu', padding='same')(input_img)

I am not certain, but I believe the first two lines construct the same kernel size. Or is that wrong? Also, I often see either a 'sigmoid' or a 'softmax' activation function used in the last layer of a model, like:

x = Dense(784, activation='sigmoid')(x)
x = Dense(784, activation='softmax')(x)

When should I use the sigmoid activation function, and when softmax?

Upvotes: 1

Views: 373

Answers (2)

The convolutional kernel (filter) depends on the type of convolution. For images, a 2D convolutional layer is the one to use. For video, for example, a 3D layer would be the right choice.

In the kernel_size argument you specify the size of this filter, that is, its extent along each of the convolution dimensions.

As for the activation functions, it depends on the type of data you handle. If you have multiclass data, the output layer should ideally use a softmax function. In the intermediate layers the sigmoid function can work, as can ReLU or others, but always choose based on your data. Some are more aggressive than others.

Upvotes: 1

czr

Reputation: 658

The kernel_size parameter accepts an integer or a tuple. If it is an integer, that value is applied to every dimension of the kernel.
So the first two examples both use a 3x3 2D kernel, and the last one uses a 3x3x3 3D kernel.
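
The integer-versus-tuple rule can be sketched in plain Python. The helper `normalize_kernel_size` and its `rank` parameter are hypothetical names made up for this illustration; this is not Keras's actual internal code, only a small model of the behaviour described above.

```python
def normalize_kernel_size(kernel_size, rank):
    """Hypothetical helper: expand a kernel size into one entry per dimension.

    rank is the number of convolution dimensions (2 for Conv2D, 3 for Conv3D).
    """
    if isinstance(kernel_size, int):
        # A single integer is repeated for every convolution dimension.
        return (kernel_size,) * rank
    kernel_size = tuple(kernel_size)
    if len(kernel_size) != rank:
        raise ValueError(
            f"expected {rank} values, got {len(kernel_size)}: {kernel_size}"
        )
    return kernel_size


k1 = normalize_kernel_size(3, rank=2)          # like Conv2D(16, 3, ...)
k2 = normalize_kernel_size((3, 3), rank=2)     # like Conv2D(16, (3, 3), ...)
k3 = normalize_kernel_size((3, 3, 3), rank=3)  # like Conv3D(16, (3, 3, 3), ...)

print(k1 == k2)  # the two Conv2D spellings describe the same kernel
print(k3)
```

In Keras itself, one way to check this should be to build both Conv2D layers on the same input and compare their `kernel.shape`.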

You would use softmax if you have multiple classes in your output but each prediction can belong to only one class. Sigmoid is used when you have one or more output classes and each prediction can belong to any number of those classes.
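
To make the difference concrete, here is a minimal sketch in plain Python (no Keras needed). The `logits` values are made up for illustration. Softmax turns the scores into one probability distribution that sums to 1, so the classes compete with each other. Sigmoid squashes each score independently, so several classes can score high at the same time.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    # Each score is mapped to (0, 1) on its own, independent of the others.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


logits = [2.0, 1.0, -1.0]  # made-up raw scores for three classes

soft = softmax(logits)
sig = sigmoid(logits)

print(soft, sum(soft))  # sums to 1: exactly one class is expected to be true
print(sig, sum(sig))    # need not sum to 1: each class is a separate yes/no
```

With softmax you would typically pick the single class with the highest probability. With sigmoid you would typically threshold each output separately (for example at 0.5).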

Upvotes: 1
