Reputation: 125
I want to build a multi-class classification model using Keras. My data contains 7 features and 4 labels. Using Keras, I have seen two ways to apply the Support Vector Machine (SVM) algorithm.
First: a quasi-SVM in Keras. Using the RandomFourierFeatures layer presented here, I have built the following model:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import RandomFourierFeatures

def create_keras_model():
    initializer = tf.keras.initializers.GlorotNormal()
    return tf.keras.models.Sequential([
        layers.Input(shape=(7,)),
        RandomFourierFeatures(output_dim=4822, kernel_initializer=initializer),
        layers.Dense(units=4, activation='softmax'),
    ])
Second: using the last layer of the network as an SVM, as described here:
from tensorflow.keras.regularizers import l2

def create_keras_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(7,)),
        tf.keras.layers.Dense(64),
        tf.keras.layers.Dense(4, kernel_regularizer=l2(0.01)),
        tf.keras.layers.Softmax()
    ])
Note: CategoricalHinge() was used as the loss function.
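For completeness, the compile step for either model could look like the minimal sketch below. Only the CategoricalHinge loss is stated above; the optimizer and metric are my assumptions.
model = create_keras_model()
model.compile(
    optimizer='adam',  # assumption: the question does not specify an optimizer
    loss=tf.keras.losses.CategoricalHinge(),
    metrics=['accuracy'],
)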
My question is: are these approaches appropriate, and can they be described as applying an SVM model, or are they just approximations of the model architecture? In short, can I say this is applying an SVM model?
Upvotes: 4
Views: 1519
Reputation: 24059
You can compare the two models on your data as below. I checked them on the mnist dataset and got the results shown below:
from keras.utils.layer_utils import count_params
import matplotlib.pyplot as plt
import tensorflow as tf
import seaborn as sns
import pandas as pd
import time
def create_model(approach):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(784,)))
    if approach == 'Quasi_SVM':
        model.add(tf.keras.layers.experimental.RandomFourierFeatures(
            output_dim=4096, scale=10.0,
            kernel_initializer="gaussian"))
        model.add(tf.keras.layers.Dense(10))
    if approach == 'kernel_regularizer':
        model.add(tf.keras.layers.Dense(128, activation='relu'))
        model.add(tf.keras.layers.Dense(64, activation='relu'))
        model.add(tf.keras.layers.Dense(32, activation='relu'))
        model.add(tf.keras.layers.Dense(16, activation='relu'))
        model.add(tf.keras.layers.Dense(10,
            kernel_regularizer=tf.keras.regularizers.l2(0.01),
            activation='softmax'))
    model.compile(
        optimizer='adam',
        loss='hinge',
        metrics=['accuracy'],
    )
    return model
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255
x_test = x_test.reshape(-1, 784).astype("float32") / 255
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)
for approach in ['Quasi_SVM', 'kernel_regularizer']:
    model = create_model(approach)
    start = time.time()
    history = model.fit(x_train, y_train, epochs=30, batch_size=128, validation_split=0.2)
    print(f'Training time {approach} : {time.time() - start} sec')
    print(f'Trainable params {approach} : {count_params(model.trainable_weights)}')
    print(f'Accuracy on x_test {approach} : {model.evaluate(x_test, y_test, verbose=0)[1]}')

    df = pd.DataFrame(history.history).rename_axis('epoch').reset_index().melt(id_vars=['epoch'])
    fig, axes = plt.subplots(1, 2, figsize=(18, 6))
    for ax, mtr in zip(axes.flat, ['loss', 'accuracy']):
        ax.set_title(f'{approach} {mtr.title()} Plot')
        dfTmp = df[df['variable'].str.contains(mtr)]
        sns.lineplot(data=dfTmp, x='epoch', y='value', hue='variable', ax=ax)
    fig.tight_layout()
    plt.show()
Output (benchmark on Colab):
Training time Quasi_SVM : 43.78484082221985 sec
Trainable params Quasi_SVM : 40970
Accuracy on x_test Quasi_SVM : 0.9729999899864197
Training time kernel_regularizer : 45.47012114524841 sec
Trainable params kernel_regularizer : 111514
Accuracy on x_test kernel_regularizer : 0.972100019454956
Upvotes: 1
Reputation: 74
I think it's just an approximation of an SVM model, because a pure SVM rests on computing the support vectors via primal-dual optimization and using those support vectors to draw the maximum-margin hyperplane. Neural networks, and frameworks like Keras (TensorFlow in general), instead mostly use gradient-descent optimization to find the optimal parameters. Besides, I think the number of parameters that have to be optimized in a pure SVM differs from that of a neural network like the ones you wrote in the question.
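As a minimal illustration of the difference, assuming scikit-learn is available (the synthetic dataset here is only a stand-in for the 7-feature, 4-class data in the question):
# A "pure" SVM solves a constrained optimization problem (SMO, a primal-dual
# method) and exposes explicit support vectors; a Dense layer trained with a
# hinge loss by gradient descent has no such notion.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=7, n_informative=5,
                           n_classes=4, random_state=0)

svm = SVC(kernel='rbf').fit(X, y)   # exact kernel SVM, not an approximation
print(svm.support_vectors_.shape)   # the support vectors are explicit here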
Upvotes: 1