Reputation: 742
I am trying a simple multinomial logistic regression using Keras, but the results are quite different from those of the standard scikit-learn approach.
For example, with the iris data:
import numpy as np
import pandas as pd
df = pd.read_csv("./data/iris.data", header=None)
from sklearn.model_selection import train_test_split
df_train, df_test = train_test_split(df, test_size=0.3, random_state=52)
X_train = df_train.drop(4, axis=1)
y_train = df_train[4]
X_test = df_test.drop(4, axis=1)
y_test = df_test[4]
Using scikit-learn:
from sklearn.linear_model import LogisticRegression
scikit_model = LogisticRegression(multi_class='multinomial', solver='saga', max_iter=500)
scikit_model.fit(X_train, y_train)
The weighted-average F1-score on the test set:
y_test_pred = scikit_model.predict(X_test)
from sklearn.metrics import classification_report
# labels must be passed by keyword in recent scikit-learn versions
print(classification_report(y_test, y_test_pred, labels=scikit_model.classes_))
is 0.96.
Then with Keras:
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
# first, encode the class labels as integers, then one-hot encode them
encoder = LabelEncoder()
encoder.fit(y_train)
y_train_encoded = encoder.transform(y_train)
Y_train = to_categorical(y_train_encoded)
y_test_encoded = encoder.transform(y_test)
Y_test = to_categorical(y_test_encoded)
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.regularizers import l2
# model construction
input_dim = 4  # 4 variables
output_dim = 3  # 3 possible outputs
def classification_model():
    # a single softmax layer is equivalent to multinomial logistic regression
    model = Sequential()
    model.add(Dense(output_dim, input_dim=input_dim, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    return model
# training
keras_model = classification_model()
keras_model.fit(X_train, Y_train, epochs=500, verbose=0)
The weighted-average F1-score on the test set:
# pick the most probable class and map it back to the original labels
classes = np.argmax(keras_model.predict(X_test), axis=1)
y_test_pred = encoder.inverse_transform(classes)
from sklearn.metrics import classification_report
print(classification_report(y_test, y_test_pred, labels=encoder.classes_))
is 0.89.
Is it possible to perform identical (or at least as much as possible) logistic regression with Keras as with scikit-learn?
Upvotes: 5
Views: 2412
Reputation: 720
One obvious difference is that saga (a variant of SAG) is used in LogisticRegression, while SGD is used in your NN. As far as I know, LogisticRegression doesn't support SGD. Alternatively, you can use SGDClassifier (or SGDRegressor, its counterpart for regression problems) instead of LogisticRegression. And here is a blog discussing the differences between them.
Upvotes: 3
Reputation: 2017
I tried running your example and noticed a couple of potential sources of the discrepancy:
Using your code, I got accuracy ranging from 0.89 to 0.96 when running SGD with the learning rate set to 0.05. When switching to Adam (also with this fairly high learning rate), I got more stable results, ranging from 0.92 to 0.96 (although this is more of an impression, as I didn't run many trials).
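The comparison described above can be sketched roughly as follows. This is not the exact code I ran: it assumes the iris copy bundled with scikit-learn, uses sparse_categorical_crossentropy so that integer labels can be passed without one-hot encoding, fixes the seed with keras.utils.set_random_seed (TensorFlow 2.7 or later), and trains for 200 epochs to keep the run short. The helper build_model is a name introduced only for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tensorflow import keras

keras.utils.set_random_seed(0)  # assumption: TensorFlow >= 2.7

# Assumption: the bundled iris dataset stands in for ./data/iris.data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=52
)


def build_model(optimizer):
    """Single softmax layer, i.e. multinomial logistic regression."""
    model = keras.Sequential([
        keras.Input(shape=(4,)),
        keras.layers.Dense(3, activation="softmax"),
    ])
    # The sparse loss accepts integer class labels directly.
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=optimizer,
        metrics=["accuracy"],
    )
    return model


# Same learning rate for both optimizers, set explicitly
# instead of relying on the defaults behind optimizer='sgd'.
optimizers = {
    "sgd": keras.optimizers.SGD(learning_rate=0.05),
    "adam": keras.optimizers.Adam(learning_rate=0.05),
}

accuracies = {}
for name, optimizer in optimizers.items():
    model = build_model(optimizer)
    model.fit(X_train, y_train, epochs=200, verbose=0)
    _, accuracies[name] = model.evaluate(X_test, y_test, verbose=0)
    print(name, round(accuracies[name], 2))
```

Running the loop with several different seeds gives a better sense of the spread than any single run.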
Upvotes: 3