Reputation: 568
I have a regression model. I wrote code for the following algorithm:
Create 10 random splits of the training data into training and validation data. Choose the best value of alpha from the following set: {0.1, 1, 3, 10, 33, 100, 333, 1000, 3333, 10000, 33333}.
To choose the best alpha hyperparameter value, you have to do the following:
• For each value of the hyperparameter, perform 10 random splits of the training data into training and validation data, as described above.
• For each value of hyperparameter, use its 10 random splits and find the average training and validation accuracy.
• On a graph, plot both the average training accuracy (in red) and average validation accuracy (in blue) w.r.t. each hyperparameter setting. Comment on this graph by identifying regions of overfitting and underfitting.
• Print the best value of alpha hyperparameter.
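The selection procedure in the bullets above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the code from the question: `fit_and_score` is a hypothetical callback that would fit your model with a given alpha on the training indices and return `(train_accuracy, validation_accuracy)`, and the 20% validation fraction and the seed are assumptions. The sketch also reuses the same 10 splits for every alpha so the averages are comparable, which is a design choice rather than something the task requires.

```python
import random
from statistics import mean

ALPHAS = [0.1, 1, 3, 10, 33, 100, 333, 1000, 3333, 10000, 33333]

def random_split(n_samples, val_fraction, rng):
    """Shuffle the indices 0..n_samples-1 and cut them into train/validation."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_val = max(1, int(round(n_samples * val_fraction)))
    return indices[n_val:], indices[:n_val]

def average_accuracies(alphas, n_samples, fit_and_score,
                       n_splits=10, val_fraction=0.2, seed=0):
    """Return {alpha: (mean train accuracy, mean validation accuracy)}."""
    rng = random.Random(seed)
    # Draw the splits once so that every alpha is scored on the same data.
    splits = [random_split(n_samples, val_fraction, rng) for _ in range(n_splits)]
    results = {}
    for alpha in alphas:
        # fit_and_score is a hypothetical callback wrapping your own model.
        scores = [fit_and_score(alpha, train_idx, val_idx)
                  for train_idx, val_idx in splits]
        results[alpha] = (mean(s[0] for s in scores), mean(s[1] for s in scores))
    return results

def best_alpha(results):
    """Pick the alpha with the highest mean validation accuracy."""
    return max(results, key=lambda alpha: results[alpha][1])
```

The dictionary returned by `average_accuracies` holds the two curves to plot (training in red, validation in blue), and `best_alpha` picks the value to print.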
2- Evaluate the prediction performance on test data and report the following:
• Total number of non-zero features in the final model.
• The confusion matrix.
• Precision, recall and accuracy for each class.
Finally, discuss whether there is any sign of underfitting or overfitting, with appropriate reasoning.
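To make the requested test-set quantities concrete, here is a small plain-Python sketch of how they relate to each other. The function names are made up for illustration, and the rows-are-true, columns-are-predicted layout is a convention chosen here (the same one scikit-learn's `confusion_matrix` uses). Accuracy is computed for the whole test set, since "accuracy for each class" is not defined more precisely in the task.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def per_class_report(matrix, labels):
    """Precision and recall for each class, read off the confusion matrix."""
    report = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        report[label] = {
            "precision": true_positives / predicted if predicted else 0.0,
            "recall": true_positives / actual if actual else 0.0,
        }
    return report

def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def count_nonzero(coefficients, tol=0.0):
    """Number of model weights whose magnitude exceeds tol."""
    return sum(1 for w in coefficients if abs(w) > tol)
```

For a fitted linear model, `count_nonzero` would be applied to the flattened list of learned weights; how you obtain that list depends on the library you use.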
I wrote this code:
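from sklearn.metrics import classification_report

# Mean accuracy of the fitted classifier on the held-out test set
print('Accuracy of logistic regression classifier on test set: {:.2f}'.format(Newclassifier.score(X_test, y_test)))
# Per-class precision, recall and F1; y_pred is assumed to hold Newclassifier.predict(X_test)
print(classification_report(y_test, y_pred))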
My questions are: 1- Why does the accuracy decrease in each iteration? 2- Is my model overfitting or underfitting? 3- Does my model work correctly?
Upvotes: 0
Views: 2938
Reputation: 29099
There is no official or absolute metric for deciding whether you are underfitting, overfitting, or neither. In practice
In your case, your training and testing errors seem to move in parallel, so you don't seem to have a problem with overfitting. Your model could be underfitting, so you could try a more complex model. However, it is possible that this is as good as this algorithm can get on this particular training set. In most real problems, no algorithm can get to zero error.
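As a rough illustration of that reasoning, the comparison can be written as a rule of thumb. This is only a sketch: the function name and both thresholds (`max_gap`, `min_acc`) are arbitrary assumptions that you would have to choose for your own problem, precisely because no absolute metric exists.

```python
def diagnose(train_acc, val_acc, max_gap=0.05, min_acc=0.80):
    """Heuristic reading of a (training, validation) accuracy pair.

    max_gap and min_acc are illustrative thresholds, not standard values.
    """
    gap = train_acc - val_acc
    if gap > max_gap:
        # Training is much better than validation: the model may be memorising.
        return "possible overfitting"
    if train_acc < min_acc:
        # Both scores are close together but low: the model may be too simple.
        return "possible underfitting"
    return "no clear sign of either"
```

For example, `diagnose(0.99, 0.80)` returns `"possible overfitting"`, while `diagnose(0.70, 0.69)` returns `"possible underfitting"`.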
As to why your error increases, I don't know how this particular algorithm works, but since it seems to rely on random methods, this looks like reasonable behavior. The error goes up and down a bit, but it does not steadily increase, so it doesn't seem problematic.
Upvotes: 2