newToML

Reputation: 11

Correlation coefficients or feature importances from a classification or regression model

I have created some sample data for machine learning, just to check out how classification and regression models work.

My sample data has 50 rows with columns for Memory, CPU, Responsetime. I have generated Responsetime using a formula Memory*2 + CPU*0.7.

When I use this data to train classification models with different algorithms such as DecisionTree, RandomForest, SVM, NaiveBayes, SGD, and LogisticRegression, I get back kappa and the correlation coefficients (model.coef_) from the model, and feature importances in the case of decision tree and random forest.

The coefficient values returned for Memory and CPU are nowhere near the ones in the formula I used to generate the response time. I cannot tell whether the models I generated are right to use for prediction in this case.

For regression, Linear Regression did give me the right coefficients, matching my formula.

Upvotes: 0

Views: 907

Answers (1)

Brian Ecker

Reputation: 2077

You gave a linear formula, Memory*2 + CPU*0.7, and linear regression, a method that learns the B_j values in y_i = B_0 + B_1*X_i1 + ... + B_n*X_in, modelled it with the coefficients you would expect. That's because the form of the linear regression model matches the form of your equation, so the fitted coefficients can be compared directly with the ones in your formula.
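To make that concrete, here is a minimal sketch that fits ordinary least squares to data generated from the same formula. It is written in plain Python so it runs without any library. The value ranges for Memory and CPU, the random seed, and the helper names (solve_linear_system, fit_linear_regression) are assumptions made for illustration, since the question does not give them. scikit-learn's LinearRegression exposes the same quantities as intercept_ and coef_.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [B_0, B_1, ..., B_n], where B_0 is the intercept.
    """
    design = [[1.0] + list(r) for r in rows]  # prepend the constant column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical sample: 50 rows, ranges chosen only for illustration.
random.seed(0)
memory = [random.uniform(1, 16) for _ in range(50)]
cpu = [random.uniform(0, 100) for _ in range(50)]
response_time = [m * 2 + c * 0.7 for m, c in zip(memory, cpu)]

intercept, coef_memory, coef_cpu = fit_linear_regression(
    list(zip(memory, cpu)), response_time
)
print(intercept, coef_memory, coef_cpu)
```

Because the target is an exact linear function of the two features, the fitted values come out at roughly 0 for the intercept, 2 for Memory and 0.7 for CPU, up to floating-point error.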

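For your classification algorithms, not only does the form of the model not match your linear equation, but the problem is also not really a classification problem. Responsetime is a continuous value, so the example you have given is distinctly a regression problem. Feature importances from tree-based models are relative scores rather than coefficients, so they would not be expected to reproduce 2 and 0.7 either.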
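One way to see why classifier coefficients need not match the formula: once the response time is reduced to class labels, the labels no longer pin down the scale of the original coefficients. The sketch below assumes the labels were produced by thresholding Responsetime at its median, which the question does not state. The data ranges, the seed, and the helper name make_labels are likewise hypothetical.

```python
import random

# Hypothetical sample: 50 rows, ranges chosen only for illustration.
random.seed(0)
memory = [random.uniform(1, 16) for _ in range(50)]
cpu = [random.uniform(0, 100) for _ in range(50)]


def make_labels(w_memory, w_cpu, threshold):
    """Label a row 1 when its weighted sum exceeds the threshold, else 0."""
    return [int(w_memory * m + w_cpu * c > threshold) for m, c in zip(memory, cpu)]


# Assumed labelling rule: "slow" means above the median response time.
response_time = [2.0 * m + 0.7 * c for m, c in zip(memory, cpu)]
threshold = sorted(response_time)[len(response_time) // 2]

labels_original = make_labels(2.0, 0.7, threshold)

# Scale both weights and the threshold by the same positive factor.
scale = 4.0
labels_scaled = make_labels(scale * 2.0, scale * 0.7, scale * threshold)

# Both formulas describe exactly the same classes.
print(labels_original == labels_scaled)
```

Both sets of weights produce the same labels, so a classifier trained on those labels has no way to tell 2 and 0.7 apart from 8 and 2.8. Whatever coefficients it reports reflect its own loss function and regularisation, not the generating formula.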

Upvotes: 1
