I have a question about make_classification from scikit-learn. I created a dataset with make_classification to test how well different models can distinguish important features from less important ones.
To do that, I need to know upfront which of the generated features are more important and which are less important. If possible, I would also like to control which features are the important ones. I have set the following:
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=50000, n_features=10, n_informative=5,
                           n_redundant=2, n_repeated=0, n_classes=2,
                           n_clusters_per_class=2, class_sep=1, flip_y=0.01,
                           weights=[0.9, 0.1], shuffle=True, random_state=42)
The make_classification documentation describes the weights and scale parameters, but neither seems suited to knowing or shaping the importance of features (weights sets the class proportions, and scale multiplies the feature values).
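For context, the documentation does state that with shuffle=False the columns of X are stacked in a fixed order: the n_informative features first, then the n_redundant linear combinations of them, then the n_repeated duplicates, and finally pure noise. The sketch below is only an illustration of how that ordering could be used to know the feature roles upfront; the index names (informative_idx, redundant_idx, noise_idx) and the manual shuffling step are my own assumptions, not part of the scikit-learn API, and it does not rank the informative features against each other.

```python
import numpy as np
from sklearn.datasets import make_classification

n_informative, n_redundant, n_repeated = 5, 2, 0

X, y = make_classification(
    n_samples=50000, n_features=10,
    n_informative=n_informative, n_redundant=n_redundant, n_repeated=n_repeated,
    n_classes=2, n_clusters_per_class=2, class_sep=1,
    flip_y=0.01, weights=[0.9, 0.1],
    shuffle=False,  # keep the documented column order (rows stay unshuffled too)
    random_state=42,
)

# Column roles implied by the documented ordering (names are illustrative).
informative_idx = np.arange(0, n_informative)
redundant_idx = np.arange(n_informative, n_informative + n_redundant)
noise_idx = np.arange(n_informative + n_redundant + n_repeated, X.shape[1])

# Sanity check: redundant columns should be linear combinations of the
# informative ones, so a least-squares fit leaves essentially no residual.
coef, *_ = np.linalg.lstsq(X[:, informative_idx], X[:, redundant_idx], rcond=None)
max_residual = np.abs(X[:, informative_idx] @ coef - X[:, redundant_idx]).max()

# Shuffle rows and columns manually while keeping track of the permutation,
# so the role of every shuffled column is still known.
rng = np.random.default_rng(0)
row_perm = rng.permutation(X.shape[0])
col_perm = rng.permutation(X.shape[1])
X_shuffled = X[row_perm][:, col_perm]
y_shuffled = y[row_perm]

# Column j of X_shuffled is original column col_perm[j].
is_informative = np.isin(col_perm, informative_idx)
is_redundant = np.isin(col_perm, redundant_idx)
is_noise = np.isin(col_perm, noise_idx)
```

With these masks, a model's reported importances on X_shuffled can be compared against the known roles. Which of the five informative features matters most is still not fixed by any parameter here.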
My question is not about how to determine feature importance when using a specific model or different models.
My questions are:
Follow-up question:
Thank you, any ideas or advice are highly appreciated.