Graphics Engineer

Reputation: 105

What is the proper way of using featuretools for single table data?

Assume that I have a dataset consisting of a single table, for instance the Titanic dataset on Kaggle.

What is the proper way to use Featuretools to get the most benefit from it, given that Featuretools is designed for relational data?

By 'proper' I mean: I know that when creating the entity set, the index parameter will just be the index of the dataset, but what should my new index be when normalizing the entity? Also, is it okay to use RFE blindly for feature selection?

Upvotes: 1

Views: 282

Answers (1)

Jeff Hernandez

Reputation: 2123

You can get the most benefit from Featuretools by normalizing the entity set. The more normalized an entity set is, the better DFS (Deep Feature Synthesis) can leverage the relational structure to generate features.

The objective of the normalization process is to eliminate redundant data. So the new index, together with any additional variables, should be one that helps toward this objective, typically a column whose values repeat across many rows. This guide goes into more depth on creating an entity from a de-normalized table.

For feature selection, I think RFE can be used judiciously, with the objectives of improving the accuracy and reducing the complexity of a model.
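One way to read "judiciously" is to check with cross-validation whether the selection actually helps, rather than trusting it blindly. The sketch below uses scikit-learn's `RFE` on synthetic data from `make_classification`; the estimator, the number of features to keep, and the dataset are all assumptions for illustration, and in practice `X` would be the feature matrix produced by DFS.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a DFS feature matrix (assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=5, random_state=0
)

# Baseline: the model trained on all features.
baseline = LogisticRegression(max_iter=1000)

# RFE placed inside a pipeline so that selection is refit on each
# training fold and never sees the corresponding validation fold.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
pipeline = make_pipeline(selector, LogisticRegression(max_iter=1000))

baseline_score = cross_val_score(baseline, X, y, cv=5).mean()
selected_score = cross_val_score(pipeline, X, y, cv=5).mean()
print(f"all features: {baseline_score:.3f}, after RFE: {selected_score:.3f}")

# Fit once on the full data to inspect which columns were kept.
selector.fit(X, y)
print("kept feature indices:", [i for i, keep in enumerate(selector.support_) if keep])
```

If the score after RFE is no better than the baseline, the reduced model may still be preferable for its simplicity, but that is a trade-off to judge per problem.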

Upvotes: 2
