Reputation: 4577
Are the feature vectors generated by featuretools/DFS dense or sparse or does it depend on something?
Upvotes: 2
Views: 374
Reputation: 404
The sparseness of feature vectors generated by Featuretools will in general depend on the EntitySet in question and the primitives used. Primitives are meant to give back dense information. While it's possible (but not helpful) to construct example EntitySets that make the output of a primitive sparse, it's more common for a primitive to give back no information than sparse information.
However, certain primitives and workflows are more likely than others to give back sparse output. A big one to worry about is feature encoding, which uses one-hot encoding. Because that generates a vector with 1s only where a certain value occurs, an infrequently occurring categorical value is immediately converted into a sparse vector. Using Where aggregation primitives can sometimes have similar results.
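
To make the one-hot point concrete, here is a minimal sketch using plain pandas rather than Featuretools itself. `pd.get_dummies` stands in for the encoding step, and the column name and values are made up for illustration:

```python
import pandas as pd

# Hypothetical categorical feature: "rare" appears in only 1 of 10 rows.
df = pd.DataFrame({"category": ["common"] * 9 + ["rare"]})

# One-hot encode with pandas (a stand-in for Featuretools' feature encoding).
encoded = pd.get_dummies(df["category"], dtype=int)

# Fraction of zero entries in each encoded column.
zero_fraction = (encoded == 0).mean()
print(zero_fraction)
```

The column for the rare value is almost entirely zeros, while the column for the common value is almost entirely ones, so the sparsity comes from how often the value occurs in the data, not from the encoding alone.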
Upvotes: 3