How to apply featuretools to output of featuretools?

I want to create complex features such as (a-b)/c or (a-b)/a.
This could be achieved by running Featuretools multiple times: the first run creates features like a-b, a+b, or a/b, and the next run builds more complex features on top of them. I tried to do this with the following code:

import featuretools as ft

def multi_level_feature_creation(X, trans_primitives_per_level):
    feature_matrix = X

    # Each level feeds the previous level's feature matrix into a fresh EntitySet.
    for i, trans_primitives in enumerate(trans_primitives_per_level):
        print("Level: ", i)
        print("Columns: ", feature_matrix.columns)

        es = ft.EntitySet(id="dataset")
        dataframe_name = "data" + str(i)

        es = es.add_dataframe(
            dataframe_name=dataframe_name,
            dataframe=feature_matrix,
            index="index" + str(i),
        )

        feature_matrix, feature_defs = ft.dfs(
            entityset=es,
            target_dataframe_name=dataframe_name,
            trans_primitives=trans_primitives,
        )

    return feature_matrix, feature_defs


X = df.drop(["target"], axis=1)
y = df["target"]

# One list of transform primitives per level.
features_per_level = [
    ['add_numeric', 'multiply_numeric', 'subtract_numeric', 'divide_numeric', 'multiply_numeric_scalar'],
    ['add_numeric', 'multiply_numeric', 'subtract_numeric', 'divide_numeric', 'multiply_numeric_scalar'],
    # ['add_numeric', 'multiply_numeric', 'subtract_numeric', 'divide_numeric', 'multiply_numeric_scalar'],
]

feature_matrix, feature_defs = multi_level_feature_creation(X, features_per_level)

print(type(feature_matrix))
feature_matrix.head()

With a single level it works fine. The error occurs when I run it with more than one level:

ValueError: Cannot add a Woodwork DataFrame to EntitySet without a name

How can I handle this?

Upvotes: 0

Answers (2)

Lan Si

Reputation: 100

I had the same question.

Two methods:

1. Apply the answer from "deep feature synthesis depth for transformation primitives | featuretools". Basically: create another ft.EntitySet(), then use the feature_matrix from the previous step as the input dataframe.

2. Just use pandas to manipulate fm, which is probably easier.
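
For the pandas route, a minimal sketch of building the two ratios from the question by hand might look like this. The toy data and the helper name add_ratio_features are made up for illustration; only the column names a, b, and c come from the question.

```python
import pandas as pd

# Hypothetical toy data; the column names a, b, c mirror the question.
df = pd.DataFrame(
    {
        "a": [4.0, 9.0, 6.0],
        "b": [1.0, 3.0, 2.0],
        "c": [3.0, 2.0, 4.0],
    }
)


def add_ratio_features(frame):
    """Return a copy of frame with (a - b) / c and (a - b) / a added."""
    out = frame.copy()
    a_minus_b = out["a"] - out["b"]  # first-level feature
    out["(a - b) / c"] = a_minus_b / out["c"]  # second-level features
    out["(a - b) / a"] = a_minus_b / out["a"]
    return out


fm = add_ratio_features(df)
print(fm)
```

The same pattern can be applied to a feature matrix returned by ft.dfs, since it is a pandas DataFrame. Note that pandas does not raise on division by zero for float columns; it produces inf or NaN, so those values may need handling afterwards.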

Upvotes: 0

sbadithe

Reputation: 41

Thank you for your question.

It sounds like the goal is to create complex features. These can be generated in a single run of dfs. Stacking TransformPrimitives on top of each other is not permitted in Featuretools. However, seed features can be used to generate the desired features. See the Featuretools documentation on seed features for details.

Here is an example call to dfs:

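import featuretools as ft

# df is assumed to be a pandas DataFrame with numeric columns a, b, and c.
es = ft.EntitySet(id="test")
es = es.add_dataframe(dataframe=df, dataframe_name="df", index="idx", make_index=True)

# Build the desired features by hand so they can be passed to dfs as seed features.
a_minus_b = ft.TransformFeature([es['df'].ww['a'], es['df'].ww['b']], primitive=ft.primitives.SubtractNumeric)
a_minus_b_over_c = ft.TransformFeature([a_minus_b, es['df'].ww['c']], primitive=ft.primitives.DivideNumeric)
a_minus_b_over_a = ft.TransformFeature([a_minus_b, es['df'].ww['a']], primitive=ft.primitives.DivideNumeric)

# trans_primitives was left undefined in the original snippet; this list is an
# assumption taken from the primitives used in the question.
trans_primitives = ["add_numeric", "subtract_numeric", "divide_numeric"]

fm, fd = ft.dfs(entityset=es, target_dataframe_name="df", trans_primitives=trans_primitives, seed_features=[a_minus_b_over_c, a_minus_b_over_a])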

Please let me know if this answers your question.

Upvotes: 1
