Rui Figueiredo

Reputation: 110

Is it possible to skip the processing of one column?

I want to keep one column of my dataframe in its original state, without applying any primitive to it. Is that possible?

Upvotes: 1

Views: 392

Answers (1)

Max Kanter

Reputation: 2014

Yes, you can do this with the ignore_variables parameter to ft.dfs. Here's an example on a demo entity set.

import featuretools as ft
es = ft.demo.load_mock_customer(return_entityset=True)
es.plot()

[Figure: example entity set]

If we want to build features for the sessions entity but ignore the device variable, we can run

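feature_defs = ft.dfs(target_entity="sessions",
                      entityset=es,
                      agg_primitives=["count", "mode"],
                      trans_primitives=[],
                      # skip the device variable of the sessions entity
                      ignore_variables={"sessions": ["device"]},
                      # return feature definitions only, without computing values
                      features_only=True)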

feature_defs has the following features

[<Feature: customer_id>,
 <Feature: COUNT(transactions)>,
 <Feature: MODE(transactions.product_id)>,
 <Feature: customers.zip_code>,
 <Feature: MODE(transactions.products.brand)>,
 <Feature: customers.COUNT(sessions)>,
 <Feature: customers.COUNT(transactions)>,
 <Feature: customers.MODE(transactions.product_id)>]

This creates features using the count and mode primitives, but ignores the device variable in the sessions entity. If we want to include the device variable in its original state, we can add it back in like this

feature_defs += [ft.Feature(es["sessions"]["device"])]

Now we can calculate the feature matrix. The device column is now at the end

fm = ft.calculate_feature_matrix(features=feature_defs, entityset=es)
fm

            customer_id  COUNT(transactions)  MODE(transactions.product_id) customers.zip_code   ...    customers.COUNT(sessions)  customers.COUNT(transactions)  customers.MODE(transactions.product_id)   device
session_id                                                                                       ...                                                                                                              
1                     2                   16                              3              13244   ...                            7                             93                                        4  desktop
2                     5                   10                              5              60091   ...                            6                             79                                        5   mobile
3                     4                   15                              1              60091   ...                            8                            109                                        2   mobile
4                     1                   25                              5              60091   ...                            8                            126                                        4   mobile
5                     4                   11                              5              60091   ...                            8                            109                                        2   mobile
6                     1                   15                              4              60091   ...                            8                            126                                        4   tablet
7                     3                   15                              1              13244   ...                            6                             93                                        1   tablet
8                     4                   18                              1              60091   ...                            8                            109                                        2   tablet
9                     1                   15                              1              60091   ...                            8                            126                                        4  desktop
10                    2                   15                              2              13244   ...                            7                             93                                        4   tablet
11                    4                   15                              3              60091   ...                            8                            109                                        2   mobile
12                    4                   10                              4              60091   ...                            8                            109                                        2  desktop
13                    4                   12                              2              60091   ...                            8                            109                                        2   mobile
14                    1                   12                              4              60091   ...                            8                            126                                        4   tablet
15                    2                    8                              2              13244   ...                            7                             93                                        4  desktop
16                    2                   10                              4              13244   ...                            7                             93                                        4  desktop
17                    2                   13                              1              13244   ...                            7                             93                                        4   tablet
18                    1                   12                              2              60091   ...                            8                            126                                        4  desktop
19                    3                   17                              1              13244   ...                            6                             93                                        1  desktop
20                    5                   15                              1              60091   ...                            6                             79                                        5  desktop
21                    4                   18                              5              60091   ...                            8                            109                                        2  desktop
22                    4                   10                              2              60091   ...                            8                            109                                        2  desktop
23                    3                   11                              3              13244   ...                            6                             93                                        1  desktop
24                    5                   14                              4              60091   ...                            6                             79                                        5   tablet
25                    3                   16                              1              13244   ...                            6                             93                                        1  desktop
26                    1                   16                              1              60091   ...                            8                            126                                        4   tablet
27                    1                   15                              5              60091   ...                            8                            126                                        4   mobile
28                    5                   18                              2              60091   ...                            6                             79                                        5   mobile
29                    1                   16                              4              60091   ...                            8                            126                                        4   mobile
30                    5                   14                              3              60091   ...                            6                             79                                        5  desktop
31                    2                   18                              3              13244   ...                            7                             93                                        4   mobile
32                    5                    8                              3              60091   ...                            6                             79                                        5   mobile
33                    2                   13                              3              13244   ...                            7                             93                                        4   mobile
34                    3                   18                              4              13244   ...                            6                             93                                        1  desktop
35                    3                   16                              5              13244   ...                            6                             93                                        1   mobile

As a sanity check, this is what the output is if we don't use ignore_variables

feature_defs = ft.dfs(target_entity="sessions",
                      entityset=es, 
                      agg_primitives=["count", "mode"],
                      trans_primitives=[],
                      features_only=True)

You can see that the features <Feature: device> and <Feature: customers.MODE(sessions.device)> get created now

[<Feature: customer_id>,
 <Feature: device>,
 <Feature: COUNT(transactions)>,
 <Feature: MODE(transactions.product_id)>,
 <Feature: customers.zip_code>,
 <Feature: MODE(transactions.products.brand)>,
 <Feature: customers.COUNT(sessions)>,
 <Feature: customers.MODE(sessions.device)>,
 <Feature: customers.COUNT(transactions)>,
 <Feature: customers.MODE(transactions.product_id)>]
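
To make the difference between the two runs explicit, the feature names can be compared as plain strings. This is only a minimal sketch: the two lists are copied by hand from the outputs shown above, and the variable names (with_ignore, without_ignore, only_without) are made up for illustration. With real feature definitions, the names could probably be collected with something like [f.get_name() for f in feature_defs] instead.

```python
# Feature names copied from the two listings in this answer
# (hand-copied for illustration, not generated by featuretools here).
with_ignore = [
    "customer_id",
    "COUNT(transactions)",
    "MODE(transactions.product_id)",
    "customers.zip_code",
    "MODE(transactions.products.brand)",
    "customers.COUNT(sessions)",
    "customers.COUNT(transactions)",
    "customers.MODE(transactions.product_id)",
]

without_ignore = [
    "customer_id",
    "device",
    "COUNT(transactions)",
    "MODE(transactions.product_id)",
    "customers.zip_code",
    "MODE(transactions.products.brand)",
    "customers.COUNT(sessions)",
    "customers.MODE(sessions.device)",
    "customers.COUNT(transactions)",
    "customers.MODE(transactions.product_id)",
]

# Features that only appear when ignore_variables is NOT used
only_without = [name for name in without_ignore if name not in with_ignore]
print(only_without)
# ['device', 'customers.MODE(sessions.device)']
```

Both extra features involve the device variable, which is what we expect ignore_variables to remove.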

Upvotes: 3
