Tom

Reputation: 1063

Keras Sequential: advantage of creating TensorFlow numeric_columns?

I am learning to build neural networks with Keras and working through various tutorials. In one, the model is built by creating a tf.feature_column.numeric_column for each feature and passing those to a DenseFeatures layer at the start of the Keras Sequential model (in this example feat_cols is the list of feature column names):

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, DenseFeatures

feature_columns = {c: tf.feature_column.numeric_column(c) for c in feat_cols}
model = Sequential([DenseFeatures(feature_columns=feature_columns.values()),
                    Dense(units=32, activation='relu'),
                    Dense(units=8, activation='relu'),
                    Dense(units=1, activation='linear')])

In another tutorial, the input is taken straight from a pandas dataframe converted into a numpy array with .values. The dictionary of tensors is never created, and the model has no DenseFeatures layer at the start. (In this case df is the dataframe, features is a list of feature column names and lbl is the target column.)

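from sklearn.model_selection import train_test_split
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

x = df[features].values
y = df[lbl].values

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.05)

# input_shape=(6,) assumes features holds six column names
model = Sequential([
    Dense(10, input_shape=(6,), activation='relu'),
    Dense(20, activation='relu'),
    Dense(5, activation='relu'),
    Dense(1)])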

In this case when model.fit is called, just x_train and y_train are passed instead of the tensor dict in the first example.

My question is: what are the advantages or disadvantages (if any) of these two approaches? Are they two ways of getting to the same place, or is there an actual difference?

Upvotes: 1

Views: 112

Answers (1)

Jason Chia

Reputation: 1145

Note that the two Sequential models you posted are not equivalent as networks, since their hidden layers differ. If you consider only the input side, though, they are essentially the same: both are valid ways to pass your data into the net. In practice, dataframes are the more common data source, while tensors are slightly easier to handle within TensorFlow. With the Keras API there should be no performance difference either way. See tensorflow.org/tutorials/load_data/pandas_dataframe
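To make the "same input" point concrete, here is a minimal sketch using only pandas and NumPy (no TensorFlow). The toy dataframe and its column names are made up for illustration. It builds the dict-of-columns form that a feature-column layer is fed and the plain matrix from .values, then checks that they carry the same numbers. Column order inside a feature-column layer may differ from the order in features, so treat this as a sketch of the idea rather than of the layer's internals.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "target": [0.5, 1.5, 2.5],
})
features = ["a", "b"]
lbl = "target"

# Second approach from the question: one matrix of shape (n_rows, n_features).
x_matrix = df[features].values
y = df[lbl].values

# First approach's input form: a dict mapping each feature name to a 1-D array.
x_dict = {name: df[name].values for name in features}

# Stacking the dict entries column-wise rebuilds the same matrix.
x_rebuilt = np.column_stack([x_dict[name] for name in features])

same_values = np.array_equal(x_matrix, x_rebuilt)
print(x_matrix.shape, same_values)
```

Either form holds the same per-row feature values, so the choice mostly comes down to whether you want to address inputs by name (the dict) or by position (the matrix).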

Upvotes: 1
