NeoZoom.lua

Reputation: 2921

TensorFlow 2 Pandas Tutorial Preprocessing Confusion

I'm following one of TensorFlow's tutorials and I don't understand the purpose of the following code:

numeric_inputs = {name:input for name,input in inputs.items()
                  if input.dtype==tf.float32}

x = layers.Concatenate()(list(numeric_inputs.values()))
norm = layers.Normalization()
norm.adapt(np.array(titanic[numeric_inputs.keys()]))
all_numeric_inputs = norm(x)

all_numeric_inputs

This is my current understanding:

  1. numeric_inputs is a map of filtered results
  2. I don't understand why layers.Concatenate() is used here, even after reading the docs.
  3. I understand that norm.adapt() needs to be called before using norm as a (starting) layer of the Sequential model (from the section above). But why call it with x here, i.e. norm(x)?

Btw, any advice on learning from the official tutorials? I find that some of them are still too vague for me.

Upvotes: 1

Views: 65

Answers (1)

elbe

Reputation: 1508

The code you provided comes from the Mixed data types section of the tutorial.
To answer your questions, here is the relevant code from the tutorial:

numeric_inputs = {name:input for name,input in inputs.items()
                  if input.dtype==tf.float32}

x = layers.Concatenate()(list(numeric_inputs.values()))
norm = layers.Normalization()
norm.adapt(np.array(titanic[numeric_inputs.keys()]))
all_numeric_inputs = norm(x)

all_numeric_inputs

where titanic is a pandas DataFrame:

titanic = pd.read_csv("https://storage.googleapis.com/tf-datasets/titanic/train.csv")
titanic_features = titanic.copy()

inputs = {}

for name, column in titanic_features.items():
  dtype = column.dtype
  if dtype == object:
    dtype = tf.string
  else:
    dtype = tf.float32

  inputs[name] = tf.keras.Input(shape=(1,), name=name, dtype=dtype)
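
For completeness, running these snippets outside the tutorial notebook also needs its imports, which are roughly:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers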

Back to your questions:

  1. numeric_inputs is a filtered dictionary that keeps only the numeric symbolic tensors (of type tf.float32),
  2. the Concatenate layer combines the values of the filtered dictionary (first converted into a list) into a single tensor,
  3. since the functional API will be used to define the model, the Normalization layer (instantiated here as norm) is first adapted, i.e. its mean and variance are computed from the data, and is then called on x, the output of the Concatenate layer, which wires the normalization step into the symbolic graph (see the sketch below).
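
To make points 2 and 3 concrete, here is a minimal standalone sketch (not taken from the tutorial; the two inputs and the adapt data are made up for illustration). It shows Concatenate merging several (None, 1) symbolic tensors into one (None, 2) tensor, and Normalization being adapted on concrete data before being applied to the symbolic tensor x:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Two symbolic float32 inputs, analogous to entries of the tutorial's `inputs` dict.
numeric_inputs = {
    'age':  tf.keras.Input(shape=(1,), name='age',  dtype=tf.float32),
    'fare': tf.keras.Input(shape=(1,), name='fare', dtype=tf.float32),
}

# Concatenate turns the list of (None, 1) symbolic tensors into one (None, 2) tensor.
x = layers.Concatenate()(list(numeric_inputs.values()))

# adapt() computes the mean and variance from concrete data (made-up values here);
# calling norm(x) afterwards only adds the normalization step to the symbolic graph.
norm = layers.Normalization()
norm.adapt(np.array([[22.0, 7.25], [38.0, 71.28], [26.0, 7.92]]))
all_numeric_inputs = norm(x)

# Wrapping it in a functional Model shows the wiring end to end.
model = tf.keras.Model(inputs=numeric_inputs, outputs=all_numeric_inputs)
print(model({'age': tf.constant([[30.0]]), 'fare': tf.constant([[10.0]])}))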

Upvotes: 1
