AweSIM

Reputation: 1703

How can I improve the speed of my simple neural network?

I've just started exploring TensorFlow and I'm running into a performance issue. As a starting example, I tried implementing a model that simulates a logic gate. Say there are two inputs, A and B, and one output, Y, and suppose Y depends only on B and not on A. That means the following are all valid examples:

[0, 0] -> 0
[1, 0] -> 0
[0, 1] -> 1
[1, 1] -> 1

I generated training data and built a model that uses a DenseFeatures layer with two features, A and B. This layer feeds into a Dense(128, 'relu') layer, which feeds into a Dense(16, 'relu') layer, which finally feeds into a Dense(1, 'sigmoid') layer.
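
For clarity, here is roughly what that architecture looks like in code (the feature columns are named 'a' and 'b', matching the full listing below):

  from tensorflow import feature_column
  from tensorflow.keras import layers
  import tensorflow as tf

  # Two numeric feature columns feed the DenseFeatures layer.
  FEATURES = [feature_column.numeric_column('a'),
              feature_column.numeric_column('b')]

  model = tf.keras.Sequential([
    layers.DenseFeatures(FEATURES),
    layers.Dense(128, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
  ])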

Training this NN works fine and the predictions are perfect. However, I noticed that on my MacBook each prediction takes about 250 ms. That is far too slow, since my end goal is to run hundreds of predictions per second through such a NN.

So I stripped the network down to DenseFeatures([A, B]) -> Dense(8, 'relu') -> Dense(1, 'sigmoid'). However, predictions for this NN still take about the same amount of time. I expected execution speed to depend on the complexity of the model, but that does not seem to be the case here. What am I doing wrong?

Also, I have read that TensorFlow uses floating-point math for accuracy, that this comes with a performance penalty, and that converting the data to integer math would speed things up. However, I have no idea how to achieve that.

I would really appreciate it if someone could help me understand why predictions for such a simple logic gate and such a simple NN take this long, and how I can speed them up.
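
For what it's worth, a rough sketch of how the per-prediction time can be measured (assuming `model` is the trained model from the listing below):

  import time

  import numpy as np

  # Time a single call to model.predict() on one sample.
  sample = {'a': np.array([0]), 'b': np.array([1])}
  start = time.perf_counter()
  model.predict(sample)
  elapsed_ms = (time.perf_counter() - start) * 1000
  print(f'single prediction: {elapsed_ms:.0f} ms')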

For reference, here is my code in Python:

  import random
  from typing import List

  import numpy as np
  import tensorflow as tf
  from sklearn.model_selection import train_test_split
  from tensorflow import feature_column
  from tensorflow.keras import layers

  class Input:
    def __init__(self, data: List[int]):
      self.data = data

  class Output:
    def __init__(self, value: float):
      self.value = value

  class Item:
    def __init__(self, input: Input, output: Output):
      self.input = input
      self.output = output

  # Generate 10,000 random samples; the label is simply input B.
  DATA: List[Item] = []
  for _ in range(10000):
    x = Input([random.randint(0, 1), random.randint(0, 1)])
    y = Output(x.data[1])
    DATA.append(Item(x, y))

  BATCH_SIZE = 5
  DATA_TRAIN, DATA_TEST = train_test_split(DATA,       shuffle=True, test_size=0.2)
  DATA_TRAIN, DATA_VAL  = train_test_split(DATA_TRAIN, shuffle=True, test_size=0.2/0.8)

  def toDataSet(data: List[Item], shuffle: bool, batch_size: int):
    # Features are passed as a dict keyed by feature-column name.
    features = {
      'a': [x.input.data[0] for x in data],
      'b': [x.input.data[1] for x in data],
    }
    labels = [x.output.value for x in data]
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    if shuffle:
      ds = ds.shuffle(buffer_size=len(data))
    return ds.batch(batch_size)

  DS_TRAIN = toDataSet(DATA_TRAIN, True, BATCH_SIZE)
  DS_VAL   = toDataSet(DATA_VAL, True, BATCH_SIZE)
  DS_TEST  = toDataSet(DATA_TEST, True, BATCH_SIZE)

  # Numeric feature columns matching the dict keys used in toDataSet().
  FEATURES = [
    feature_column.numeric_column('a'),
    feature_column.numeric_column('b'),
  ]
  feature_layer = tf.keras.layers.DenseFeatures(FEATURES)
  model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid')
  ])
  model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
  model.fit(DS_TRAIN, validation_data=DS_VAL, epochs=10)
  loss, accuracy = model.evaluate(DS_TEST)

  # Predict one random sample at a time; inputs go in as a dict of arrays.
  for _ in range(1000):
    sample = {
      'a': np.array([random.randint(0, 1)]),
      'b': np.array([random.randint(0, 1)]),
    }
    val = model.predict(sample)

Upvotes: 2

Views: 860

Answers (1)

deepdreams

Reputation: 399

Since you are only using integers, change the model's input to 8-bit signed integers by setting the dtype parameter on your input layer. This can noticeably improve processing speed, since you won't be wasting work on floating-point precision you don't need.
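
A rough sketch of what that could look like (hypothetical, using the Keras functional API; note that the Dense layers still compute in float32 unless the model is actually quantized, so an explicit cast is included to keep it runnable):

  import tensorflow as tf

  # Hypothetical sketch: declare the two inputs as 8-bit signed integers
  # via the dtype parameter.
  inp = tf.keras.Input(shape=(2,), dtype=tf.int8)
  # Dense kernels are float32, so cast before the matmul.
  x = tf.cast(inp, tf.float32)
  x = tf.keras.layers.Dense(8, activation='relu')(x)
  out = tf.keras.layers.Dense(1, activation='sigmoid')(x)
  model = tf.keras.Model(inp, out)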

Upvotes: 2
