Keras model predict output is an array with values between 0 and 1

Question

I'm building an autoencoder network for finding outliers in text.

I first built a NumPy array with my input represented as ASCII codes, but I can't get the text back.

My input looks like this:

fab_shadow_black.9.png
fab_shadow_dark.9.png
fab_shadow_light.9.png
fastscroller_handle_normal.xml
fastscroller_handle_pressed.xml
folder_fab.png
ic_account_circle_grey_24dp.xml
ic_action_cancel_light.png

My whole code is as follows:

import sys
from keras import Input, Model
import matplotlib.pyplot as plt
from keras.layers import Dense
import numpy as np
from pprint import pprint
from google.colab import drive

drive.mount('/content/drive')
with open('/content/drive/My Drive/Colab Notebooks/drawables.txt', 'r') as arquivo:
    dados = arquivo.read().splitlines()

def tamanho_maior_elemento(lista):
  maior = 0
  for elemento in lista:
    tamanho_elemento = len(elemento)
    if tamanho_elemento > maior:
      maior = tamanho_elemento
  return maior

def texto_para_ascii(lista, tamanho_maior_elemento):
  lista_ascii = list()
  for elemento in lista:
    elemento_ascii_lista = list()
    elemento_com_zeros = elemento.ljust(tamanho_maior_elemento, "0")
    for caractere in elemento_com_zeros:
      elemento_ascii_lista.append(ord(caractere))
    lista_ascii.append(elemento_ascii_lista)
  return lista_ascii

def ascii_para_texto(lista):
  lista_ascii = list()
  for elemento in lista:
    elemento_ascii_lista = list()
    for caractere in elemento:
      elemento_ascii_lista.append(chr(caractere))
    elemento_ascii_string = "".join(elemento_ascii_lista)
    lista_ascii.append(elemento_ascii_string)
  return lista_ascii

tamanho_maior_elemento = tamanho_maior_elemento(dados)

tamanho_lista = len(dados)

dados_ascii = texto_para_ascii(dados, tamanho_maior_elemento)

np_dados_ascii = np.array(dados_ascii)

tamanho_comprimido = int(tamanho_maior_elemento/5)

dados_input = Input(shape=(tamanho_maior_elemento,))

hidden = Dense(tamanho_comprimido, activation='relu')(dados_input)

output = Dense(tamanho_maior_elemento, activation='relu')(hidden)
resultado = Dense(tamanho_maior_elemento, activation='sigmoid')(output)

autoencoder = Model(inputs=dados_input, outputs=resultado)
autoencoder.compile(optimizer='adam', loss='mse')
history = autoencoder.fit(np_dados_ascii, np_dados_ascii, epochs=10)

plt.plot(history.history["loss"])
plt.ylabel("Loss")
plt.xlabel("Epoch")
plt.show()

saida_predict = autoencoder.predict(np_dados_ascii)

saida_lista = saida_predict.tolist()

pprint(saida_predict)
pprint(saida_lista)

My input is a NumPy array in which each string is represented by its ASCII codes, right-padded with "0" characters.

The problem is that the output from predict consists of values between zero and one, which I can't convert back to text.

array([[1.        , 0.9999999 , 1.        , ..., 1.        , 1.        ,
        1.        ],
       [0.99992466, 1.        , 1.        , ..., 1.        , 1.        ,
        1.        ],
       [1.        , 0.99999994, 1.        , ..., 1.        , 1.        ,
        1.        ],
       ...,
       [0.9999998 , 0.9999999 , 1.        , ..., 1.        , 1.        ,
        0.9999999 ],
       [1.        , 0.9999998 , 1.        , ..., 1.        , 1.        ,
        1.        ],
       [0.9999999 , 0.99999994, 1.        , ..., 1.        , 1.        ,
        1.        ]], dtype=float32)

I should be getting an array containing the ASCII codes, just like the ones I put in the input. What am I getting wrong?

Krunal V · Accepted Answer

In your code,

resultado = Dense(tamanho_maior_elemento, activation='sigmoid')(output)

You used a sigmoid activation on the output layer, which is why your predictions fall in the range 0 to 1. Try replacing it with a linear activation:

resultado = Dense(tamanho_maior_elemento)(output)
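To see why the sigmoid output can never reproduce ASCII codes, here is a minimal pure-Python sketch, independent of Keras. The pre_activations values are made up for illustration and are not taken from your model; sigmoid and linear are small helpers defined here, not Keras functions.

```python
import math

def sigmoid(x):
    """Logistic function: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    """Identity ("linear") activation: returns its input unchanged."""
    return x

# Hypothetical pre-activation values on the scale of ASCII codes.
pre_activations = [48.0, 97.0, 122.0]

sigmoid_out = [sigmoid(v) for v in pre_activations]
linear_out = [linear(v) for v in pre_activations]

print(sigmoid_out)  # all values saturate at (or extremely close to) 1.0
print(linear_out)   # values stay on the ASCII scale
```

With targets in the range of roughly 46 to 122, the best a sigmoid layer can do is push every output toward 1, which matches the array you printed.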

For a linear activation you don't need to pass the activation argument at all: the Keras documentation states that Dense applies no activation by default, which is equivalent to a linear activation.
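Once the output layer is linear, the predictions will be floats near the ASCII codes rather than exact integers, so they still need rounding before they can be turned back into text. A minimal sketch of that step, assuming the predictions are passed as nested lists of floats (for example your saida_lista). The function name predictions_to_text and the sample values are hypothetical, written only for this illustration.

```python
def predictions_to_text(predictions, pad_char="0"):
    """Convert rows of float predictions back into strings.

    Each value is rounded to the nearest integer, clamped to the ASCII
    range 0..127, converted to a character, and trailing padding is removed.
    """
    texts = []
    for row in predictions:
        chars = []
        for value in row:
            code = int(round(value))
            code = max(0, min(127, code))  # keep chr() inside the ASCII range
            chars.append(chr(code))
        texts.append("".join(chars).rstrip(pad_char))
    return texts

# Hypothetical network output for "ab.png" padded with "0" to length 8,
# with small reconstruction errors added by hand.
sample = [[97.2, 97.9, 46.1, 111.6, 110.3, 102.8, 48.4, 47.7]]
print(predictions_to_text(sample))  # ['ab.png']
```

Note that stripping the "0" padding this way would also remove a genuine trailing "0" from a file name, so a padding character that cannot occur in your data may be a safer choice.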
