Reputation: 91
Quick disclaimer: I'm pretty new to Keras, machine learning, and programming in general.
I'm trying to create a basic autoencoder for (currently) a single image. While it seems to run just fine, the output is just a white image. Here's what I've got:
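# imports (paths assume standalone Keras 2.x)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import load_img, img_to_array, array_to_img, save_img
img_height, img_width = 128, 128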
input_img = '4.jpg'
output_img = '5.jpg'
# load image
x = load_img(input_img)
x = img_to_array(x) # array with shape (128, 128, 3)
x = x.reshape((1,) + x.shape) # array with shape (1, 128, 128, 3)
# define input shape
input_shape = (img_height, img_width, 3)
model = Sequential()
# encoding
model.add(Conv2D(128, (3, 3), activation='relu', padding='same',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
# decoding
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D(size=(2,2)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(3, (3, 3), activation='sigmoid', padding='same'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary() # prints the summary itself; wrapping it in print() also prints "None"
checkpoint = ModelCheckpoint("autoencoder-loss-{loss:.4f}.hdf5", monitor='loss', verbose=0, save_best_only=True, mode='min')
model.fit(x, x, epochs=10, batch_size=1, verbose=1, callbacks=[checkpoint])
y = model.predict(x)
y = y[0, :, :, :]
y = array_to_img(y)
save_img(output_img, y)
I've looked at a handful of tutorials for reference, but I still can't figure out what my issue is.
Any guidance/suggestions/help would be greatly appreciated.
Thanks!
Upvotes: 1
Views: 1385
Reputation: 195
This solved the problem. The code was just missing:
x = x.astype('float32') / 255.
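astype is a NumPy array method; here it converts the values in the array to 32-bit floats.
RGB values are stored as 8-bit integers from 0 to 255 (2^8 - 1), so dividing by 255 rescales each channel to a decimal value between 0.0 and 1.0. That matters here because the final layer uses a sigmoid activation, whose output lies between 0 and 1, and binary_crossentropy expects targets in the same range. With targets as large as 255, the network is pushed towards outputting 1 everywhere, which is the white image.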
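To make the effect of the scaling concrete, here is a minimal NumPy-only sketch. It does not use Keras: the random array is a hypothetical stand-in for what img_to_array returns, and bce is a plain, unclipped binary cross-entropy written for illustration (Keras clips predictions internally, so its actual numbers will differ). It only aims to show why unscaled targets drive a sigmoid output towards 1.

```python
import numpy as np

# Hypothetical stand-in for the array returned by img_to_array:
# a 128x128 RGB image with pixel values in 0..255.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(128, 128, 3)).astype("float32")

# Scale to [0, 1] so the targets match the sigmoid output range.
x = raw / 255.0
x = x.reshape((1,) + x.shape)  # add the batch dimension: (1, 128, 128, 3)


def bce(target, pred):
    """Plain per-pixel binary cross-entropy (illustrative, no clipping)."""
    return -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))


# Unscaled target (e.g. a pixel value of 200): the loss keeps falling as the
# prediction approaches 1, so training pushes every pixel towards white.
unscaled = [bce(200.0, p) for p in (0.5, 0.9, 0.999)]

# Scaled target: the loss is smallest when the prediction equals the target.
scaled = [bce(0.5, p) for p in (0.1, 0.5, 0.9)]

print("unscaled target losses:", unscaled)
print("scaled target losses:  ", scaled)
```

In the original script, the equivalent change would be a single line after img_to_array and before model.fit; the rest of the model can stay as it is.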
Upvotes: 2