Reputation: 280
The TensorFlow DCGAN tutorial code for the generator and discriminator models is intended for 28x28 pixel black-and-white images (MNIST dataset).
I would like to adapt that model code to work with my own dataset of 280x280 RGB images (280, 280, 3), but it's not clear how to do that.
Upvotes: 4
Views: 1137
Reputation: 7745
You can use the code from the tutorial just fine; you only need to adapt the generator a bit. Let me break it down for you. Here is the generator code from the tutorial:
import tensorflow as tf
from tensorflow.keras import layers

def make_generator_model():
    model = tf.keras.Sequential()
    # Project the 100-dimensional noise vector to 7*7*256 values
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # Reshape the projection into feature maps of shape (7, 7, 256)
    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: None is the batch size

    # strides=(1, 1) keeps the spatial size at 7x7
    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # strides=(2, 2) doubles the spatial size: 7x7 -> 14x14
    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # 14x14 -> 28x28, with a single output channel for grayscale MNIST
    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
The generator takes 100 samples from the prior distribution (noise), as you can see from the input_shape. It then projects the data into a bigger dimension of 7 * 7 * 256 and reshapes it into feature maps of shape (7, 7, 256). The idea is that, by the end of the model, we want to decrease the channels to 1 and increase the width and height until they reach the original image size. The channels are controlled by the number of filters, which is why they decrease in each consecutive Conv2DTranspose layer: from 256 to 128, then 64, and finally 1. The width and height are controlled by the strides parameter. The first Conv2DTranspose doesn't change the width and height because it has strides of 1, but the second multiplies them by 2, which yields (14, 14), and the last Conv2DTranspose does so again, which yields (28, 28).
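To see this rule concretely, here is a small standalone check (not from the tutorial) of how strides affect the output shape of a Conv2DTranspose layer with padding='same':

import tensorflow as tf
from tensorflow.keras import layers

# A (batch, 7, 7, 128) tensor standing in for the generator's intermediate feature maps
x = tf.random.normal((1, 7, 7, 128))

# strides=(1, 1) with padding='same' keeps the width and height at 7x7
same_size = layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same')(x)
print(same_size.shape)  # (1, 7, 7, 128)

# strides=(2, 2) with padding='same' doubles the width and height to 14x14
doubled = layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same')(x)
print(doubled.shape)  # (1, 14, 14, 64)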
For your case, you have two options. The first is to increase the first hidden layer (Dense) so that it projects the data to (70 * 70 * 256) and, in the final Conv2DTranspose, change the filters to 3, keeping everything else as it is; the final output would then be (280, 280, 3). That would look like the following:
def make_generator_model():
    model = tf.keras.Sequential()
    # Project to 70*70*256 instead of 7*7*256
    model.add(layers.Dense(70*70*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((70, 70, 256)))
    assert model.output_shape == (None, 70, 70, 256)  # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 70, 70, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 140, 140, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # 3 filters instead of 1, for RGB output
    model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 280, 280, 3)

    return model
I don't recommend this approach, because you are applying very big projections in very few steps. The second approach is to increase the number of Conv2DTranspose layers gradually until you reach the correct dimension. For example, start with a projection of (35 * 35 * 512) and add one additional Conv2DTranspose with strides equal to (2, 2); the filters would then go from 512 to 256, 128, 64, and finally 3.
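Here is a sketch of what that second option could look like. The (5, 5) kernels and the BatchNormalization/LeakyReLU pattern are simply carried over from the tutorial; beyond the 512, 256, 128, 64, 3 progression described above, the details are a design choice, not a requirement:

def make_generator_model():
    model = tf.keras.Sequential()
    # Smaller spatial projection (35x35) but more channels (512)
    model.add(layers.Dense(35*35*512, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((35, 35, 512)))
    assert model.output_shape == (None, 35, 35, 512)

    # strides=(1, 1): stays at 35x35
    model.add(layers.Conv2DTranspose(256, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 35, 35, 256)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # 35x35 -> 70x70
    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 70, 70, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # 70x70 -> 140x140
    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 140, 140, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    # 140x140 -> 280x280, 3 channels for RGB
    model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 280, 280, 3)

    return model

This way each Conv2DTranspose only doubles the spatial size, instead of one huge Dense projection doing most of the work.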
Regarding the discriminator, it will work with almost no modification; you only need to change its input_shape from (28, 28, 1) to (280, 280, 3). However, I would also add more Conv2D layers to the discriminator and make it deeper, since your images are quite big.
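As an illustration, a deeper discriminator in the same style as the tutorial's could look like the one below. The two extra Conv2D blocks and their filter counts are my own choice, not part of the tutorial; each strides=(2, 2) layer halves the width and height, so the feature maps are reasonably small before the final Flatten:

def make_discriminator_model():
    model = tf.keras.Sequential()
    # 280x280 RGB input instead of 28x28 grayscale
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[280, 280, 3]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    # Extra blocks to keep downsampling the larger image
    model.add(layers.Conv2D(256, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(512, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))
    return model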
Upvotes: 3