TGorlenko

Reputation: 11

encoding and decoding pictures pytorch

Task: Using the "fetch_lfw_people" dataset as an example, write and train an autoencoder. Write the training loop over epochs. Write code to visualize the training process and compute validation metrics after each epoch. Train the autoencoder and achieve a low validation loss.

My code:

from sklearn.datasets import fetch_lfw_people
import numpy as np
import matplotlib.pyplot as plt
import torch
from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split

Data preparation:

lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)    
X = lfw_people['images']

X_train, X_test = train_test_split(X, test_size=0.1)

X_train = torch.tensor(X_train, dtype=torch.float32, requires_grad=True)
X_test = torch.tensor(X_test, dtype=torch.float32, requires_grad=False)
dataset_train = TensorDataset(X_train, torch.zeros(len(X_train)))
dataset_test = TensorDataset(X_test, torch.zeros(len(X_test)))

batch_size = 32

train_loader = DataLoader(dataset_train, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset_test, batch_size=batch_size, shuffle=False)

Create a network with encoding and decoding functions:

class Autoencoder(torch.nn.Module): 
    def __init__(self): 
        super(Autoencoder, self).__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=2), 
            torch.nn.ReLU(),

            torch.nn.Conv2d(in_channels=32, out_channels=64, stride=2, kernel_size=3),
            torch.nn.ReLU(),

            torch.nn.Conv2d(in_channels=64, out_channels=64, stride=2, kernel_size=3),
            torch.nn.ReLU(),

            torch.nn.Conv2d(in_channels=64, out_channels=64, stride=2, kernel_size=3)
        )

        self.decoder = torch.nn.Sequential( 
            torch.nn.ConvTranspose2d(in_channels=64, out_channels=64, kernel_size=3, stride=2),

            torch.nn.ConvTranspose2d(in_channels=64, out_channels=64, kernel_size=(3,4), stride=2),

            torch.nn.ConvTranspose2d(in_channels=64, out_channels=32, kernel_size=4, stride=2),          

            torch.nn.ConvTranspose2d(in_channels=32, out_channels=1, kernel_size=(4,3), stride=2)
        )

    def encode(self, X):
        encoded_X = self.encoder(X) 
        batch_size = X.shape[0] 
        return encoded_X.reshape(batch_size, -1)

    def decode(self, X): 
        pre_decoder = X.reshape(-1, 64, 2, 1)  
        return self.decoder(pre_decoder)
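
The reshape to (-1, 64, 2, 1) in decode only works if the encoder really ends at a 2x1 feature map. Below is a minimal sketch of that shape arithmetic in plain Python. It assumes the images are 50x37, which is what fetch_lfw_people typically returns for resize=0.4 (check X.shape to confirm), and that every layer uses padding=0 and dilation=1 as in the class above. The helper names conv_out and deconv_out are made up for this illustration.

```python
def conv_out(size, kernel, stride=2):
    # Output length of Conv2d along one axis (padding=0, dilation=1).
    return (size - kernel) // stride + 1


def deconv_out(size, kernel, stride=2):
    # Output length of ConvTranspose2d along one axis
    # (padding=0, output_padding=0, dilation=1).
    return (size - 1) * stride + kernel


# Assumed input size (height, width); verify against X.shape[1:].
h, w = 50, 37

# Encoder: four Conv2d layers, kernel_size=3, stride=2.
for _ in range(4):
    h, w = conv_out(h, 3), conv_out(w, 3)
bottleneck = (h, w)

# Decoder: kernel sizes (height, width) of the four ConvTranspose2d layers.
for kh, kw in [(3, 3), (3, 4), (4, 4), (4, 3)]:
    h, w = deconv_out(h, kh), deconv_out(w, kw)
restored = (h, w)

print("bottleneck:", bottleneck, "restored:", restored)
```

Under these assumptions the bottleneck is 2x1 and the decoder returns to the input size, so the architecture itself is consistent and the problem has to be elsewhere.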

Before training, I check the model's output on one example:

model = Autoencoder()

sample = X_test[:1]
sample = sample[:, None] 
result = model.decode(model.encode(sample))  # before train

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
ax1.imshow(sample[0][0].detach().numpy(), cmap=plt.cm.Greys_r)
ax2.imshow(result[0][0].detach().numpy(), cmap=plt.cm.Greys_r)
plt.show()

The result is unsatisfactory. I start training:

model = Autoencoder()
loss = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

history_train = []
history_test = []

for i in range(5):
    for x, y in train_loader:
        x = x[:, None]

        model.train()

        decoded_x = model.decode(model.encode(x))
        mse_loss = loss(torch.tensor(decoded_x, dtype=torch.float), x)

        optimizer.zero_grad()
        mse_loss.backward()
        optimizer.step()

        history_train.append(mse_loss.detach().numpy())

    model.eval()
    with torch.no_grad():
        for x, y in train_loader:
            x = x[:, None]

            result_x = model.decode(model.encode(x))
            loss_test = loss(torch.tensor(result_x, dtype=torch.float), x)

            history_test.append(loss_test.detach().numpy())

plt.subplot(1, 2, 1)
plt.plot(history_train)
plt.title("Optimization process for train data")

plt.subplot(1, 2, 2)
plt.plot(history_test)
plt.title("Loss for test data")

plt.show()

The loss is huge on both the training and the test data.

After training, nothing has changed:

with torch.no_grad():
    model.eval()
    res1 = model.decode(model.encode(sample))

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
ax1.imshow(sample[0][0].detach().numpy(), cmap=plt.cm.Greys_r)
ax2.imshow(res1[0][0].detach().numpy(), cmap=plt.cm.Greys_r)
plt.show()

Why is the loss so large? Scaling the input to the interval [-1, 1] does not help. I did it like this: (value / 255) * 2 - 1. Why don't the model parameters change after training? Why doesn't the decoded sample change?

Result: before train, after train, loss https://i.sstatic.net/OhdrJ.jpg

Upvotes: 1

Views: 3363

Answers (1)

TGorlenko

Reputation: 11

1) replace line

mse_loss = loss(torch.tensor(decoded_x, dtype=torch.float), x)

with line

mse_loss = loss(decoded_x, x)
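
A possible explanation for why this change matters: torch.tensor(...) called on an existing tensor makes a copy that is detached from the autograd graph, so backward() never reaches the model weights and optimizer.step() has nothing to update. No error appears, presumably because X_train was created with requires_grad=True, so the loss can still be differentiated with respect to the input. Here is a small sketch with a stand-in torch.nn.Linear layer (an assumption for brevity, not the autoencoder from the question):

```python
import warnings

import torch

torch.manual_seed(0)
layer = torch.nn.Linear(4, 4)              # stand-in for the autoencoder
x = torch.randn(8, 4, requires_grad=True)  # mirrors X_train with requires_grad=True
loss_fn = torch.nn.MSELoss()

# Variant from the question: wrap the model output in torch.tensor(...).
out = layer(x)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")        # copying a tensor this way emits a UserWarning
    copied = torch.tensor(out, dtype=torch.float)
layer.zero_grad()
loss_fn(copied, x).backward()              # succeeds only because x requires grad
grad_after_copy = layer.weight.grad        # the weights received no gradient

# Corrected variant: pass the model output to the loss directly.
layer.zero_grad()
loss_fn(layer(x), x).backward()
grad_direct = layer.weight.grad

print("copy keeps graph:", copied.requires_grad)
print("weight grad with copy:", grad_after_copy)
print("weight grad direct is set:", grad_direct is not None)
```

With the copy, the weight gradient stays None, which matches the symptom that the parameters and the decoded sample do not change.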

2) replace lines

    model.eval()
    with torch.no_grad():
        for x, y in train_loader:

with lines

    model.eval()
    with torch.no_grad():
        for x, y in test_loader:
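
Since the task asks for a validation metric after each epoch, one option is to average the loss over the whole test loader instead of appending every batch. The sketch below is only an illustration: evaluate is a hypothetical helper, and it assumes the model can be called as model(x), which for the Autoencoder class in the question would require adding a forward method that returns self.decode(self.encode(X)). torch.nn.Identity and random images are used as stand-ins so the block runs on its own.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, loss_fn):
    """Return the mean per-sample reconstruction loss over a loader."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for x, _ in loader:
            x = x[:, None]  # add the channel dimension, as in the question
            # Weight each batch by its size so a short last batch is not over-counted.
            total += loss_fn(model(x), x).item() * len(x)
            count += len(x)
    return total / count


# Stand-in data and model (assumptions, not the LFW data or the real autoencoder).
images = torch.rand(10, 50, 37)
loader = DataLoader(TensorDataset(images, torch.zeros(len(images))), batch_size=4)
identity_loss = evaluate(torch.nn.Identity(), loader, torch.nn.MSELoss())
print("validation loss of a perfect reconstruction:", identity_loss)
```

Calling evaluate(model, test_loader, loss) once at the end of every epoch and appending the result to history_test gives one validation point per epoch, which is easier to read on the plot.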

Upvotes: 0
