Tensorflow: predicting a point from an image, training model with a point labels

Question

I want to create a model which can predict a point from an image. I have a dataset with training images. Those images are splitted between 24 dirs. I have prepared an json file containing a (x, y) values for every image.

example:

"dir22": {
        "frame_00001_rgb": {
            "x": 363.693829827852,
            "y": 278.2191728859505
        },
        "frame_00002_rgb": {
            "x": 330.9709780765119,
            "y": 283.34142472069004
        },
...
...
"dir23": {
        "frame_00001_rgb": {
            "x": 212.5232358000000,
            "y": 156.3342191728855
        },
        "frame_00002_rgb": {
            "x": 230.69497097807351,
            "y": 253.75341424720690
        },

My model looks like this:


img_width, img_height = 640, 480

train_data_dir = 'v_data/train'

epochs = 10
batch_size = 16

input_tensor = tf.keras.Input(shape=(img_width,img_height,3))
base_model = tf.keras.applications.ResNet50(weights='imagenet',include_top=False ,input_tensor=input_tensor)

top_model = tf.keras.Sequential()
top_model.add(tf.keras.Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(tf.keras.Dense(128, activation='relu'))
top_model.add(tf.keras.Dense(128, activation='relu'))
top_model.add(tf.keras.Dense(2))

model = tf.keras.Model(input= base_model.input, output= top_model(base_model.output))

for layer in model.layers[-15:]:
    layer.trainable = False


optimizer = tf.keras.optimizers.RMSprop(0.001)

model.compile(loss='mse',
            optimizer=optimizer,
            metrics=['mae', 'mse'])

now I have loaded images from my dir:

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size)

Found 15678 images belonging to 24 classes.

now how can I assign a label for each image and train my model with it?

thushv89 · Accepted Answer

For this, you need to write a custom data generator.

Importing necessary libraries

import os
import pandas as pd
from skimage.io import imread # Used for image processing
from skimage.transform import resize # Used for image processing
import json
import numpy as np

Defining our own data generator

I followed this link to get an idea of how to do this. And customized it to your problem.

We need to fill in the following functions.

class DataGenerator(tf.keras.utils.Sequence):
    'Generates data for Keras'



    def __init__(self, directory, target_json, batch_size=32, target_size=(128, 128), shuffle=True):
        ...

    def __len__(self):
        'Denotes the number of batches per epoch'
        ...

    def __getitem__(self, index):
        'Generate one batch of data'
        ...

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        ...

    def __data_generation(self, list_paths, list_paths_wo_ext):
        'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)
        ...

Let's look at what variables we define

self.target_size = # Final size of the images
self.batch_size = # Batch size
self.target_json = # Path to the json file
self.directory = # Where the training data is

self.img_paths = # Contains image paths with extension
self.img_paths_wo_ext = # Contains the image paths without extension

self.targets = # The dataframe containing targets loaded from the json
self.shuffle = # Shuffle data at start of each epoch?

The JSON file

Your JSON file needs to be exactly in this format. This is probably what you have exactly too. But make sure it's 100% this format.

{'dir20': {'frame_00001_rgb': {'x': 363.693829827852, 'y': 278.2191728859505}, 'frame_00002_rgb': {'x': 330.9709780765119, 'y': 283.34142472069004}}, 'dir21': {'frame_00001_rgb': {'x': 363.693829827852, 'y': 278.2191728859505}, 'frame_00002_rgb': {'x': 330.9709780765119, 'y': 283.34142472069004}}, 'dir22': {'frame_00001_rgb': {'x': 363.693829827852, 'y': 278.2191728859505}, 'frame_00002_rgb': {'x': 330.9709780765119, 'y': 283.34142472069004}}, 'dir23': {'frame_00001_rgb': {'x': 363.693829827852, 'y': 278.2191728859505}, 'frame_00002_rgb': {'x': 330.9709780765119, 'y': 283.34142472069004}}, 'dir24': {'frame_00001_rgb': {'x': 212.5232358, 'y': 156.3342191728855}, 'frame_00002_rgb': {'x': 230.6949709780735, 'y': 253.7534142472069}}}

Next we need to convert this to a pandas dataframe. For that we define the following function. It's a bit complex due to the nature of your file. But here's what's happening.

Load the json and create a dataframe which contain columns like dir20.frame_00002_rgb.x.
Create a multi index by splitting the column to 3 levels (e.g. dir20, frame_00002, x)
Use stack to bring both dir* and frame_* as indices
Reformat the index so that it contains the full path for each image and each record has two columns (x and y).

def json_to_df(json_path, directory):
          with open(json_path,'r') as f:
            s = json.load(f)
          df = pd.io.json.json_normalize(s)
          ind = pd.MultiIndex.from_tuples([col.split('.') for col in df.columns])
          df.columns = ind
          df = df.stack(level=[0,1])
          df = df.set_index(df.index.droplevel(0))
          df = df.set_index(pd.Index([os.path.sep.join([directory]+list(c)) for c in df.index.values]))
          return df

Rest of the code

I won't go in to great details of what's going on in the other parts as it is quite straight forward. But we're essentially getting a single batch of data by reading the images, resizing and getting the correct x, y values from the dataframe we generated.

Full code

Here's the full code for the data generator.

class DataGenerator(tf.keras.utils.Sequence):
    'Generates data for Keras'



    def __init__(self, directory, target_json, batch_size=32, target_size=(128, 128), shuffle=True):
        'Initialization'
        self.target_size = target_size
        self.batch_size = batch_size
        self.target_json = target_json
        self.directory = directory

        self.img_paths = [] 
        self.img_paths_wo_ext = []      
        for root, dirs, files in os.walk(directory):
            for file in files:
                if file.lower().endswith(".jpg") or file.lower().endswith(".png"):
                    self.img_paths.append(os.path.join(root, file))
                    self.img_paths_wo_ext.append(os.path.splitext(os.path.join(root, file))[0])

        def json_to_df(json_path, directory):
          with open(json_path,'r') as f:
            s = json.load(f)
          df = pd.io.json.json_normalize(s)
          ind = pd.MultiIndex.from_tuples([col.split('.') for col in df.columns])
          df.columns = ind
          df = df.stack(level=[0,1])
          df = df.set_index(df.index.droplevel(0))
          df = df.set_index(pd.Index([os.path.sep.join([directory]+list(c)) for c in df.index.values]))
          return df

        self.targets = json_to_df(self.target_json, self.directory)
        self.shuffle = shuffle
        self.on_epoch_end()



    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.floor(len(self.img_paths) / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        # Find list of IDs
        list_paths = [self.img_paths[k] for k in indexes]
        list_paths_wo_ext = [self.img_paths_wo_ext[k] for k in indexes]
        # Generate data
        X, y = self.__data_generation(list_paths, list_paths_wo_ext)

        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.img_paths))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_paths, list_paths_wo_ext):
        'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)
        # Initialization
        X = np.empty((self.batch_size, *self.target_size, 3))
        y = self.targets.loc[list_paths_wo_ext].values

        # Generate data
        for i, ID in enumerate(list_paths):
            # Store sample

            X[i,] = resize(imread(ID),self.target_size)


        return X, y

Using the datagenerator

Here's how you'd use the data generator.

train_datagen = iter(DataGenerator(train_data_dir, './train/data.json', batch_size=2))


x, y = next(train_datagen)
print(x)
print(y)

which gives,

[[0.01377145 0.01377145 0.01377145]
   [0.00242393 0.00242393 0.00242393]
   [0.         0.         0.        ]
   ...
   [0.0037837  0.0037837  0.0037837 ]
   [0.0037837  0.0037837  0.0037837 ]
   [0.0037837  0.0037837  0.0037837 ]]

  ...

  [[0.37398897 0.3372549  0.17647059]
   [0.38967525 0.35294118 0.19215686]
   [0.42889093 0.39215686 0.23137255]
   ...
   [0.72156863 0.62889093 0.33085172]
   [0.71372549 0.61176471 0.31764706]
   [0.70588235 0.59359681 0.30340074]]]]

[[363.69382983 278.21917289]
 [330.97097808 283.34142472]]