Reputation: 373
I imagine this is a broadly applicable question, but I'm trying to create a dataset for a particular competition that involves flying a UAV over a field with cardboard geometric shapes with alphanumeric characters painted on. The objective is to detect and classify the shapes and characters.
Currently, I'm using SURF to detect the shape, K-means to segment the shape and character, and a convolutional neural network to classify each. However, my bottleneck is training data: I can't find or generate a dataset that produces a model that performs well on real data.
What I've Tried
Generating a dataset with Keras' ImageDataGenerator by applying random rotations, scalings, and skewings to a template image of each geometric shape and of each alphanumeric character in a typewritten font: works fine on data from the dataset itself (go figure) and on some outside data, but gets confused when the characters deviate too much from the templates
Using the MNIST dataset: no complaints, but only contains numbers
Using the EMNIST ByClass dataset (which is different from the MNIST dataset; contains letters as well): doesn't train easily because of size, and doesn't perform well even when trained to a decently high accuracy. In the dataset itself, many images bear little resemblance to the purported class, and some classes are at different rotations than others
Using Tesseract OCR for the characters. This hasn't had great results
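On the rotation issue noted for EMNIST above: one possible cause (an assumption here, not something confirmed for this dataset copy) is that the EMNIST files store each image transposed relative to MNIST, so characters look rotated and mirrored unless every image is transposed after loading. A minimal sketch with NumPy, where `fix_emnist_orientation` is a hypothetical helper name and the batch is synthetic stand-in data:

```python
import numpy as np

def fix_emnist_orientation(images):
    """Transpose every image in a batch of shape (N, H, W).

    Assumption: the images were loaded with rows and columns swapped,
    so swapping the last two axes restores the upright orientation.
    """
    return np.transpose(images, (0, 2, 1))

# Synthetic stand-in for a loaded batch: one 28x28 image with a single lit pixel.
batch = np.zeros((1, 28, 28), dtype=np.uint8)
batch[0, 5, 20] = 255

fixed = fix_emnist_orientation(batch)
```

Plotting a few images per class before and after the transpose is a quick way to check whether this applies to the files actually in use.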
What I Haven't Tried
Doing several flyovers with real cardboard cutouts that we create and using several frames from each video for the dataset. Cons: this would require quite a lot of flights and cardboard cutouts and wouldn't offer much data variation.
Using the ImageDataGenerator, but on several different fonts instead of one.
Does anyone have any advice on how to create a custom dataset for a task like this?
Upvotes: 3
Views: 1701
Reputation: 373
One thing we learned is that a custom dataset should incorporate as many "real" elements as possible (e.g. handwritten characters from EMNIST, backgrounds from Google Images). Data augmentation techniques, such as Keras' ImageDataGenerator class, are especially important when part of the dataset has to be generated.
We ended up using the EMNIST Balanced dataset and saw good results with it for alphanumeric classification. For localization of the geometric shape, we used the YOLO (https://pjreddie.com/darknet/yolo/) deep learning algorithm instead of SURF. To create a custom dataset, we first placed EMNIST characters onto generated geometric shapes, then placed those shapes on background images of aerial views of fields scraped from Google.
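The compositing step described above could be sketched roughly as follows. This is not our actual pipeline, only a minimal NumPy illustration under stated assumptions: the function names `make_shape_mask` and `compose_sample`, the colors, the sizes, the random-noise background, and the vertical-bar glyph are all hypothetical stand-ins for real aerial photos and EMNIST characters. The returned box uses the normalized (x_center, y_center, width, height) layout that YOLO label files expect.

```python
import numpy as np

def make_shape_mask(size, kind="circle"):
    """Boolean mask of a filled shape inside a size x size square."""
    yy, xx = np.mgrid[:size, :size]
    c = (size - 1) / 2.0
    if kind == "circle":
        return (yy - c) ** 2 + (xx - c) ** 2 <= c ** 2
    if kind == "square":
        return np.ones((size, size), dtype=bool)
    if kind == "triangle":
        # Upward-pointing triangle: half-width grows linearly with the row.
        return np.abs(xx - c) <= yy / 2.0
    raise ValueError(f"unknown shape kind: {kind}")

def compose_sample(background, glyph, kind, shape_color, char_color, size, rng):
    """Paste a glyph onto a shape, then the shape onto a copy of the background.

    Returns (image, box) with box = (x_center, y_center, width, height),
    normalized to [0, 1]. The box is the square enclosing the shape patch.
    """
    h, w = background.shape[:2]
    mask = make_shape_mask(size, kind)

    # Fill the shape with its color.
    patch = np.zeros((size, size, 3), dtype=np.uint8)
    patch[mask] = shape_color

    # Center the glyph on the shape and paint its bright pixels.
    gh, gw = glyph.shape
    top, left = (size - gh) // 2, (size - gw) // 2
    glyph_mask = np.zeros_like(mask)
    glyph_mask[top:top + gh, left:left + gw] = glyph > 127
    patch[glyph_mask & mask] = char_color

    # Drop the shape at a random position that fits inside the background.
    y0 = int(rng.integers(0, h - size + 1))
    x0 = int(rng.integers(0, w - size + 1))
    image = background.copy()
    region = image[y0:y0 + size, x0:x0 + size]  # view into image
    region[mask] = patch[mask]

    box = ((x0 + size / 2) / w, (y0 + size / 2) / h, size / w, size / h)
    return image, box

rng = np.random.default_rng(0)
# Stand-in for an aerial photo of a field.
background = rng.integers(60, 120, size=(256, 256, 3), dtype=np.uint8)
# Stand-in for a 28x28 EMNIST character: a vertical bar.
glyph = np.zeros((28, 28), dtype=np.uint8)
glyph[4:24, 12:16] = 255

image, box = compose_sample(background, glyph, "circle",
                            (200, 30, 30), (255, 255, 255), 64, rng)
```

In practice one would also randomize rotation, scale, color, and lighting of the patch before pasting, so the detector does not overfit to perfectly clean shapes.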
Upvotes: 0
Reputation: 1063
This is my dataSetGenerator; maybe it will help you generate your own dataset.
import numpy as np
from os import listdir
from glob import glob
import cv2

def dataSetGenerator(path, resize=False, resize_to=224, percentage=100):
    """
    Load images from a folder that contains one subfolder per class.

    DataSetsFolder
    |
    |----------class-1
    |        .   |-------image-1
    |        .   |         .
    |        .   |         .
    |        .   |         .
    |        .   |-------image-n
    |        .
    |-------class-n

    :param path: <path>/DataSetsFolder
    :param resize: if True, resize every image to resize_to x resize_to
    :param resize_to: side length in pixels used when resize is True
    :param percentage: percentage of the images to return, sampled at random
    :return: images, labels, classes
    """
    classes = listdir(path)
    image_list = []
    labels = []
    for classe in classes:
        # Only .tif files are read; change the pattern for other formats.
        for filename in glob(path + '/' + classe + '/*.tif'):
            if resize:
                image_list.append(cv2.resize(cv2.imread(filename), (resize_to, resize_to)))
            else:
                image_list.append(cv2.imread(filename))
            # One-hot label for the current image.
            label = np.zeros(len(classes))
            label[classes.index(classe)] = 1
            labels.append(label)
    # Shuffle the samples and keep the requested percentage.
    indice = np.random.permutation(len(image_list))[:int(len(image_list) * percentage / 100)]
    return np.array([image_list[x] for x in indice]), np.array([labels[x] for x in indice]), np.array(classes)
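Since the function returns one-hot labels together with the class names, a train/validation split and a lookup from labels back to names might look like the sketch below. The helper `split_dataset`, the 80/20 ratio, and the fake arrays are assumptions for illustration; real input would come from `dataSetGenerator(path, resize=True)` so that all images share one shape.

```python
import numpy as np

def split_dataset(images, labels, val_fraction=0.2, seed=0):
    """Shuffle and split arrays into (train, validation) pairs."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(len(images) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])

# Fake data standing in for the output of dataSetGenerator.
images = np.zeros((10, 8, 8, 3), dtype=np.uint8)
labels = np.eye(2)[np.arange(10) % 2]          # one-hot rows, two classes
classes = np.array(["circle", "square"])       # hypothetical class folder names

(train_x, train_y), (val_x, val_y) = split_dataset(images, labels)

# Recover class names from one-hot labels.
val_names = classes[val_y.argmax(axis=1)]
```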
Upvotes: 4