Reputation: 315
As a newbie in machine learning, I have been searching for methods used to train a classifier and to make data readable for it. All I know so far is that labels are the most important thing for a classifier, which is obvious. My question: I have a huge data set with more than 300,000 images, and labels in another folder describing the side (viewpoint) and bounding box for each image. I also have other information about the data that is not in the label folder but in a misc folder, which holds .mat files containing the make and model names of the car in each image. So far I am converting the image and label data to NumPy format and appending them together in training_data. What should I do with the misc data so that it can also be used in training along with the labels and image data?
Your answers will be highly appreciated.
I describe the folders in more detail below, so you have a better picture of the data. I just need the theoretical steps, if you can provide them.
Data Explanation
Descriptions of the folders and files are as follows:
-image:
Stores all full car images in the path format 'make_id/model_id/released_year/image_name.jpg'.
-label:
Stores all labels to the full car images in the path format 'make_id/model_id/released_year/image_name.txt'. Each label file has three lines. The first line is a number which is the viewpoint annotation (-1 - uncertain, 1 - front, 2 - rear, 3 - side, 4 - front-side, 5 - rear-side). The second line is the number of the bounding boxes, which is all '1' in the current release. The third line is the coordinates of the bounding box in the format 'x1 y1 x2 y2' in pixels, where 1 <= x1 < x2 <= image_width, and 1 <= y1 < y2 <= image_height.
-misc:
-attributes.txt:
Each line is the attribute annotation for one model which is in the format 'model_id maximum_speed displacement door_number seat_number type'. For car types, a number from 1~12 corresponds to a specific type, which is described in 'car_type.mat'. Unavailable attributes are denoted by '0' or '0.0'.
-make_model_name.mat
Cell array 'make_names' provides the projections from 'make_id' to make names, and cell array 'model_names' provides the projections from 'model_id' to model names.
-part:
Stores all part images in the path format 'make_id/model_id/released_year/part_id/image_name.jpg'. The correspondence between 'part_id' and part names is: 1 - headlight, 2 - taillight, 3 - fog light, 4 - air intake, 5 - console, 6 - steering wheel, 7 - dashboard, and 8 - gear lever.
-train_test_split:
This folder generally provides all the train/test subsets used in the paper.
-classification
Stores the train/test lists for the classification task with full car images in the paper.
-part:
Stores the train/test lists for the classification task with car part in the paper.
-verification:
'verification_train.txt' is the image list for training the verification models which is also for testing attribute prediction. 'verification_pairs_easy.txt', 'verification_pairs_medium.txt', and 'verification_pairs_hard.txt' are the three sets with different difficulties for testing car verification models. Each line of 'verification_pairs_XXX.txt' is in the format of 'path_to_image_1 path_to_image_2 label' where label is '1' for positive pairs and is '0' for negative pairs.
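To make the label and path formats above concrete, here is a minimal sketch of how one label file and one image path could be parsed. The function names, the sample path and the sample label text are hypothetical, and the sketch assumes the bounding box coordinates are stored as whole numbers.

```python
from pathlib import PurePosixPath

# Viewpoint codes as listed in the label folder description.
VIEWPOINTS = {-1: "uncertain", 1: "front", 2: "rear",
              3: "side", 4: "front-side", 5: "rear-side"}


def parse_label_text(text):
    """Parse the three-line label format: viewpoint, box count, box coordinates."""
    lines = text.strip().splitlines()
    viewpoint = int(lines[0])
    n_boxes = int(lines[1])
    # Each box line is 'x1 y1 x2 y2' in pixels (assumed to be integers here).
    boxes = [tuple(int(v) for v in line.split())
             for line in lines[2:2 + n_boxes]]
    return {"viewpoint": viewpoint, "boxes": boxes}


def ids_from_path(rel_path):
    """Split 'make_id/model_id/released_year/image_name.jpg' into its ids."""
    make_id, model_id, year, _name = PurePosixPath(rel_path).parts
    return make_id, model_id, year


# Hypothetical sample values, only to show the shapes involved.
label = parse_label_text("3\n1\n10 20 200 150\n")
ids = ids_from_path("78/1/2014/abc.jpg")
print(VIEWPOINTS[label["viewpoint"]], label["boxes"], ids)
```

Since the make and model ids are already part of every image path, they can serve as the key for looking up anything stored in the misc folder.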
Thank you.
Upvotes: 1
Views: 125
Reputation: 2171
What you could do is make the misc data numerical.
Let's say you have 3 different models of a car: Ferrari, Tesla and Lambo. You can define that if the car is a Ferrari, its number is 0. If it's a Tesla, its number is 1, and if it's a Lambo, its number is 2. Now, when you load your labels, also load the misc folder data and apply the mapping defined above: Ferrari = 0, Tesla = 1, Lambo = 2.
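As a rough illustration of that mapping, the sketch below assigns each distinct name the next free integer in order of first appearance. The helper name `build_encoder` and the sample list are made up for this example.

```python
def build_encoder(names):
    """Map each distinct name to an integer, in order of first appearance."""
    encoder = {}
    for name in names:
        if name not in encoder:
            encoder[name] = len(encoder)
    return encoder


# Hypothetical names, one per image, as they might be read from the misc data.
car_names = ["Ferrari", "Tesla", "Lambo", "Tesla", "Ferrari"]

encoder = build_encoder(car_names)
encoded = [encoder[name] for name in car_names]

print(encoder)  # {'Ferrari': 0, 'Tesla': 1, 'Lambo': 2}
print(encoded)  # [0, 1, 2, 1, 0]
```

The same dictionary has to be reused for the training and test data, otherwise the same name could end up with different numbers.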
Now you can append that to the label vector and teach the network to predict 0, 1 or 2 for a car model. When the network has made the prediction, you can check whether it is correct (e.g. if the NN predicted 1, it means that the car in the image is a Tesla).
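Going from the prediction back to a name is just the inverse lookup. A minimal sketch, assuming the network outputs one score per class (the scores below are invented):

```python
# Inverse of the mapping Ferrari = 0, Tesla = 1, Lambo = 2.
ID_TO_NAME = {0: "Ferrari", 1: "Tesla", 2: "Lambo"}


def decode_prediction(scores):
    """Return (class id, name) for the highest-scoring class."""
    predicted_id = max(range(len(scores)), key=scores.__getitem__)
    return predicted_id, ID_TO_NAME[predicted_id]


# Hypothetical network output for one image.
print(decode_prediction([0.1, 0.7, 0.2]))  # (1, 'Tesla')
```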
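You can apply this method to any other non-numerical feature from the misc folder. It is usually called label encoding (or integer encoding): mapping non-numerical values to numbers. An embedding is a related but different thing, namely a learned vector for each category. Keep in mind that the numbers are only class indices, so 2 is not "more" than 0.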
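For the data set described in the question, the make_id and model_id in each image path are already numbers, so the misc data can be joined to an image through them. Below is a sketch of that join for attributes.txt, under a few assumptions: the sample rows are invented, the function names are hypothetical, and lines that do not start with a numeric model_id (for example a header row, if there is one) are skipped.

```python
FIELDS = ("maximum_speed", "displacement", "door_number", "seat_number", "type")


def parse_attributes(text):
    """Build {model_id: attributes} from lines of
    'model_id maximum_speed displacement door_number seat_number type'."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6 or not parts[0].isdigit():
            continue  # blank line or a possible header row
        speed, disp = float(parts[1]), float(parts[2])
        doors, seats, car_type = (int(float(p)) for p in parts[3:])
        values = (speed, disp, doors, seats, car_type)
        # '0' or '0.0' marks an unavailable attribute, stored as None here.
        table[parts[0]] = {f: (v if v != 0 else None)
                           for f, v in zip(FIELDS, values)}
    return table


def attributes_for_image(rel_path, table):
    """Look up attributes via the model_id in 'make_id/model_id/year/name.jpg'."""
    model_id = rel_path.split("/")[1]
    return table[model_id]


# Invented rows, only to show the format.
sample = "1 235 1.8 5 5 4\n2 0 0.0 4 5 0\n"
table = parse_attributes(sample)
print(attributes_for_image("78/1/2014/abc.jpg", table))
```

The resulting values can then be appended to each training sample next to the viewpoint and bounding box. If the car type (1 to 12) is used as a classification target, subtracting 1 gives a zero-based class index, and samples whose value is None need to be dropped or masked for that target.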
Upvotes: 1