Reputation: 41
I have completed a labelling job in AWS Ground Truth and started working on the notebook template for object detection.
I have two manifests containing 293 labelled images of birds, split into a train and a validation set, like this:
{"source-ref":"s3://XXXXXXX/Train/Blackbird_1.JPG","Bird-Label-Train":{"workerId":XXXXXXXX,"imageSource":{"s3Uri":"s3://XXXXXXX/Train/Blackbird_1.JPG"},"boxesInfo":{"annotatedResult":{"boundingBoxes":[{"width":1612,"top":841,"label":"Blackbird","left":1276,"height":757}],"inputImageProperties":{"width":3872,"height":2592}}}},"Bird-Label-Train-metadata":{"type":"groundtruth/custom","job-name":"bird-label-train","human-annotated":"yes","creation-date":"2019-01-16T17:28:23+0000"}}
Below are the parameters I am using for the notebook instance:
training_params = \
{
    "AlgorithmSpecification": {
        "TrainingImage": training_image, # NB. This is one of the named constants defined in the first cell.
        "TrainingInputMode": "Pipe"
    },
    "RoleArn": role,
    "OutputDataConfig": {
        "S3OutputPath": s3_output_path
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.p3.2xlarge",
        "VolumeSizeInGB": 5
    },
    "TrainingJobName": job_name,
    "HyperParameters": { # NB. These hyperparameters are at the user's discretion and are beyond the scope of this demo.
        "base_network": "resnet-50",
        "use_pretrained_model": "1",
        "num_classes": "1",
        "mini_batch_size": "16",
        "epochs": "5",
        "learning_rate": "0.001",
        "lr_scheduler_step": "3,6",
        "lr_scheduler_factor": "0.1",
        "optimizer": "rmsprop",
        "momentum": "0.9",
        "weight_decay": "0.0005",
        "overlap_threshold": "0.5",
        "nms_threshold": "0.45",
        "image_shape": "300",
        "label_width": "350",
        "num_training_samples": str(num_training_samples)
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 86400
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
                    "S3Uri": s3_train_data_path,
                    "S3DataDistributionType": "FullyReplicated",
                    "AttributeNames": ["source-ref","Bird-Label-Train"] # NB. This must correspond to the JSON field names in your augmented manifest.
                }
            },
            "ContentType": "image/jpeg",
            "RecordWrapperType": "None",
            "CompressionType": "None"
        },
        {
            "ChannelName": "validation",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
                    "S3Uri": s3_validation_data_path,
                    "S3DataDistributionType": "FullyReplicated",
                    "AttributeNames": ["source-ref","Bird-Label"] # NB. This must correspond to the JSON field names in your augmented manifest.
                }
            },
            "ContentType": "image/jpeg",
            "RecordWrapperType": "None",
            "CompressionType": "None"
        }
    ]
}
After launching the job on my ml.p3.2xlarge instance, the following is printed:
InProgress Starting
InProgress Starting
InProgress Starting
InProgress Training
Failed Failed
Followed by this error message: 'ClientError: train channel is not specified.'
Does anyone have any thoughts on how I can get this running with no errors? Any help is much appreciated!
Successful run: Below are the parameters that were used, along with the augmented manifest JSON objects, for a successful run.
training_params = \
{
    "AlgorithmSpecification": {
        "TrainingImage": training_image, # NB. This is one of the named constants defined in the first cell.
        "TrainingInputMode": "Pipe"
    },
    "RoleArn": role,
    "OutputDataConfig": {
        "S3OutputPath": s3_output_path
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.p3.2xlarge",
        "VolumeSizeInGB": 50
    },
    "TrainingJobName": job_name,
    "HyperParameters": { # NB. These hyperparameters are at the user's discretion and are beyond the scope of this demo.
        "base_network": "resnet-50",
        "use_pretrained_model": "1",
        "num_classes": "3",
        "mini_batch_size": "1",
        "epochs": "5",
        "learning_rate": "0.001",
        "lr_scheduler_step": "3,6",
        "lr_scheduler_factor": "0.1",
        "optimizer": "rmsprop",
        "momentum": "0.9",
        "weight_decay": "0.0005",
        "overlap_threshold": "0.5",
        "nms_threshold": "0.45",
        "image_shape": "300",
        "label_width": "350",
        "num_training_samples": str(num_training_samples)
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 86400
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
                    "S3Uri": s3_train_data_path,
                    "S3DataDistributionType": "FullyReplicated",
                    "AttributeNames": attribute_names # NB. This must correspond to the JSON field names in your **TRAIN** augmented manifest.
                }
            },
            "ContentType": "application/x-recordio",
            "RecordWrapperType": "RecordIO",
            "CompressionType": "None"
        },
        {
            "ChannelName": "validation",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "AugmentedManifestFile", # NB. Augmented Manifest
                    "S3Uri": s3_validation_data_path,
                    "S3DataDistributionType": "FullyReplicated",
                    "AttributeNames": ["source-ref","ValidateBird"] # NB. This must correspond to the JSON field names in your **VALIDATION** augmented manifest.
                }
            },
            "ContentType": "application/x-recordio",
            "RecordWrapperType": "RecordIO",
            "CompressionType": "None"
        }
    ]
}
Training augmented manifest file, generated during the training job:
Line 1
{"source-ref":"s3://XXXXX/Train/Blackbird_1.JPG","TrainBird":{"annotations":[{"class_id":0,"width":1613,"top":840,"height":766,"left":1293}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:21:29.829003","type":"groundtruth/object-detection"}}
Line 2
{"source-ref":"s3://xxxxx/Train/Blackbird_2.JPG","TrainBird":{"annotations":[{"class_id":0,"width":897,"top":665,"height":1601,"left":1598}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:22:34.502274","type":"groundtruth/object-detection"}}
Line 3
{"source-ref":"s3://XXXXX/Train/Blackbird_3.JPG","TrainBird":{"annotations":[{"class_id":0,"width":1040,"top":509,"height":1695,"left":1548}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"TrainBird-metadata":{"job-name":"labeling-job/trainbird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:20:26.660164","type":"groundtruth/object-detection"}}
I then unzip the model.tar file to get the following files: hyperparams.JSON, model_algo_1-0000.params and model_algo_1-symbol.
hyperparams.JSON looks like this:
{"label_width": "350", "early_stopping_min_epochs": "10", "epochs": "5", "overlap_threshold": "0.5", "lr_scheduler_factor": "0.1", "_num_kv_servers": "auto", "weight_decay": "0.0005", "mini_batch_size": "1", "use_pretrained_model": "1", "freeze_layer_pattern": "", "lr_scheduler_step": "3,6", "early_stopping": "False", "early_stopping_patience": "5", "momentum": "0.9", "num_training_samples": "11", "optimizer": "rmsprop", "_tuning_objective_metric": "", "early_stopping_tolerance": "0.0", "learning_rate": "0.001", "kv_store": "device", "nms_threshold": "0.45", "num_classes": "1", "base_network": "resnet-50", "nms_topk": "400", "_kvstore": "device", "image_shape": "300"}
Upvotes: 3
Views: 1646
Reputation: 41
Thank you again for your help; all of it was valid and got me further along. Having received a response on the AWS forum pages, I finally got it working.
I realised that my JSON was slightly different from the augmented manifest training guide. Going back to basics, I created another labelling job, but used the built-in 'Bounding Box' task type instead of the 'Custom - Bounding box template'. This time my output matched what was expected, and the job ran with no errors!
As my purpose was to have multiple labels, I was then able to edit the output manifests and their class mappings, which also worked!
i.e.
{"source-ref":"s3://xxxxx/Blackbird_15.JPG","ValidateBird":{"annotations":[{"class_id":0,"width":2023,"top":665,"height":1421,"left":1312}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"ValidateBird-metadata":{"job-name":"labeling-job/validatebird","class-map":{"0":"Blackbird"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:23:51.174131","type":"groundtruth/object-detection"}}
{"source-ref":"s3://xxxx/Pigeon_19.JPG","ValidateBird":{"annotations":[{"class_id":2,"width":784,"top":634,"height":1657,"left":1306}],"image_size":[{"width":3872,"depth":3,"height":2592}]},"ValidateBird-metadata":{"job-name":"labeling-job/validatebird","class-map":{"2":"Pigeon"},"human-annotated":"yes","objects":[{"confidence":0.09}],"creation-date":"2019-02-09T14:23:51.074809","type":"groundtruth/object-detection"}}
The original mapping from the labelling job was 0: 'Bird' for all images.
Upvotes: 1
Reputation: 186
Unfortunately, Pipe mode with the AugmentedManifestFile data type is not supported for the image/jpeg content type. To use this feature, you will need to specify RecordWrapperType as RecordIO and ContentType as application/x-recordio.
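Put differently, each channel in InputDataConfig would need to look roughly like this for Pipe mode with an augmented manifest (a sketch; the S3 URI and attribute names are placeholders to be replaced with your own):

```python
# Sketch of a train channel for Pipe mode + AugmentedManifestFile.
train_channel = {
    "ChannelName": "train",
    "DataSource": {
        "S3DataSource": {
            "S3DataType": "AugmentedManifestFile",
            "S3Uri": "s3://your-bucket/train.manifest",  # placeholder
            "S3DataDistributionType": "FullyReplicated",
            "AttributeNames": ["source-ref", "TrainBird"],  # must match your manifest's field names
        }
    },
    # Pipe mode with an augmented manifest requires RecordIO wrapping:
    "ContentType": "application/x-recordio",
    "RecordWrapperType": "RecordIO",
    "CompressionType": "None",
}
```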
Upvotes: 3
Reputation: 1349
The 'AttributeNames' parameter needs to be ['source-ref', 'your label here'] in both your train and validation channels.
Upvotes: 1