Damb

Reputation: 14600

Training a Microsoft Custom Vision model via the REST API

I am working on a simple Node.js console utility that will upload images for training a Custom Vision model. I do this mainly because the Custom Vision web app won't let you tag multiple images at once.

tl;dr: How do I post images to the CreateImagesFromFiles API endpoint?

I cannot figure out how to pass the images that I want to upload. The documentation just defines string as the type of one of the properties (contents, I guess). I tried passing a path to a local file, a URL to an online file, and even a base64-encoded image as a string. None of them worked.

They have a testing console (the blue "Open API testing console" button on the linked docs page), but once again it's vague and won't tell you what kind of data it actually expects.

The code here isn't that relevant, but maybe it helps...

const options = {
    host: 'southcentralus.api.cognitive.microsoft.com',
    path: `/customvision/v2.0/Training/projects/${projectId}/images/files`,
    method: 'POST',
    headers: {
        'Training-Key': trainingKey,
        'Content-Type': 'application/json'
    }
};

const data = {
    images: [
        {
            name: 'xxx',
            contents: 'iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAEklEQVR42mP8z8AARKiAkQaCAFxlCfyG/gCwAAAAAElFTkSuQmCC',
            tagIds: [],
            regions: []
        }
    ],
    tagIds: []
}

const req = http.request(options, res => {
  ...
})
req.write(JSON.stringify(data));
req.end();

Response:

BODY: { "statusCode": 404, "message": "Resource not found" }
No more data in response.

Upvotes: 3

Views: 887

Answers (1)

Nicolas R

Reputation: 14619

I got it working using the "API testing console" feature, so I can help you identify your issue (sorry, I'm not an expert in Node.js, so I will guide you with C# code).

Format of content for API

You are right, the documentation is not clear about the content the API expects. I did some searching and found a project in one of Microsoft's GitHub repositories called Cognitive-CustomVision-Windows, here.

What I saw is that they use a class called ImageFileCreateEntry, whose signature is visible here:

public ImageFileCreateEntry(string name = default(string), byte[] contents = default(byte[]), IList<System.Guid> tagIds = default(IList<System.Guid>))

So I guessed it's using a byte[].

You can also see in their sample how they handle this "batch" mode:

// Or uploaded in a single batch 
var imageFiles = japaneseCherryImages.Select(img => new ImageFileCreateEntry(Path.GetFileName(img), File.ReadAllBytes(img))).ToList();
trainingApi.CreateImagesFromFiles(project.Id, new ImageFileCreateBatch(imageFiles, new List<Guid>() { japaneseCherryTag.Id }));

This byte array is then serialized with Newtonsoft.Json: its documentation (here) says that a byte[] is converted to a string (base64 encoded). That's our target.
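For the Node.js side of the question, the same encoding can be sketched roughly as below. This is only a sketch, assuming the service accepts the JSON shape used in the question's request body; toImageEntry and buildBatch are hypothetical helper names, not part of any SDK, and the eight PNG signature bytes stand in for real file contents (which you would normally get from fs.readFileSync).

```javascript
// Hypothetical helpers (not part of any SDK): mimic what Newtonsoft.Json does
// with a byte[] by base64-encoding the raw image bytes before serialization.

// bytes: a Buffer with the raw file contents, e.g. from fs.readFileSync(path).
function toImageEntry(name, bytes, tagIds = []) {
  return {
    name,
    contents: bytes.toString('base64'), // plain base64, no "data:image/..." prefix
    tagIds,
    regions: []
  };
}

// Wraps entries in the batch shape used by the request body in the question.
function buildBatch(entries, tagIds = []) {
  return { images: entries, tagIds };
}

// Stand-in data: the first eight bytes of any PNG file.
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const batch = buildBatch([toImageEntry('sample.png', pngSignature)]);

console.log(JSON.stringify(batch));
```

The encoded value starts with "iVBORw0KGgo", the same prefix as the contents string in the question, which suggests the question's string was already plain base64 of a PNG.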

Implementation

Since you mentioned that you had tried a base64-encoded image, I tested that myself. I downloaded my Stack Overflow profile picture locally, then used the following to get the base64-encoded string:

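using System;
using System.Drawing;
using System.IO;

// Load the image and re-encode it as JPEG into an in-memory byte array.
byte[] arr;
using (Image img = Image.FromFile(@"\\Mac\Home\Downloads\Picto.jpg"))
using (MemoryStream ms = new MemoryStream())
{
    img.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
    arr = ms.ToArray();
}

// Base64-encode the bytes; this string goes into the "contents" field.
var content = Convert.ToBase64String(arr);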

Later on, I called the API with no tags to ensure that the image is posted and visible:

POST https://southcentralus.api.cognitive.microsoft.com/customvision/v2.2/Training/projects/MY_PROJECT_ID/images/files HTTP/1.1
Host: southcentralus.api.cognitive.microsoft.com
Training-Key: MY_OWN_TRAINING_KEY
Content-Type: application/json

{
  "images": [
    {
      "name": "imageSentByApi",
      "contents": "/9j/4AAQSkZJRgA...TOO LONG FOR STACK OVERFLOW...",
      "tagIds": [],
      "regions": []
    }
  ],
  "tagIds": []
}

Response received: 200 OK

{
  "isBatchSuccessful": true,
  "images": [{
    "sourceUrl": "imageSentByApi",
    "status": "OK",
    "image": {
      "id": "GENERATED_ID_OF_IMAGE",
      "created": "2018-11-05T22:33:31.6513607",
      "width": 328,
      "height": 328,
      "resizedImageUri": "https://irisscuprodstore.blob.core.windows.net/...",
      "thumbnailUri": "https://irisscuprodstore.blob.core.windows.net/...",
      "originalImageUri": "https://irisscuprodstore.blob.core.windows.net/..."
    }
  }]
}

And my image now shows up in the Custom Vision portal!

[Screenshot: the uploaded image in the Custom Vision portal]
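A reply like the one shown above can also be checked in code rather than in the portal. The following is a minimal sketch that relies only on the fields visible in that reply (isBatchSuccessful, images[].status, images[].sourceUrl, images[].image.id); summarizeUpload is a hypothetical helper name, and treating any status other than "OK" as a problem is my assumption, not something the documentation confirmed for me.

```javascript
// Hypothetical helper: condenses a reply shaped like the one above.
function summarizeUpload(responseBody) {
  const parsed = JSON.parse(responseBody);
  const images = Array.isArray(parsed.images) ? parsed.images : [];
  return {
    batchOk: parsed.isBatchSuccessful === true,
    // Ids of images the service reports as stored.
    uploadedIds: images
      .filter(entry => entry.status === 'OK' && entry.image)
      .map(entry => entry.image.id),
    // Assumption: any status other than "OK" deserves a closer look.
    problems: images
      .filter(entry => entry.status !== 'OK')
      .map(entry => ({ name: entry.sourceUrl, status: entry.status }))
  };
}

// Sample input trimmed down from the reply shown above.
const sample = JSON.stringify({
  isBatchSuccessful: true,
  images: [
    { sourceUrl: 'imageSentByApi', status: 'OK', image: { id: 'GENERATED_ID_OF_IMAGE' } }
  ]
});

console.log(summarizeUpload(sample));
```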

Debugging your code

To debug, first try submitting your content again with the tagIds and regions arrays empty, as in my test, then post the content of the API reply.
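One way to capture the full reply from Node.js is sketched below. It assumes Node's built-in https module and the same host and v2.2 path as the raw request in my test; buildRequestOptions and postImages are hypothetical helper names, and projectId and trainingKey are placeholders for your own values. I have not run this against the service, so treat it as a starting point.

```javascript
const https = require('https');

// Builds request options matching the raw HTTP request from my test.
// projectId and trainingKey are placeholders supplied by the caller.
function buildRequestOptions(projectId, trainingKey, body) {
  return {
    host: 'southcentralus.api.cognitive.microsoft.com',
    path: `/customvision/v2.2/Training/projects/${projectId}/images/files`,
    method: 'POST',
    headers: {
      'Training-Key': trainingKey,
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body)
    }
  };
}

// Sends the batch and resolves with the status code and the raw reply body,
// so that both can be logged or posted when something goes wrong.
function postImages(projectId, trainingKey, batch) {
  const body = JSON.stringify(batch);
  return new Promise((resolve, reject) => {
    const options = buildRequestOptions(projectId, trainingKey, body);
    const req = https.request(options, res => {
      let raw = '';
      res.setEncoding('utf8');
      res.on('data', chunk => { raw += chunk; });
      res.on('end', () => resolve({ statusCode: res.statusCode, body: raw }));
    });
    req.on('error', reject);
    req.write(body);
    req.end();
  });
}

// Usage (performs a real network call, so it is left commented out):
// postImages(projectId, trainingKey, { images: [], tagIds: [] })
//   .then(reply => console.log(reply.statusCode, reply.body))
//   .catch(err => console.error(err));
```

Logging both the status code and the raw body makes it easier to tell a routing problem (such as the 404 in the question) from a payload problem.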

Upvotes: 3
