Reputation: 162
The Computer Vision API offers at least two ways of obtaining tags for an image.
According to the documentation there is a list of tags under "description" as well as a top-level "tags" list. The top-level "tags" list additionally contains confidence levels; the "description.tags" list does not.
However, the description.tags list seems to be much more exhaustive than the top-level tags list, usually showing 5 to 10 times as many tags.
Why are there two different lists of tags, and what is the use case for each? Also, is there a reason why the description.tags list shows far more tags than the top-level tags list?
Intuitively I would expect the opposite, namely a description based on a subset of the main tags list.
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses",
[...]
],
---------
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
}
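For reference, here is a minimal sketch of how both lists can be requested in one call, assuming the v3.2 Analyze endpoint and Python with the requests library; the endpoint URL and subscription key are placeholders, not real values:

import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
SUBSCRIPTION_KEY = "<your-key>"

def analyze(image_url):
    # Requesting both feature sets returns the top-level "tags" list
    # (with confidences) and the "description" block (tags without confidences).
    response = requests.post(
        f"{ENDPOINT}/vision/v3.2/analyze",
        params={"visualFeatures": "Tags,Description"},
        headers={"Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY},
        json={"url": image_url},
    )
    response.raise_for_status()
    result = response.json()
    scored_tags = result["tags"]                      # [{"name": ..., "confidence": ...}, ...]
    description_tags = result["description"]["tags"]  # ["person", "man", ...]
    return scored_tags, description_tags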
Upvotes: 3
Views: 186
Reputation: 468
But why do the tags have confidence scores but the description.tags don't?
Also, I noticed in one response that the fourth element differed between the two lists.
Upvotes: 0
Reputation: 2973
Although the lists are similar today, there is no guarantee, now or in the future, that they will remain this way. The idea behind the tags inside the description was to give API users the option of constructing a sentence of their own from those tags when the confidence of the generated natural-language sentence was low.
Note that in both the tags and description.tags cases, the terms are in descending confidence order.
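As an illustration of that fallback, here is a small Python sketch that uses the generated caption when it is confident enough and otherwise joins the strongest description.tags, relying on the descending confidence order noted above; the threshold value is purely illustrative, and it assumes the response's description block with its "captions" field:

CAPTION_CONFIDENCE_THRESHOLD = 0.5  # illustrative cutoff, not an API-defined value

def caption_or_fallback(description):
    # "description" is the description block of an Analyze response.
    captions = description.get("captions", [])
    if captions and captions[0]["confidence"] >= CAPTION_CONFIDENCE_THRESHOLD:
        return captions[0]["text"]
    # Tags are ordered by descending confidence, so the first few are the strongest.
    return ", ".join(description["tags"][:5])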
Upvotes: 1