Reputation: 63
As part of my dissertation for my BSc degree, I have to utilize image recognition on a video feed.
I have identified OpenCV and TensorFlow (specifically the pre-trained Inception model) as two options, but I don't know how to proceed from there. Basically, I need to pass in a string such as "keys" and get back a boolean that is true if "keys" is one of the top 5 predictions.
Just to mention, I took an online Python course, since both libraries use Python. I also have fairly solid experience with Java, which we have used at university for the past two years.
Note that I do not need to create a whole new image recognition system, I need to use one to tell me what my camera is seeing.
Also, although the input is video, I think processing every frame of the feed would be too demanding. My idea is to pick 1 frame out of every 30 (assuming a 30 fps feed) and run image recognition on that.
Thanks in advance!
Upvotes: 2
Views: 145
Reputation: 2190
Your project should be fairly straightforward if you read through this tutorial, specifically the section "Usage with Python API". The top N outcomes produced by classify_image.py are converted into human-readable text in this block of code:
top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
for node_id in top_k:
    human_string = node_lookup.id_to_string(node_id)
    score = predictions[node_id]
    print('%s (score = %.5f)' % (human_string, score))
For your example, you'd want to set FLAGS.num_top_predictions to 5 and accumulate the top 5 human_string values with something like:
top_k_strings = []
top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
for node_id in top_k:
    human_string = node_lookup.id_to_string(node_id)
    top_k_strings.append(human_string)
    score = predictions[node_id]
    print('%s (score = %.5f)' % (human_string, score))
Finally, inside whatever function wraps this code, you can check whether "keys" is one of the top 5 strings the ImageNet classifier produced and return a boolean with:
return "keys" in top_k_strings
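To make the whole check concrete, here is a minimal, self-contained sketch of the same idea in plain Python. The function name label_in_top_k, the scores, and the labels are all made up for illustration; they stand in for predictions and the strings returned by node_lookup.id_to_string. It also assumes a label may list several comma-separated synonyms, so it compares against each synonym rather than the whole string. Adjust that if your labels look different.

```python
def label_in_top_k(scores, labels, target, k=5):
    """Return True if `target` names one of the k highest-scoring labels.

    scores: one float per class (hypothetical stand-in for `predictions`).
    labels: human-readable strings in the same order as `scores`.
    """
    # Indices of the k largest scores, best first. This is a pure-Python
    # equivalent of predictions.argsort()[-k:][::-1].
    top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    wanted = target.strip().lower()
    for i in top_k:
        # Assumption: a label may hold several synonyms separated by commas.
        names = [name.strip().lower() for name in labels[i].split(",")]
        if wanted in names:
            return True
    return False


# Made-up example data, not real model output.
scores = [0.05, 0.60, 0.02, 0.20, 0.01, 0.08, 0.04]
labels = ["mug", "keys", "pen", "wallet, billfold", "lamp", "phone", "book"]

print(label_in_top_k(scores, labels, "keys"))
print(label_in_top_k(scores, labels, "lamp"))
```

With real model output you would pass the prediction scores and the matching list of label strings instead of the toy lists.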
Also, if you're interested in the full list of human-readable categories, you can find them here.
With respect to video, you're probably right that you'll have to subsample the video sequence to keep up with the frame rate. Some experimentation and timing tests will give you a feel for the subsampling rate that is needed.
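As a rough sketch of the subsampling idea, the snippet below keeps every n-th frame from any iterable of frames. The helper names every_nth and read_frames are hypothetical, and read_frames assumes OpenCV (cv2) is installed and that camera index 0 is your feed; the value 30 comes from the question's assumption of a 30 fps feed and is only a starting point for your timing tests.

```python
def every_nth(frames, n):
    """Yield (index, frame) for every n-th item of an iterable of frames."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield index, frame


def read_frames(source=0):
    """Yield frames from a camera index or video file path using OpenCV."""
    import cv2  # imported lazily so every_nth works without OpenCV installed

    capture = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                # No more frames (end of file or camera error).
                break
            yield frame
    finally:
        capture.release()


# Usage sketch, assuming a hypothetical classify(frame) function:
# for index, frame in every_nth(read_frames(0), 30):
#     classify(frame)
```

If classification takes longer than the gap between sampled frames, increase n until the loop keeps up with the feed.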
Good luck!
Upvotes: 2