Reputation: 17
I'm creating a Ruby-on-rails where photos can be uploaded, analysed and tagged. We're using Azure's Cognitive Services which can e.g. return a list of tags associated with a given image.
An example of tags for one image in JSON format is:
{"tags":[
{"tag_name":"outdoor","tag_confidence":"99.87"},
{"tag_name":"clothing","tag_confidence":"99.56"},
{"tag_name":"person","tag_confidence":"99.55"},
{"tag_name":"human face","tag_confidence":"93.33"},
{"tag_name":"tree","tag_confidence":"93.17"},
{"tag_name":"smile","tag_confidence":"91.33"}}
Right now we're just storing this as a hash in the table, but later we'd like to be able to do filtering or queries, where we can e.g. get all images with the person-tag. How do I go about creating the model and the tables for this in the best way?
Upvotes: 0
Views: 309
Reputation: 101811
If you're planning on querying the data and it has a regular structure then you might be better served by using actual tables.
# rails g model tag name:string:uniq
class Tag < ApplicationRecord
validate_uniqueness_of :name
has_many :taggings
has_many :photos, through: :taggings
end
# rails g model tagging tag:belongs_to photo:belongs_to confidence:float
# add a unique index on tag_id and photo_id
class Tagging < ApplicationRecord
validate_uniqueness_of :tag_id, scope: :photo_id
belongs_to :tag
belongs_to :photo
end
class Photo < ApplicationRecord
has_many :taggings
has_many :tags, through: :taggings
end
class TaggingService
def initialize(photo)
@photo = photo
@tag_data = get_it_somehow(photo)
end
def perform(photo)
@tag_data.each do |raw|
photo.taggings.create(
confidence: raw["tag_confidence"],
tag: Tag.find_or_create_by(name: raw["tag_name"])
)
end
end
def self.perform(photo)
new(photo).perform
end
end
You can then for example query for a photo with a given tag just by:
Photo.joins(:tags)
.where(tags: { name: 'outdoor' })
Or a set of tags:
Photo.left_joins(:tags)
.group(:id)
.where(tags: { name: ['outdoor', 'clothing', 'tree'] })
.having('COUNT(tags.*) >= 3')
Sure you can do this with a JSON column but the queries are going to be much more difficult to read and you lose all the advantages to having database normalization and having actual models to work with.
One example of this is if you for example want to display tags based on aggregates:
Tag.left_joins(:taggings)
.order('COUNT(taggings.*)', 'AVG(taggings.confidence)')
If you used a JSONB column you would have to extract that from every row on the photos
table with a nightmarish query.
Upvotes: 2