Reputation: 979
There is my entity table and tags bound to it through another table (many-to-many relation).
I have an entity selected and my goal is to find a set of entities which have as much similar tags with it as possible. This set must be ordered by 'similarity' – amount of tags each entity shares with chosen entity. Tags are similar if they have the same ids.
I'm wondering if there's elegant and fast way to do that with single query.
The only solution I see now is to fetch all the tag-entity relations and compute their similarity in my application, and then make another database query to select what i've computed, but it doesn't look very graceful.
Database structure:
entity id ...
tag id name
entity_tag entity_id tag_id
Update: final solution for MySQL.
So I have tables for paintings, tags and painting_tag relation. This query fetches similar paintings and their 'similarity index' for previously selected painting.
SELECT site_painting.*, Count(tr.tag_id) as similarity
From site_painting_tag_relation as tr
Inner Join site_painting_tag_relation as tr2 ON ( tr2.tag_id = tr.tag_id and tr2.painting_id = :id )
Left join site_painting on site_painting.id=tr.painting_id
Where tr.painting_id <> :id
Group By tr.painting_id
Having Count(*) > 0
Order By Count(*) DESC, tr.painting_id limit 1
Upvotes: 1
Views: 1802
Reputation: 1959
OK, The info on db structure helps a lot --
entity id ...
entity_tag entity_id tag_id
tag id name
Lets look at some example values--
entity
id=100
id=...
id=199
entity_tag
100, 3
100, 5
101, 1
102, 7
...
199, 3
199, 7
tag
id=1
id=...
id=10
So if we denormalize the entity_tag we have
100 3,5
101 1
102 7
199 3 7
And a similarity index of 199 to the rest, on a scale of 0 to 10
100 1 in common
101 0 in common
102 1 in common
199 self, no comparison
And if I have it right, we want to display 100 and 102 being the highest, nu??
Here is a stab at the SQL-- It might be something like ---
SELECT TOP 10
FROM
(SELECT
allET.EID,
Count(*) as Similarity
From entity_tag as allET
Left Join
(Select * From entity_tag Where EID = myEID ) as myET
On allET.TID = myET.TID
Where allET.EID <> myEID
Group By allET.EID
Having Count(*) > 0
Order By Count(*) DESC, allET.EID
)
Upvotes: 2