Reputation: 2706
I have two tables. I want to get similar record with percentage.
Please look my code:
users
+----+------------+----------+
| id | fname | lname |
+----+------------+----------+
| 1 | sanjib | pradhan |
| 2 | aruna | avipsa |
| 3 | chinu | sahu |
| 4 | sradha | behera |
| 5 | debasish | dash |
| 6 | pikun | behera |
| 7 | pradep | hardy |
+----+------------+----------+
user_tags
+-----+---------+------------+
| id | user_id | tag |
+-----+---------+------------+
| 1 | 1 | php |
| 2 | 1 | java |
| 3 | 2 | java |
| 4 | 1 | dotnet |
| 5 | 2 | oracle |
| 6 | 3 | oracle |
| 7 | 3 | python |
| 8 | 4 | sql |
| 9 | 3 | java |
+-----+---------+------------+
There have two table users
and user_tags
I need 3 records percentage. user_id 1 having 3 tags php, java, dotnet
user_id 2 having 2 tags, user_id 3 having 3 tags and 4 having 1 tag.
Suppose my query based on id=2 in the users table. There have 2 tags on the tags table. I need these two tags with similar to other users tag.
user_id=2
tags=java,oracle
user_id=1
tags=php,java,dotnet (matches tag java is exist - percent should be number of matches tag/total tag) means 1/3
user_id=3
tags=oracle, python, java
2 tags matches(java and oracle)
percent should be 2/3
user_id = 4, tag=sql (No matches tag so percent is 0%)
How to manage this by using MySQL and php code?
Upvotes: 0
Views: 321
Reputation: 9010
We can get the answer you want by left joining all the users with their tags, to user 2 with their tags. Then we just count the appropriate values to get your percentage.
select u1.id,
count(ut2.tag) matches,
count(ut1.tag) total,
count(ut2.tag) / count(ut1.tag) pct
from users u1
inner join user_tags ut1
on u1.id = ut1.user_id
left join
users u2
inner join user_tags ut2
on u2.id = ut2.user_id
on ut2.tag = ut1.tag
and u2.id = 2
where u1.id <> 2
group by u1.id
How does this work? Before the group by and aggregate functions, the rows returned look like this:
id fname lname id user_id tag id fname lname id user_id tag
1 sanjib pradhan 2 1 java 2 aruna avipsa 3 2 java
3 chinu sahu 9 3 java 2 aruna avipsa 3 2 java
3 chinu sahu 6 3 oracle 2 aruna avipsa 5 2 oracle
1 sanjib pradhan 1 1 php (null) (null) (null) (null) (null) (null)
1 sanjib pradhan 4 1 dotnet (null) (null) (null) (null) (null) (null)
3 chinu sahu 7 3 python (null) (null) (null) (null) (null) (null)
4 sradha behera 8 4 sql (null) (null) (null) (null) (null) (null)
As you can see - where the tag for the user on the left does not match the tag for the user on the right, null values are entered. The count
function in mysql ignores null values, and does not count them. Because of this, when we group by u1.id (the id value at the far left) and count the number of tags they have (ut1.tag
), we get the total of all of their tags. However, when we count the value for ut2.tag
, we only get the total that aren't null
, ie, only the ones that match. This allows us to get your percentage.
edit for comment
You added the requirement in the comments to be able to compare against multiple users at once, since this means it may match the same tag more than once, we need to only count the distinct elements.
select u1.id,
count(distinct ut2.tag) matches,
count(ut1.tag) total,
count(distinct ut2.tag) / count(ut1.tag) pct
from users u1
inner join user_tags ut1
on u1.id = ut1.user_id
left join
users u2
inner join user_tags ut2
on u2.id = ut2.user_id
on ut2.tag = ut1.tag
and u2.id IN (2,3)
where u1.id NOT IN (2,3)
group by u1.id
having count(ut2.tag) > 0;
With your updated fiddle here
Upvotes: 3