HUNG
HUNG

Reputation: 535

EDITED Looking for SQL improvement

By referencing Collaborative filtering in MySQL? , I have created the following ones:

CREATE TABLE `ub` (
  `user_id` int(11) NOT NULL,
  `book_id` varchar(10) NOT NULL,
  `rate` int(11) NOT NULL,
  PRIMARY KEY (`user_id`,`book_id`),
  UNIQUE KEY `book_id` (`book_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

insert into ub values (1, 'A', '8'), (1, 'B', '7'), (1, 'C', '10');
insert into ub values (2, 'A', '8'), (2, 'B', '7'), (2, 'C', '10'), (2,'D', '8'), (2,'X', '7');
insert into ub values (3, 'X', '10'), (3, 'Y', '8'), (3, 'C', '10'), (3,'Z', '10');
insert into ub values (4, 'W', '8'), (4, 'Q', '8'), (4, 'C', '10'), (4,'Z', '8');

Then, I can able to get the following table and understand how it works.

create temporary table ub_rank as 
select similar.user_id,count(*) rank
from ub target 
join ub similar on target.book_id= similar.book_id and target.user_id != similar.user_id and target.rate= similar.rate
where target.user_id = 1
group by similar.user_id;

select * from ub_rank;

+---------+------+
| user_id | rank |
+---------+------+
|       2 |    3 |
|       3 |    1 |
|       4 |    1 |
+---------+------+

However, I start to be confused after the following code.

select similar.rate, similar.book_id, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.user_id = similar.user_id 
left join ub target on target.user_id = 1 and target.book_id = similar.book_id and target.Rate= similar.Rate 
where target.book_id is null
group by similar.book_id
order by total_rank desc, rate desc;

+---------+------------+
| book_id | total_rank |
+---------+------------+
| X       |          4 |
| D       |          3 |
| Z       |          2 |
| Y       |          1 |
| Q       |          1 |
| W       |          1 |
+---------+------------+

(SOLVED) First, I wondering the total rank of X and D why not the same (i.e. 3). Isn't it count the number of books which the same as user A for user B? So, D and X should be 3?!

(SOLVED) Second, how should I modify the code such as the rate can act as an element for the ranking. That is, if the rank of 2 books are the same, then the one with higher marks will place higher rank.

Thanks

EDITED

(1, 'A', '8'), (1, 'B', '7'), (1, 'C', '10');

(2, 'A', '8'), (2, 'B', '7'), (2, 'C', '10'), (2,'D', '8'), (2,'X', '7');

What I wanna do is that, suppose user 1 and 2 have similar behavior ( chosen A,B,C before with matched rating), thus I will recommend D to user A , as it has a higher rate.

Seems the code above not to do so? As, the first ranked is X.

Upvotes: 1

Views: 85

Answers (1)

Akash
Akash

Reputation: 5012

First, I wondering the total rank of X and D why not the same (i.e. 3). Isn't it count the number of books which the same as user A for user B? So, D and X should be 3?!

X has a greater rank as its present for the second user_id and the third user_id, the query gets the total of the rank, in this case 3 (user_id = 2) + 1 (user_id = 3)

Second, how should I modify the code such as the rate can act as an element for the ranking. That is, if the rank of 2 books are the same, then the one with higher marks will place higher rank.

Use the same query and order it by rate desc after the rank, like

select similar.book_id, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.user_id = similar.user_id 
left join ub target on target.user_id = 1 and target.book_id = similar.book_id and target.Rate= similar.Rate 
where target.book_id is null
group by similar.book_id
order by total_rank desc, rate desc;

Update: As per your requirement, you need to the get the list of books that have the closest match with other users and have the maximum price, try the below query for the same

SELECT
    temp.book_id,
    temp.rate as book_rate
FROM (
        SELECT 
            similar.user_id, 
            COUNT( similar.book_id ) as book_match_count
        FROM 
            ub target
            JOIN ub similar ON  target.book_id= similar.book_id AND target.user_id != similar.user_id
        WHERE 
            target.user_id = 1
        GROUP BY 
            similar.user_id
    ) AS users_with_book_matches
JOIN ub temp ON ( temp.user_id =users_with_book_matches.user_id AND temp.book_id NOT IN ( SELECT book_id FROM ub WHERE ub.user_id = 1 ) ) 
GROUP BY
    temp.book_id
ORDER BY 
    users_with_book_matches.book_match_count DESC,
    temp.rate DESC
limit 5

The above query gets the top 5 closest book matches

Here's the SqlFiddle, make sure to change the user_id at 2 places, hope this serves your purpose

Upvotes: 1

Related Questions