Asik
Asik

Reputation: 7977

MySql Query- Exclude duplicate records in count based on date

I have table called post_likes which contain the following fields

Created_at is based on UTC time zone.

Our business logic is that user can add LIKE one time per day (based on PST timezone) for the post. Meaning, 1 LIKE from FB and 1 LIKE from Linkedin.

I have a condition in application side to check and restrict peoples who are trying to give like same day. Somehow, some users added duplicate LIKES for the same day (it might be code or web server issue. Just leave this for now).

My current query is as follows to get the count of each post

SELECT `id`,  
       `title`,  
       Sum(IF(`type` = 'FB_LIKES', like_count, 0)) AS fb_like_counts, 
       Sum(IF(`type` = 'LINKEDIN_LIKES', like_count, 0)) AS linkedin_like_counts, 
       Sum(`like_count`) AS total_like_counts 
FROM   `post_likes` 
GROUP  BY `title`
ORDER  BY `total_like_counts` DESC;

Now, I want to get DISTINCT counts. because some of the post has duplicate likes for the same day. Here same user can like next day.

Note: Sorry for my bad english and thanks in advance

Upvotes: 0

Views: 316

Answers (3)

Asik
Asik

Reputation: 7977

Sorry for the late update. Below is my final query which gives the correct result

SELECT `id`, 
   `title`, 
   Sum(IF(`type` = 'FB_LIKES', like_count, 0))       AS fb_like_counts, 
   Sum(IF(`type` = 'LINKEDIN_LIKES', like_count, 0)) AS linkedin_like_counts, 
   Sum(`like_count`)                                 AS 
   total_like_counts  FROM   (SELECT *, 
           Date_format(Convert_tz(created_at, '+00:00', '-8:00'), '%Y-%m-%d' 
   ) AS date_pst 
    FROM   `post_likes` 
    GROUP  BY type, 
              title, 
              date_pst, 
              uid_hash) AS D1 
    GROUP  BY `title` 
    ORDER  BY `total_like_counts` DESC; 

Upvotes: 0

Kavindu Wijesuriya
Kavindu Wijesuriya

Reputation: 157

You just have to make a few adjustments to your query to select distinct date from your table. For the following solution I added a new select statement to return the same post_likes table but with distinct dates. This will allow me to run my query on a filtered set of post_likes

SELECT `id`,  
       `title`,
       `user_hash` as liked_user,
       Sum(IF(`type` = 'FB_LIKES', like_count, 0)) AS fb_like_counts, 
       Sum(IF(`type` = 'LINKEDIN_LIKES', like_count, 0)) AS linkedin_like_counts, 
       Sum(`like_count`) AS total_like_counts 
FROM   (SELECT * FROM post_likes GROUP BY DATE_FORMAT(`created_date`, '%m %d %Y'), user_hash, type) AS post_likes
GROUP  BY `title`, DATE_FORMAT(`created_date`, '%m %d %Y'), user_hash
ORDER  BY `total_like_counts` DESC;

You can use the following query to recreate the schema

CREATE TABLE `post_likes` (
  `id` int(11) NOT NULL,
  `title` varchar(255) DEFAULT NULL,
  `post_id` int(11) NOT NULL,
  `type` enum('FB_LIKES','LINKEDIN_LIKES') NOT NULL,
  `user_hash` varchar(30) NOT NULL,
  `like_count` int(11) NOT NULL,
  `created_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP
);

insert into post_likes VALUES (
  1, 'A', 1, 'FB_LIKES', 'ABC', 1, '2018-06-25 05:49:41'
);

insert into post_likes VALUES (
2, 'A', 1, 'FB_LIKES', 'DEF', 1, '2018-06-25 05:50:01'
);

insert into post_likes VALUES (
3, 'A', 1, 'LINKEDIN_LIKES', 'ABC', 1, '2018-06-25 05:50:16'
);

insert into post_likes VALUES (
4, 'A', 1, 'LINKEDIN_LIKES', 'DEF', 1, '2018-06-25 05:50:23'
);

insert into post_likes VALUES (
5,'A', 1, 'LINKEDIN_LIKES', 'ABC', 1, '2018-06-25 05:50:16'
);

insert into post_likes VALUES (
6, 'A', 1, 'FB_LIKES', 'DEF', 1, '2018-06-25 05:50:01'
);

Upvotes: 1

Prasad Wargad
Prasad Wargad

Reputation: 765

Let's say below is the probable table structure

CREATE TABLE `post_likes` (
  `id` int(11) NOT NULL,
  `title` varchar(255) DEFAULT NULL,
  `post_id` int(11) NOT NULL,
  `type` enum('FB_LIKES','LINKEDIN_LIKES') NOT NULL,
  `user_hash` varchar(30) NOT NULL,
  `like_count` int(11) NOT NULL,
  `created_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;  

Below is the sample data:

INSERT INTO `post_likes` (`id`, `title`, `post_id`, `type`, `user_hash`, `like_count`, `created_date`) VALUES
(1, 'A', 1, 'FB_LIKES', 'ABC', 1, '2018-06-25 05:49:41'),
(2, 'A', 1, 'FB_LIKES', 'DEF', 1, '2018-06-25 05:50:01'),
(3, 'A', 1, 'LINKEDIN_LIKES', 'ABC', 1, '2018-06-25 05:50:16'),
(4, 'A', 1, 'LINKEDIN_LIKES', 'DEF', 1, '2018-06-25 05:50:23'),
(5, 'A', 1, 'FB_LIKES', 'ABC', 1, '2018-06-26 05:50:38');

Here comes the query with date field to have unique records day-wise considering you don't have any duplicate records present in table:

SELECT `id`, `title`, Sum(IF(`type` = 'FB_LIKES', like_count, 0)) AS fb_like_counts, Sum(IF(`type` = 'LINKEDIN_LIKES', like_count, 0)) AS linkedin_like_counts, Sum(`like_count`) AS total_like_counts, DATE_FORMAT(`created_date`, '%Y-%m-%d') AS `liked_date` 
FROM `post_likes` 
GROUP BY `title`, `liked_date`
ORDER BY `total_like_counts` DESC;

Upvotes: 0

Related Questions