Dan
Dan

Reputation: 77

Select only unique records from multiple columns

I have a table that logs downloads by IP, version and platform. Looking at the table manually I see a lot of duplicates where all 3 of those values are the same. (user is probably just impatient) I'd like to use a SELECT statement that filters out the duplicates and only returns one of the entries if all 3 of those values are the same. Even more advanced, if possible, I also have a date/time field that uses CURRENT_TIMESTAMP. Would be nice if I could include duplicates if they are from different days, but not different times. So I can see if the same user is downloading again on a different day.

I'm mainly just trying to get statistics on how many unique people download each version each day. The structure of the DB table is simple...

key (AUTO_INCREMENT), date (CURRENT_TIMESTAMP), ip, user_agent, platform, version

The software has a Windows and Mac version (platform) and I offer both the current version and a few distinct past versions that were before major changes.

Upvotes: 0

Views: 56

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269463

Is this what you want? It returns the first record on each date for the ip/platform/version combination:

select t.*
from <tablename> t
where t.datetime = (select min(t2.datetime)
                    from <tablename> t2
                    where t2.ip = t.ip and
                          t2.platform = t.platform and
                          t2.version = t.version and
                          date(t2.datetime) = date(t.datetime)
                   );

Upvotes: 0

Steve T
Steve T

Reputation: 572

Just group by the fields you want to exclude from being duplicated, like

SELECT ip, platform, version, COUNT(*) AS number_of_tries, max(download_date) AS last_download_date 
FROM downloads
GROUP BY ip, platform, version, DATE(download_date)

It would then be relatively easy to do some more advanced filtering over the result grouping by day, etc.

Upvotes: 1

Zaynul Abadin Tuhin
Zaynul Abadin Tuhin

Reputation: 32003

mysql 8.0+ version you can use row_number()

select * from (select *,
row_number()over(partition by ip,platform,date(datetime) order by datetime) rn
       from table_name 
) a where a.rn=1

Upvotes: 0

Related Questions