Minadorae
Minadorae

Reputation: 301

SQL Count Distinct Using Multiple Unique Identifiers

My company ran a series of TV ads and we're measuring the impact by changes in our website traffic. I would like to determine the cost per session we saw generated, based on the cost of each ad.

The trouble is, the table this is referencing has duplicate data, so my currently cost_per_session isn't counting right.

What I have so far:

.

SELECT DISTINCT tadm.timestamp AS ad_time
, tadm.media_outlet AS media_outlet
, tadm.program AS program
, tadm.client_net_cleared AS client_net_cleared
, SUM(tadm.before_ad_sum) AS before_ad_sessions
, SUM(tadm.after_ad_sum) AS after_ad_sessions
, (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) AS diff
, CASE WHEN tadm.client_net_cleared = 0 THEN null
    WHEN (SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)) <1 THEN null
    ELSE (tadm.client_net_cleared/(SUM(tadm.after_ad_sum) - SUM(tadm.before_ad_sum)))
    END AS cost_per_session

FROM tableau.km_tv_ad_data_merged tadm

GROUP BY ad_time,media_outlet,program,client_net_cleared

Sample data:

 ad_time             | media_outlet  | program        | client_net_cleared | before_ad_sessions | after_add_sessions | diff | cost_per_session
---------------------|---------------|----------------|--------------------|--------------------|--------------------|------|-----------------
 2016-12-09 22:55:00 | DIY           |                |                970 |                 55 |                 72 |   17 |           57.05
 2016-12-11 02:22:00 | E!            | E! News        |                388 |                 25 |                 31 |    6 |           64.66
 2016-12-19 21:15:00 | Cooking       | The Best Thing |                428 |                 70 |                 97 |   27 |           15.85
 2016-12-22 14:01:00 | Oxygen        | Next Top Model |                285 |                 95 |                148 |   53 |           5.37
 2016-12-09 22:55:00 | DIY           |                |                970 |                 55 |                 72 |   17 |           57.05
 2016-12-04 16:13:00 | Headline News | United Shades  |               1698 |                 95 |                137 |   42 |           40.42

What I need: Only count one instance of each ad when calculating cost_per_session.

EDIT: Fixed the query, had a half completed row where I was failing at doing this before asking the question. :)

Upvotes: 0

Views: 103

Answers (1)

O. Jones
O. Jones

Reputation: 108776

Get rid of the DISTINCT in SELECT DISTINCT in the first line of your query. It makes no sense in a GROUP BY query.

If your rows are entirely duplicate, try deduplicating the table before you put it into the GROUP BY grinder by replacing

FROM tableau.km_tv_ad_data_merged tadm

with

FROM ( SELECT DISTINCT timestamp, media_outlet, program,
                       client_net_cleared,
                       before_ad_sum, after_ad_sum
         FROM tableau.km_tv_ad_data_merged
     ) tadm

Upvotes: 1

Related Questions