MCM13
MCM13

Reputation: 275

How to use 'Distinct' for just one column?

I have a query checking the visits from some "locations" table I have. If the user signed up with a referral of "emp" or "oth", their first visit shouldn't count but the second visit and forward should count.

I'm trying to get a count of those "first visits" per location. Whenever they do a visit, I get a record on which location it was.

The problem is that my query is counting correctly, but some users have visits on different locations. So instead of just counting one visit for that location (the first one), is adding one per location where a user has done a visit.

This is my query

SELECT COUNT(DISTINCT CASE WHEN customer.ref IN ('emp', 'oth') THEN customer.id END) as visit_count, locations.name as location FROM locations
LEFT JOIN visits ON locations.location_name = visits.location_visit_name
LEFT JOIN customer ON customer.id = visits.customer_id
WHERE locations.active = true
GROUP BY locations.location_name, locations.id;

The results I'm getting are

visit_count |  locations
-------------------------
    7       |      Loc 1
    3       |      Loc 2
    1       |      Loc 3

How it should be:

visit_count |  locations
-------------------------
    6       |      Loc 1
    2       |      Loc 2
    1       |      Loc 3

Because 2 of these people have visits on both locations, so its counting one for each location. I think the DISTINCT is also doing it for the locations, when it should be only on the counting for the customer.id

Is there a way I can add something to my query to just grab the location for the first visit, without caring they have done other visits on other locations?

Upvotes: 0

Views: 36

Answers (2)

GMB
GMB

Reputation: 222462

If I followed you correctly, you want to count only the first visit of each customer, spread by location.

One solution would be to use a correlated subquery in the on clause of the relevant join to filter on first customer visits. Assuming that column visit(visit_date) stores the date of each visit, you could do:

select 
    count(c.customer_id) visit_count,
    l.name as location
from locations l
left join visits v 
    on  l.location_name = v.location_visit_name
    and v.visit_date = (
        select min(v1.visit_date)
        from visit v1
        where v1.customer_id = v.customer_id
    )
left join customer c 
    on  c.id = v.customer_id 
    and c.ref in ('emp', 'oth')
where l.active = true
group by l.location_name, l.id;

Side notes:

  • properly fitering on the first visit per customer avoids the need for distinct in the count() aggregate function

  • table aliases make the query more concise and easier to understand; I recommend to use them in all queries

  • the filter on customer(ref) is better placed in the where clause than as a conditional count criteria

Upvotes: 1

ScaisEdge
ScaisEdge

Reputation: 133360

Try moving the when condition in where clause

SELECT COUNT( distinct customer.id)  as visit_count
  , locations.name as location 
FROM locations
LEFT JOIN visits ON locations.location_name = visits.location_visit_name
LEFT JOIN customer ON customer.id = visits.customer_id
WHERE locations.active = true
  AND customer.ref IN ('emp', 'oth')
GROUP BY locations.location_name;c

Upvotes: 1

Related Questions