Raydk
Raydk

Reputation: 599

SQL Server : duplicates joining 4 tables

I have four tables

I want output like this

CustomerName   EventName    CheckinDateTime
------------   ----------   ---------------
Peter          Christmas    2012-12-25 00:27:48.350
Peter          Valentines   2013-02-14 01:19:36.113
Peter          Spring       2013-05-20 02:13:53.710

The problem is that I get this result instead

CustomerName   EventName    CheckinDateTime
------------   ----------   ---------------
Peter          Christmas    2012-12-25 00:27:48.350
Peter          Christmas    2013-02-14 01:19:36.113
Peter          Christmas    2013-05-20 02:13:53.710
Peter          Valentines   2012-12-25 00:27:48.350
Peter          Valentines   2013-02-14 01:19:36.113
Peter          Valentines   2013-05-20 02:13:53.710
Peter          Spring       2012-12-25 00:27:48.350
Peter          Spring       2013-02-14 01:19:36.113
Peter          Spring       2013-05-20 02:13:53.710

Each valid record is returned three times

This is the script I use to get the result above

SELECT DISTINCT 
    Customers.Firstname, Events.EventName, CustomerCheckins.CheckinDateTime
FROM         
    CustomerCheckins 
INNER JOIN
    Customers_Events ON CustomerCheckins.CustomerID = Customers_Events.CustomerID 
LEFT OUTER JOIN
    Customers ON Customers_Events.CustomerID = Customers.CustomerID 
RIGHT OUTER JOIN
    Events ON Customers_Events.EventID = Events.EventID
WHERE     
    (Customers_Events.CustomerID = 1887)

I would really appreciate some help to solve this problem. I feel I have tried every combination in the script.

Upvotes: 0

Views: 320

Answers (2)

podiluska
podiluska

Reputation: 51514

You should use grouping instead of distinct

SELECT Customers.Firstname, Events.EventName, MIN(CustomerCheckins.CheckinDateTime)
FROM         
    CustomerCheckins 
        INNER JOIN Customers_Events ON CustomerCheckins.CustomerID = Customers_Events.CustomerID 
        INNER JOIN Customers ON Customers_Events.CustomerID = Customers.CustomerID 
        INNER JOIN Events ON Customers_Events.EventID = Events.EventID
WHERE Customers_Events.CustomerID = 1887
GROUP BY Customers.Firstname, Events.EventName

Upvotes: 2

Klas Lindbäck
Klas Lindbäck

Reputation: 33283

The problem is with your database design.

Your data model doesn't allow you to link a checkin with a specific event.

If your Event table contains startdate and enddate the problem is solvable; then you can add the time constraints in your join.

Edit:

Luckily, there was a startdate and endate in the event table, so the following SQL should work:

          SELECT Customers.Firstname, Events.EventName, MAX(CustomerCheckins.CheckinDateTime)
            FROM CustomerCheckins 
      INNER JOIN Customers_Events 
              ON CustomerCheckins.CustomerID = Customers_Events.CustomerID 
 LEFT OUTER JOIN Customers 
              ON Customers_Events.CustomerID = Customers.CustomerID 
RIGHT OUTER JOIN Events 
              ON Customers_Events.EventID = Events.EventID
             AND events.StartDateTime <= CustomerCheckins.CheckinDateTime
             AND Events.EndDateTime > CustomerCheckins.CheckinDateTime
           WHERE (Customers_Events.CustomerID = 1887)
        GROUP BY Customers.Firstname, Events.EventName

Note that you need the group by in case the user checked in more then one time during an event.

The choice of MIN or MAX for checkindate depends on whether you want to display the first or the last checkin during the event.

Side note:

I'm not very familiar with the SQL Server query optimizer, but it may be difficult for it to create a good execution plan for this query. Make sure you test it after loading the tables with plenty of data if you plan to use it in a live system.

Upvotes: 1

Related Questions