user2751653
user2751653

Reputation: 57

SQL: Querying several tables, only most recent result from one, and all results from another

I have a number of tables and I am trying to pull out data from many of them in one query. I've worked out how to successfully use LEFT JOIN to pull in related data from other tables, with a common ID.

With this query I'm trying to do get data for each contact in our ASSOCIATES_BACKGROUND_DATA table.

Firstly, I am trying to find their most recent meeting in the ASSOCIATES_1_TO_1S.DATE_OF_MEETING (this is be stored as a timestamp, so looking for the largest timestamp only) - I am not getting the correct result, am I using the MAX function correctly?

Secondly, I want to get all records in the ASSOCIATES_ACTION_STEPS for that ASSOCIATE_ID. The right join doesn't seem to be pulling this in either?

Can anyone help? Feel like I'm so close to getting the result for this, but this is really bugging me!

SELECT *, 
        ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
            ASSOCIATES_BACKGROUND_DATA.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS, 
        ASSOCIATES_BACKGROUND_DATA.LAST_NAME as ABG_LAST_NAME, 
        USERS.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS, 
        USERS.LAST_NAME as USERS_LAST_NAME, 
        MAX(ASSOCIATES_1_TO_1S.DATE_OF_MEETING)
    FROM ASSOCIATES_BACKGROUND_DATA
 LEFT JOIN ASSOCIATES_1_TO_1S
        ON ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID = ASSOCIATES_1_TO_1S.ASSOCIATE_ID
 LEFT JOIN LIST_OF_UNIVERSITIES
        ON ASSOCIATES_BACKGROUND_DATA.UNIVERSITY = LIST_OF_UNIVERSITIES.ID
 LEFT JOIN USERS
        ON ASSOCIATES_BACKGROUND_DATA.PROGRAMME_COORDINATOR = USERS.ID
 RIGHT JOIN ASSOCIATES_ACTION_STEPS
        ON ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID = ASSOCIATES_ACTION_STEPS.ASSOCIATE_ID
  GROUP BY ASSOCIATES_1_TO_1S.ASSOCIATE_ID
  ORDER BY `ABG_NAME_KNOWN_AS` ASC

Many thanks in advance for any help or pointers that you can give...

Upvotes: 2

Views: 85

Answers (1)

Sebas
Sebas

Reputation: 21542

I checked your query. The first step for me was to reformat it to make it more readable (very important if you spend time on a query, you don't need any kind of overhead).

After that, I reviewed the logic and came to 2 conclusions:

  • you don't need a right join
  • your grouping algorithm was not quite correct.

Please have a look to this query and tell me if there's something missing:

SELECT 
    abg.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
    abg.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS, 
    abg.LAST_NAME as ABG_LAST_NAME, 
    u.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS, 
    u.LAST_NAME as USERS_LAST_NAME, 
    t.lastmeeting
FROM ASSOCIATES_BACKGROUND_DATA abg 
        LEFT JOIN (
            SELECT 
                abg.ASSOCIATE_ID as id,
                MAX(a1s.DATE_OF_MEETING) AS lastmeeting
            FROM ASSOCIATES_BACKGROUND_DATA abg
                LEFT JOIN ASSOCIATES_1_TO_1S a1s ON abg.ASSOCIATE_ID = a1s.ASSOCIATE_ID
            GROUP BY abg.ASSOCIATE_ID
            ) t ON t.id = abg.ASSOCIATE_ID
        LEFT JOIN ASSOCIATES_ACTION_STEPS aas ON abg.ASSOCIATE_ID = aas.ASSOCIATE_ID
        LEFT JOIN USERS u ON abg.PROGRAMME_COORDINATOR = u.ID
ORDER BY abg.NAME_KNOWN_AS

I try to not use the permissiveness of mysql in regards to its group by function (mysql allows you to get away with a group by instruction not including all the selected fields), and stick to standards. But then it is necessary to make a subquery (see the t alias) precalculating the MAX(DATE) for each associate, to later join it back with the main recordset.

Cheers.

PS: you may add again the join with the universities, I skipped it to isolate the problem only.

PPS: to reflect your comment, I built this query:

SELECT 
    abg.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
    abg.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS, 
    abg.LAST_NAME as ABG_LAST_NAME, 
    u.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS, 
    u.LAST_NAME as USERS_LAST_NAME, 
    t.lastmeeting,
    t2.av AS averagetimedue
FROM ASSOCIATES_BACKGROUND_DATA abg 
        LEFT JOIN (
            SELECT 
                abg.ASSOCIATE_ID as id,
                MAX(a1s.DATE_OF_MEETING) AS lastmeeting
            FROM ASSOCIATES_BACKGROUND_DATA abg
                LEFT JOIN ASSOCIATES_1_TO_1S a1s ON abg.ASSOCIATE_ID = a1s.ASSOCIATE_ID
            GROUP BY abg.ASSOCIATE_ID
            ) t ON t.id = abg.ASSOCIATE_ID
        LEFT JOIN (
            SELECT
                (AVG(DATE_DUE)+AVG(DATE_COMPLETED))/2 AS av,
                ASSOCIATE_ID
            FROM ASSOCIATES_ACTION_STEPS
            GROUP BY ASSOCIATE_ID
            ) t2 ON abg.ASSOCIATE_ID = t2.ASSOCIATE_ID
        LEFT JOIN ASSOCIATES_ACTION_STEPS aas ON abg.ASSOCIATE_ID = aas.ASSOCIATE_ID
        LEFT JOIN USERS u ON abg.PROGRAMME_COORDINATOR = u.ID
ORDER BY abg.NAME_KNOWN_AS

I'm not sure of your definition of average time between DATE_DUE and DATE_COMPLETED though.

Upvotes: 3

Related Questions