bolvo
bolvo

Reputation: 359

Selecting from 2 tables with possibly corresponding dates

I am looking for the correct query for my mysql db that has 2 seperate tables for lengths and weights. I want to have the result returned as 1 query with 3 columns: datetime, length and weight. The query should also allow to specify the user. Eg.:

Table heights:

id  user_id  created_on           height 
 1  2        2019-01-01 00:00:01  180  
 2  2        2019-01-02 00:00:01  181  
 3  3        2019-01-03 00:00:01  182  
 4  3        2019-01-04 00:00:01  183  
 5  2        2019-01-07 00:00:01  184  

Table weights:

 id  user_id  created_on           weight
 1   2        2019-01-01 00:00:01  80   
 2   2        2019-01-04 00:00:01  81   
 3   3        2019-01-05 00:00:01  82   
 4   3        2019-01-06 00:00:01  83   
 5   2        2019-01-07 00:00:01  84  

I am looking to get the following result with a single query:

user_id  created_on             weight  height 
2        2019-01-01 00:00:01    80      180  
2        2019-01-02 00:00:01    null    181  
2        2019-01-04 00:00:01    81      null  
2        2019-01-07 00:00:01    84      184  

I have tried working with JOIN statements but fail to get the required result. This join statement

SELECT w.* , h.* FROM weight w 
  JOIN height h 
    ON w.created_on=h.created_on
    AND w.user_id=h.user_id AND user_id=2

will return only those results that have both a height and weight item for user_id and created_on A full outer join would do the trick, however this is not supported by mysql. The following query seems to be returning the required result, however it is very slow:

SELECT r.* FROM 
    (SELECT w.user_id as w_user, w.created_on as weightdate, w.value as weight, h.created_on as heightdate ,h.user_id as h_user, h.value as height FROM weight w
    LEFT JOIN height h ON w.user_id = h.user_id 
    AND w.created_on=h.created_on 
    UNION 
    SELECT w.user_id as w_user, w.created_on as weightdate, w.value as weight, h.created_on as heightdate ,h.user_id as h_user, h.value as height FROM weight w
    RIGHT JOIN height h ON w.user_id = h.user_id 
    AND w.created_on=h.created_on ) r 
WHERE h_user=2 OR w_user =2

The query takes more than 3 seconds if the 2 tables have around 3000 entries. Is there a way to speed this up, possibly using a different approach?

For extra bonus points: is it possible to allow for a small time discrepancy between both created_on datetimes? (eg. 10 minutes or within the same hour). Eg. if column weight has an entry for 2019-01-01 00:00:00 and table height has an entry for height at 2019-01-01 00:04:00 they appear in the same row.

Upvotes: 1

Views: 42

Answers (2)

Nick
Nick

Reputation: 147146

Instead of using a calendar table to select dates of interest, you can use a UNION to select all the distinct dates from the heights and weights tables. To deal with matching times within an hour of each other, you can compare the times using TIMESTAMPDIFF and truncate the created_on time to the hour. Since this might create duplicate entries, we add the DISTINCT qualifier to the query:

SELECT DISTINCT COALESCE(h.user_id, w.user_id) AS user_id,
       DATE_FORMAT(COALESCE(h.created_on, w.created_on), '%y-%m-%d %H:00:00') AS created_on,
       w.weight,
       h.height
FROM (SELECT created_on FROM heights
      UNION
      SELECT created_on FROM weights) d
LEFT JOIN heights h ON ABS(TIMESTAMPDIFF(HOUR, h.created_on, d.created_on)) = 0 AND h.user_id = 2
LEFT JOIN weights w ON ABS(TIMESTAMPDIFF(HOUR, w.created_on, d.created_on)) = 0 AND w.user_id = 2
WHERE h.user_id IS NOT NULL OR w.user_id IS NOT NULL
ORDER BY created_on

Output (from my demo, where I've modified your times to allow for matching within the hour):

user_id     created_on          weight  height
2           19-01-01 01:00:00   80      180
2           19-01-02 00:00:00           181
2           19-01-04 04:00:00   81  
2           19-01-07 06:00:00   84      184

Demo on dbfiddle

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520968

This is probably best handled using a calendar table, containing all dates of interest for the query. We can start the query with the calendar table, then left join to the heights and weights tables:

SELECT
    COALESCE(h.user_id, w.user_id) AS user_id,
    d.dt AS created_on,
    w.weight,
    h.height
FROM
(
    SELECT '2019-01-01 00:00:01' AS dt UNION ALL
    SELECT '2019-01-02 00:00:01' UNION ALL
    SELECT '2019-01-03 00:00:01' UNION ALL
    SELECT '2019-01-04 00:00:01' UNION ALL
    SELECT '2019-01-05 00:00:01' UNION ALL
    SELECT '2019-01-06 00:00:01' UNION ALL
    SELECT '2019-01-07 00:00:01'
) d
LEFT JOIN heights h
    ON d.dt = h.created_on AND h.user_id = 2
LEFT JOIN weights w
    ON d.dt = w.created_on AND w.user_id = 2
WHERE
    h.user_id IS NOT NULL OR w.user_id IS NOT NULL
ORDER BY
    d.dt;

enter image description here

Demo

Upvotes: 0

Related Questions