Sawyer Keels
Sawyer Keels

Reputation: 67

SQL joining on 2 separate date columns

I am having trouble wrapping my head around a problem and the more I think about it the further away I get.

I have two tables that log user visits into two separate applications. For simplicities sake, let's say there are only 2 columns, timestamp and userid (key). Each row counts as a visit, so technically 3 columns since there is one derived for visits.

I am trying to create a table that in each row records the userid, the day (to_date format), total visits to app 1 and total visits to app 2 for that user on that day.

The issue is, when I join the tables together on userid and date, I get missing data. For example, if a user logged into application A on day X, but did not log into application B on day X, then joining on userid and day causes this record to be omitted since the date only exists in Application A's table.

How can I set this up where the final table would have a singular date column, userid, visits to app A and visits to app B, regardless if the user visited both applications on said day?

Hope this made sense, happy to elaborate if needed. Below is sort of what my SQL looks like as of now. Any thoughts appreciated!

with app_a_visits as (
      select to_date(timestamp) as date, userid, count(*) as visits
      from app_a),

app_b_visits as (
  select to_date(timestamp) as date, userid, count(*) as visits
  from app_b)

select a.date, a.userid, a.visits as app_a_visits, b.visits as app_b_visits
from app_a_visits a
full outer join app_b_visits b on a.userid = b.user_id and a.date = b.date;

Upvotes: 0

Views: 797

Answers (1)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 25903

Use FULL OUTER JOIN and NVL/COALESCE

with app_a_visits(date, userid,visits) as (
    select * from values
    ('2022-01-01'::date, 1, 100),
    ('2022-01-03'::date, 1, 100),
    ('2022-01-05'::date, 1, 100)
), app_b_visits(date, userid,visits) as (
    select * from values
    ('2022-01-02'::date, 1, 200),
    ('2022-01-03'::date, 1, 200),
    ('2022-01-04'::date, 1, 200)
)
select 
    NVL(a.date, b.date) as date,
    NVL(a.userid, b.userid) as userid,
    a.visits as app_a_visits, 
    b.visits as app_b_visits
from app_a_visits a
full outer join app_b_visits b 
    on a.userid = b.userid and a.date = b.date
ORDER BY 1,2;
DATE USERID APP_A_VISITS APP_B_VISITS
2022-01-01 1 100 null
2022-01-02 1 null 200
2022-01-03 1 100 200
2022-01-04 1 null 200
2022-01-05 1 100 null

Upvotes: 2

Related Questions