Caerus
Caerus

Reputation: 674

Cross Join in Hive

I'm trying to create a new column time_period while running the query below. If the date difference between a given transaction and the most recent transaction in the reference table is fewer than 7 days, then mark it as a recent transaction, else mark it as an old transaction.

However, the query below is generating an error in the subquery associated with the cross join. The error is "Cannot recognize input near 'select' '(''max'

SELECT
    c.*
    FROM(
        SELECT
         a.acct_nb,
         a.txn_date,
         a.txn_amt,
         (CASE WHEN datediff(b.most_recent_txn,a.txn_date)<7 THEN 'recent' ELSE 'old' END) AS time_period
         FROM db.t1 a
         CROSS JOIN(
              SELECT max(txn_date) AS most_recent_txn --ERROR OCCURS HERE
              FROM db.t1 b)
        )c
        WHERE c.time_period='new';

What could be causing this error?

Upvotes: 1

Views: 6609

Answers (2)

GMB
GMB

Reputation: 222702

You don't need a join for this. You can just use window functions:

SELECT 
    acct_nb,
    txn_date, 
    txn_amt,
    CASE WHEN DATEDIFF(MAX(txn_date) OVER(), txn_date) < 7 
        THEN 'recent' 
        ELSE 'old' 
    END AS time_period
FROM db.t1

If you need to filter only on recent transactions, then you can use a subquery:

SELECT *
FROM (
    SELECT acct_nb, txn_date, txn_amt, MAX(txn_date) OVER() max_txn_date
    FROM db.t1
) t
WHERE DATEDIFF(MAX(txn_date) OVER(), txn_date) < 7 

Upvotes: 2

forpas
forpas

Reputation: 164214

The alias b should be applied to the cross joined subquery and not to the table db.t1 inside the subquery:

SELECT c.*
FROM (
  SELECT a.acct_nb, a.txn_date, a.txn_amt,
         CASE WHEN datediff(b.most_recent_txn, a.txn_date) < 7 THEN 'recent' ELSE 'old' END AS time_period
  FROM db.t1 a
  CROSS JOIN (
    SELECT max(txn_date) AS most_recent_txn
    FROM db.t1 
  ) b
) c
WHERE c.time_period='new';  

Also, there is no branch of your CASE expression returning 'new' so the last WHERE clause will filter out all rows.

Upvotes: 1

Related Questions