Daniel Vera
Daniel Vera

Reputation: 97

Unsupported aggregate subquery with non-equality correlated predicates

I'm new to Impala - We are getting following error message when we join two tables as below. Impala dont support >= on correlated subquer - any suggestion how we can do it

CREATE TABLE database.test_dve (
  campo1 INT,
  campo2 TIMESTAMP,
  campo3 INT
)
STORED AS PARQUET;

insert into database.test_dve values(1,'2019-09-30',1);
insert into database.test_dve values(2,'2019-09-30',1);
insert into database.test_dve values(3,'2019-09-30',1);

CREATE TABLE database.test_dve2 (
  campo1 INT,
  campo2 TIMESTAMP
)
STORED AS PARQUET;

insert into database.test_dve2 values(1,'2019-10-30');

select *
from database.test_dve a
WHERE 0 >= (SELECT count(*) FROM database.test_dve2 b
WHERE a.campo1 = b.campo1
AND CAST(from_timestamp(a.campo2, 'yyyyMM') as INT) >= CAST(from_timestamp(b.campo2, 'yyyyMM') as INT)
);

Error Message

[HY000] [Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : AnalysisException: Unsupported aggregate subquery with non-equality correlated predicates: CAST(from_timestamp(a.campo2, 'yyyyMM') AS INT) >= CAST(from_timestamp(b.campo2, 'yyyyMM') AS INT)

Thanks

Upvotes: 1

Views: 722

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

Try replacing this with not exists:

select *
from database.test_dve a
WHERE NOT EXISTS (SELECT 1
                  FROM database.test_dve2 b
                  WHERE a.campo1 = b.campo1 AND
                        CAST(from_timestamp(a.campo2, 'yyyyMM') as INT) >= CAST(from_timestamp(b.campo2, 'yyyyMM') as INT)
                 );

Upvotes: 0

Related Questions