Reputation: 694
I am running below query in spark.sql
which has IN
clause with order by
statement
Here is my original data:
Query:
SELECT
*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY category IN ('TV','laptop'), created_at desc) as rn
FROM table2
Result:
Can you help me to interpret this result? I have added 'TV' and 'laptop' in IN
clause then it should be at top row (rn = 1 to 4)
I have observed same behaviour in Snowflake as well. After adding IN ('TV','laptop')
in the query getting this unpredicted result.
Upvotes: 2
Views: 58
Reputation: 694
I have played around this much and then got the logic is when we use IN
clause it is return binary result 0
or 1
When I have added IN ('TV','laptop')
into the query. It will check all the result which has this category and assigned it 1
to those records rest will be 0
. Now based on this binary all the 0
records will come first and the records with 1
due to it is ASC
by default.
After removing 'laptop' from query, all the records with TV
will be ordered at the last due to TV
ones are set to 1
rest will be set to 0
I hope you will get clear understanding from this description.
Upvotes: 2