Reputation: 329
Hive sort by
and order by
commands are used to fetch data in sorted order.
For Example:
Sort by
hive> SELECT E.EMP_ID FROM Employee E SORT BY E.empid;
Order by
hive> SELECT E.EMP_ID FROM Employee E order BY E.empid;
What is the difference between sort by
and order by
queries in Hive SQL?
Upvotes: 3
Views: 44285
Reputation: 70143
SORT BY
does not exist in typical SQL implementations. Details would presumably be found in Hive documentation, but Gaurang's answer indicates that SORT BY
is per-reducer.
ORDER BY
is in (to my knowledge) all SQL implementations. Wikipedia has an article explaining it.
Upvotes: 2
Reputation: 1958
Hive supports SORT BY which sorts the data per reducer. The difference between "order by" and "sort by" is that the former guarantees total order in the output while the latter only guarantees ordering of the rows within a reducer. If there are more than one reducer, "sort by" may give partially ordered final results.
Note: It may be confusing as to the difference between SORT BY alone of a single column and CLUSTER BY. The difference is that CLUSTER BY partitions by the field and SORT BY if there are multiple reducers partitions randomly in order to distribute data (and load) uniformly across the reducers.
Basically, the data in each reducer will be sorted according to the order that the user specified. The following example shows
SELECT key, value FROM src SORT BY key ASC, value DESC
Upvotes: 15
Reputation: 5578
There is nothing like "sortBy" in SQL
For sorting of data in SQL the ORDER BY
clause is used which allows you to sort the records in your result set.
And it can only be used in SELECT
statements.
e.g.:
SELECT columns FROM tables WHERE predicates ORDER BY column ASC/DESC;
The ORDER BY
clause sorts the result set based on the columns specified.
ASC
indicates ascending order.
DESC
indicates descending order.
The default is ASC
Upvotes: -1