Reputation: 1491
I have a SQL query like the one below:
SELECT LIMIT,
COL1,
COL2,
COL3
FROM
(SELECT ROW_NUMBER () OVER (ORDER BY COL5 DESC) AS LIMIT,
FROM_UNIXTIME(COL_DATETIME,'dd-MM-yyyy HH24:mi:ss') COL1,
CASE WHEN COL6 IN ('A', 'B') THEN A_NUMBER ELSE B_NUMBER END AS COL2,
COL3
FROM DBNAME.TABLENAME
WHERE COL7 LIKE ('123456%')
AND COL_DATETIME BETWEEN 20150201000000 AND 20150202235959) X
I can execute it successfully from Hive, but I want to execute it from Spark. I have created a HiveContext like below:
scala> val sqlHContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlHContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@71138de5
Then I tried to execute the same SQL query through it:
sqlHContext.sql("SELECT LIMIT, COL1, COL2, COL3 FROM (SELECT ROW_NUMBER () OVER (ORDER BY COL5 DESC) AS LIMIT, FROM_UNIXTIME(COL_DATETIME,'dd-MM-yyyy HH24:mi:ss') COL1, CASE WHEN COL6 IN ('A', 'B') THEN A_NUMBER ELSE B_NUMBER END AS COL2, COL3 FROM DBNAME.TABLENAME WHERE COL7 LIKE ('123456%') AND COL_DATETIME BETWEEN 20150201000000 AND 20150202235959) X").collect().foreach(println)
But I get the following error:
org.apache.spark.sql.AnalysisException:
Unsupported language features in query:
scala.NotImplementedError: No parse rules for ASTNode type: 882, text: TOK_WINDOWSPEC :
TOK_WINDOWSPEC 1, 90,98, 339
TOK_PARTITIONINGSPEC 1, 91,97, 339
TOK_ORDERBY 1, 91,97, 339
TOK_TABSORTCOLNAMEDESC 1, 95,97, 339
TOK_TABLE_OR_COL 1, 95,95, 339
CALL_DATETIME 1, 95,95, 339
" +
org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1261)
It looks like analytic (window) functions are not supported. I am using Spark 1.3.0, Hive 1.1.0, and Hadoop 2.7.0.
Is there any other way this can be achieved from Spark?
Upvotes: 1
Views: 1117
Reputation: 810
Window functions are supported as of Spark 1.4.0. There are still some limitations; e.g. ROWS BETWEEN is not yet supported. For an example, have a look at this blog post on Spark window functions.
Upvotes: 1