Reputation: 1975
Using Sparks 2/java/Cassanda2.2 Trying to run a simple sparks sql query, it errors: Tried as below, + variations like "'LAX'", and '=' instead of '=='.
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`LAX`' given input columns: [transdate, origin]; line 1 pos 42;
'Project ['origin]
+- 'Filter (origin#1 = 'LAX)
+- SubqueryAlias origins
+- LogicalRDD [transdate#0, origin#1]
JavaRDD<TransByDate> originDateRDD = javaFunctions(sc).cassandraTable("trans", "trans_by_date", CassandraJavaUtil.mapRowTo(TransByDate.class)).select(CassandraJavaUtil.column("origin"), CassandraJavaUtil.column("trans_date").as("transdate"));
long cnt1= originDateRDD.count();
System.out.println("sqlLike originDateRDD.count: "+cnt1); --> 406000
Dataset<Row> originDF = sparks.createDataFrame(originDateRDD, TransByDate.class);
originDF.createOrReplaceTempView("origins");
Dataset<Row> originlike = sparks.sql("SELECT origin FROM origins WHERE origin =="+ "LAX");
I have enabled Hive Support (if that helps) Thanks
Upvotes: 2
Views: 10220
Reputation: 15297
Put the column value inside single quote. Your query should look like below.
Dataset<Row> originlike = spark.sql("SELECT origin FROM origins WHERE origin == "+"'LAX'");
You can refer Querying Cassandra data using Spark SQL in Java for more detail.
Like query should be like below.
Dataset<Row> originlike = spark.sql("SELECT origin FROM origins WHERE origin like 'LA%'");
Upvotes: 2
Reputation: 3711
Hive is not the problem, here is the line that is your problem:
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`LAX`' given input columns: [transdate, origin]; line 1 pos 42;
What this is saying is that among the column names, none are named LAX. The scala DSL asks for ===
when matching a value that is a key within a column, perahps something similar would be more ideal, something like a origins.filter($"origin === "LAX")
Upvotes: 0