Reputation: 11
Hi, I am new to Spark and I am trying to run a SQL-style query using Spark:
df.filter("gene" === "abcd" && "biomarkerName".contains("72fqss") && "tagType" === "pname").select("biomarkerId").distinct().show()
It throws the error value === is not a member of String.
I tried using == in place of ===, and then it throws this error instead:
error: overloaded method value filter with alternatives:
(func: org.apache.spark.api.java.function.FilterFunction[org.apache.spark.sql.Row])org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(func: org.apache.spark.sql.Row => Boolean)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(conditionExpr: String)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(condition: org.apache.spark.sql.Column)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
cannot be applied to (Boolean)
I also tried val spark: SparkSession = ... import spark.implicits._
but I am not sure what the ... stands for, and when I import spark.implicits._ nothing seems to happen. How should I approach this problem?
Upvotes: 0
Views: 829
Reputation: 814
You are getting this error because you are not telling Spark that the left-hand side of each condition is a column; a bare "gene" is just a Scala String, which has no === method.
You could do something like this without implicits (the col function comes from org.apache.spark.sql.functions):
import org.apache.spark.sql.functions.col
df.filter(col("gene") === "abcd" && col("biomarkerName").contains("72fqss") && col("tagType") === "pname").select("biomarkerId").distinct().show()
Or with implicits (the $"..." string interpolator that turns a column name into a Column):
import spark.implicits._
df.filter($"gene" === "abcd" && $"biomarkerName".contains("72fqss") && $"tagType" === "pname").select("biomarkerId").distinct().show()
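For completeness, the (conditionExpr: String) overload listed in your error message gives a third option: write the whole condition as a single SQL expression string, so no Column objects or implicits are needed. This is only a sketch assuming the same df and column names as above:

```scala
// Pass the condition as one SQL expression string; Spark parses it itself.
// LIKE '%72fqss%' plays the role of .contains("72fqss") above.
df.filter("gene = 'abcd' AND biomarkerName LIKE '%72fqss%' AND tagType = 'pname'")
  .select("biomarkerId")
  .distinct()
  .show()
```

Which form to use is mostly a matter of taste; the Column-based forms catch mistakes like a misspelled operator at compile time, while the string form is checked only when the query is analyzed.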
Upvotes: 1