Reputation: 367
Sorry if this is a duplicate, but the solutions pointed out elsewhere aren't working for me; most likely I am missing something basic. I have a DataFrame like below:
inputDF: org.apache.spark.sql.DataFrame = [ts: string, id: string ... 20 more fields]
I am trying to filter some of the rows of interest here based on a field called "state" (of type String) by doing this (in Scala):
inputDF.filter(inputDF("state") == "BALANCED").show()
However this gives me an error:
<console>:143: error: overloaded method value filter with alternatives:
(func: org.apache.spark.api.java.function.FilterFunction[org.apache.spark.sql.Row])org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(func: org.apache.spark.sql.Row => Boolean)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(conditionExpr: String)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(condition: org.apache.spark.sql.Column)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
cannot be applied to (Boolean)
inputDF.filter(inputDF("state") == "BALANCED").show()
Can someone please point out what is incorrect here? I followed a few examples, including the one at https://rklicksolutions.wordpress.com/2016/03/03/tutorial-spark-1-6-sql-and-dataframe-operations/, but can't figure out what's wrong.
Upvotes: 1
Views: 3453
Reputation: 367
Looks like I need to use === instead of ==. In Scala, == is plain object equality: it compares the Column returned by inputDF("state") with the String "BALANCED" and yields a Boolean, and none of filter's overloads accept a Boolean (hence the "cannot be applied to (Boolean)" error). Spark's Column class defines === to build a Column equality expression instead, which filter does accept. So
inputDF.filter(inputDF("state") === "BALANCED").show()
is doing what I want.
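For reference, here is a minimal, self-contained sketch showing a few equivalent ways to write this filter. The sample data and the SparkSession setup are made up for illustration; only the "state" column and the "BALANCED" value come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical local session just to make the example runnable.
val spark = SparkSession.builder().master("local[*]").appName("filter-demo").getOrCreate()
import spark.implicits._

// Toy stand-in for the real 22-column DataFrame from the question.
val inputDF = Seq(
  ("2021-01-01T00:00:00", "a", "BALANCED"),
  ("2021-01-01T00:01:00", "b", "DEGRADED")
).toDF("ts", "id", "state")

// All four are equivalent ways to express the same predicate:
inputDF.filter(inputDF("state") === "BALANCED").show() // Column === value
inputDF.filter(col("state") === "BALANCED").show()     // functions.col
inputDF.filter($"state" === "BALANCED").show()         // $-string interpolator
inputDF.filter("state = 'BALANCED'").show()            // SQL-expression string overload
```

The last form works because filter also has a (conditionExpr: String) overload (visible in the error message above), which parses the string as a SQL predicate.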
Upvotes: 2