bsky

Reputation: 20222

Error when filtering DataFrame

I am reading a DataFrame from a CSV file like this:

val rawData = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "false")
  .option("inferSchema", "true")
  .load(url)

Then, I am attempting to filter it by the following criterion: The first element of each row should be a string containing either AAA or BBB. For this, I have the code:

val filteredData = rawData.filter(me => (me(0).toString.contains("AAA") || me(0).toString.contains("BBB")))

However, I am getting this error:

Error:(104, 41) missing parameter type
    val filteredData = rawData.filter(me => (me(0).toString.contains("AAA") || me(0).toString.contains("BBB")))

What am I doing incorrectly?

Upvotes: 1

Views: 238

Answers (1)

Carlos Vilchez

Reputation: 2804

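Your use of sqlContext and spark-csv suggests you are on the Spark 1.x API. There, DataFrame.filter takes a Column expression (or a SQL expression string), not a Scala function over rows. With no function type expected, the compiler cannot infer the type of me, hence "missing parameter type". Build the condition from columns instead. Try something like this: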

import org.apache.spark.sql.DataFrame

// Sample data; the tuple fields become columns _1 and _2
val dataArray = Array(("AAA", 1), ("ABC", 2), ("ABCBBB", 3))
val rawData: DataFrame = sqlContext.createDataFrame(dataArray)

rawData.filter(rawData("_1").contains("AAA") || rawData("_1").contains("BBB")).show()

The result will be:

+------+---+
|    _1| _2|
+------+---+
|   AAA|  1|
|ABCBBB|  3|
+------+---+
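
For the DataFrame loaded from CSV in the question, the same idea might look roughly like the sketch below. It assumes a Spark 1.x-style SQLContext. The names FilterFirstColumn and keepAaaOrBbb are made up for illustration. The first column's name is looked up through df.columns rather than hard-coded, since a headerless CSV gets generated column names, and the column is cast to string because inferSchema may pick another type.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper object, not part of Spark or spark-csv.
object FilterFirstColumn {

  // Keeps rows whose first column, rendered as a string, contains AAA or BBB.
  def keepAaaOrBbb(df: DataFrame): DataFrame = {
    // Resolve the first column by position; cast so contains() works on non-string types.
    val first = df(df.columns(0)).cast("string")
    df.filter(first.contains("AAA") || first.contains("BBB"))
  }

  def main(args: Array[String]): Unit = {
    // Local context for trying the sketch out (assumption: Spark is on the classpath).
    val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("filter-sketch"))
    val sqlContext = new SQLContext(sc)

    // Stand-in for the DataFrame loaded from CSV in the question.
    val sample = sqlContext.createDataFrame(Seq(("AAA", 1), ("ABC", 2), ("ABCBBB", 3)))
    keepAaaOrBbb(sample).show()

    sc.stop()
  }
}
```

With the question's data this would be called as FilterFirstColumn.keepAaaOrBbb(rawData). If you prefer to keep the lambda, rawData.rdd.filter(...) accepts a function over Row, but it returns an RDD rather than a DataFrame.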

Upvotes: 1
