Reputation: 20222
I am reading a DataFrame from a CSV file like this:
val rawData = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "false")
  .option("inferSchema", "true")
  .load(url)
Then, I am attempting to filter it by the following criterion:
The first element of each row should be a string containing either AAA or BBB.
For this, I have the code:
val filteredData = rawData.filter(me => (me(0).toString.contains("AAA") || me(0).toString.contains("BBB")))
However, I am getting this error:
Error:(104, 41) missing parameter type
val filteredData = rawData.filter(me => (me(0).toString.contains("AAA") || me(0).toString.contains("BBB")))
What am I doing incorrectly?
Upvotes: 1
Views: 238
Reputation: 2804
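In the Spark 1.x DataFrame API that your code appears to use, filter expects a Column expression (or a SQL condition string), not a Scala function over rows, which is why the compiler cannot infer the type of me. Build the condition from column expressions instead. Try something like this: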
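import org.apache.spark.sql.DataFrame

// Sample data standing in for the CSV; tuple columns are named _1, _2 by default.
val dataArray = Array(("AAA", 1), ("ABC", 2), ("ABCBBB", 3))
val rawData: DataFrame = sqlContext.createDataFrame(dataArray)
rawData.filter(rawData("_1").contains("AAA") || rawData("_1").contains("BBB")).show()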
The result will be:
+------+---+
|    _1| _2|
+------+---+
|   AAA|  1|
|ABCBBB|  3|
+------+---+
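Since your CSV is loaded without a header, the first column's name may not be _1. Here is a minimal sketch that looks the column up by position instead, assuming the Spark 1.x SQLContext API used in the question. The names FilterFirstColumn, filterFirstColumn and needles are made up for illustration, and the cast to string stands in for the toString call in your original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object FilterFirstColumn {
  // Hypothetical helper: keep rows whose first column contains any of the
  // given substrings. `needles` must be non-empty, because reduce fails on
  // an empty collection.
  def filterFirstColumn(df: DataFrame, needles: Seq[String]): DataFrame = {
    // Look the column up by position so the code does not depend on its name.
    val first = df(df.columns(0)).cast("string")
    // OR together one `contains` condition per substring.
    val condition = needles.map(n => first.contains(n)).reduce(_ || _)
    df.filter(condition)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("filter-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Sample data standing in for the CSV from the question.
    val rawData = sqlContext.createDataFrame(Seq(("AAA", 1), ("ABC", 2), ("ABCBBB", 3)))
    filterFirstColumn(rawData, Seq("AAA", "BBB")).show()

    sc.stop()
  }
}
```

With your own data you would pass the DataFrame returned by load(url) instead of the sample one.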
Upvotes: 1