Raphael Roth

Reputation: 27383

How to filter a typed Spark Dataset using pattern matching

I'm trying to move to the typed Dataset API, but I'm stuck on filtering:

val ds: Dataset[(Int, Int)] = Seq((1,1)).toDS

ds.filter(ij => ij._1 > ij._2) // works, but is not readable
ds.filter { case (i, j) => i < j } // does not compile

Error:(36, 14) missing parameter type for expanded function The argument types of an anonymous function must be fully known. (SLS 8.5) Expected type was: ?

I don't understand why pattern matching does not work with filter, while it works fine with map:

ds.map{case (i,j) => i+j}
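
For what it's worth, naming the function with an explicit type does compile (a minimal sketch, assuming Spark 2.x with spark.implicits._ in scope):

// Give the pattern-matching anonymous function a fully known expected
// type, so filter's (T => Boolean) overload is selected unambiguously.
val lessThan: ((Int, Int)) => Boolean = { case (i, j) => i < j }
ds.filter(lessThan)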

Upvotes: 3

Views: 1358

Answers (3)

Raphael Roth

Reputation: 27383

Apparently it's a bug: https://issues.apache.org/jira/browse/SPARK-19492. Thanks to Bogdan for the information.

Upvotes: 1

Bogdan Vakulenko

Reputation: 3390

This is a little bit more readable:

val ds: Dataset[(Int, Int)] = Seq((1,1)).toDS
ds.filter('_1 > '_2)

Note: you need import spark.implicits._ for the symbol-to-Column ('_1) conversion.
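
A self-contained sketch of this approach (the local SparkSession setup is an assumption for demonstration; any session works):

import org.apache.spark.sql.{Dataset, SparkSession}

val spark = SparkSession.builder()
  .master("local[*]")       // assumption: local run, just for the demo
  .appName("column-filter")
  .getOrCreate()
import spark.implicits._    // provides .toDS and the '_1 symbol-to-Column conversion

val ds: Dataset[(Int, Int)] = Seq((1, 2), (2, 1)).toDS()
ds.filter('_1 > '_2).show() // Column expression; the result is still Dataset[(Int, Int)]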

Upvotes: 0

user8946228

Reputation: 21

Make it explicit:

ds.filter { x => x match { case (i, j) => i < j } }
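
Along the same lines, Function2.tupled avoids the inner match while keeping named parameters (a sketch under the same setup):

// (Int, Int) => Boolean, tupled to ((Int, Int)) => Boolean, so the
// typed filter overload applies with nothing left to infer.
ds.filter(((i: Int, j: Int) => i < j).tupled)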

Upvotes: 2
