Reputation: 617
I have a simple use case: I need to use a wildcard character as a literal value in a LIKE condition. I am trying to filter records from a string column that contain _A_. It is a simple LIKE use case, but since the _ in _A_ is a wildcard, a plain LIKE returns wrong results. In SQL we can use ESCAPE to achieve this. How can I achieve it in Spark?
I have not tried regular expressions; I wanted to know if there is a simpler workaround.
I am using Spark 1.5 with Scala.
Thanks in advance!
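For illustration, a naive filter like the one below (a minimal sketch; the DataFrame df and the column name str are hypothetical) also matches unwanted rows such as xAy, because each _ in the pattern matches any single character:
import org.apache.spark.sql.functions.col
// Plain LIKE: each _ is a single-character wildcard, so this also
// matches values like "xAy" or "1A2", not only the literal _A_
df.filter(col("str").like("%_A_%")).show()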
Upvotes: 0
Views: 2830
Reputation: 31490
You can use the contains, like, or rlike functions for this case; with like, use \\ to escape the _:
import org.apache.spark.sql.functions.col
import spark.implicits._

val df = Seq("apo_A_", "asda", "aAc").toDF("str")

//using like (\\ escapes the _ wildcard so it matches a literal underscore)
df.filter(col("str").like("%\\_A\\_%")).show()
//using rlike (regex; _ is not a regex metacharacter, so no escaping needed)
df.filter(col("str").rlike(".*_A_.*")).show()
//using contains (plain substring match, no escaping needed)
df.filter(col("str").contains("_A_")).show()
//+------+
//| str|
//+------+
//|apo_A_|
//+------+
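As a side note, if the substring you search for may itself contain regex metacharacters, Pattern.quote can build the rlike pattern for you instead of hand-escaping (a small sketch reusing the df above):
import java.util.regex.Pattern
// Pattern.quote wraps the needle in \Q...\E so every character,
// including _, is matched literally; rlike matches anywhere in the string
df.filter(col("str").rlike(Pattern.quote("_A_"))).show()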
Upvotes: 1
Reputation: 4045
If you can use the Spark DataFrame API, the code would be as simple as:
import org.apache.spark.sql.SparkSession

object EscapeChar {
  def main(args: Array[String]): Unit = {
    // local SparkSession so the example is self-contained
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val df = List("_A_", "A").toDF()
    df.printSchema()
    // contains does a literal substring match, so _ needs no escaping
    df.filter($"value".contains("_A_")).show()
  }
}
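For reference, given the two rows above, the filter keeps only the row containing the literal _A_; the output of show() should look roughly like:
//+-----+
//|value|
//+-----+
//|  _A_|
//+-----+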
Upvotes: 1