Shamshad Alam

Reputation: 1874

Unable to parse back SQL expression string generated by spark itself

I have a scenario where I need to convert a Spark expression to a SQL expression string, and later parse that SQL string back into a Spark expression. In most cases this works fine, but in some cases it throws an error.

For example, the following works fine in Spark:

val sqlContext = spark.sqlContext
import sqlContext.implicits._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

val df = Seq("Britain", "Germany", "USA", "Russia", "Japan").toDF("Country")

val japan = 'Country === "Japan"
df.filter(japan).show 
val sqlExp = japan.expr.sql
println(sqlExp) // output: (`Country` = 'Japan')
df.filter(expr(sqlExp)).show

But when I try the same with the following expression, it fails:

val expression = 'Country.contains("S")
println(expression.expr.sql) // output: contains(`Country`, 'S')
df.filter(expression).show
val parsedExpression = expr(expression.expr.sql)
df.filter(parsedExpression).show

It seems to work only with standard SQL syntax. When I use expr("country LIKE '%S%'"), it parses without problems.

Is there a way to parse such a SQL expression (one generated by Spark itself) back into a Spark expression?

Upvotes: 0

Views: 149

Answers (1)

user10821012

Reputation: 26

The Expression.sql method:

  • Is not part of the officially public API (as the developers have stated many times, code in o.a.s.sql.catalyst should be considered "weakly" private).
  • Is not explicitly intended to generate a valid SQL string, and can even return an arbitrary string.

    In fact, contains(Country, 'S') is valid neither in sql (or spark-sql) nor in expr.
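One possible way to sidestep the round trip, offered as a sketch under assumptions rather than as something the Spark API guarantees: write the SQL text by hand using constructs the parser is known to accept, such as LIKE (which the question reports as working) or the built-in instr function, instead of relying on the output of Expression.sql. The object name ContainsRoundTrip and the local SparkSession setup below are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Hypothetical standalone example; the object name and local master are assumptions.
object ContainsRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("contains-round-trip")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("Britain", "Germany", "USA", "Russia", "Japan").toDF("Country")

    // Hand-written SQL strings instead of expression.expr.sql.
    // Both express "Country contains an upper-case S" in parseable SQL.
    val likeFilter  = expr("Country LIKE '%S%'")
    val instrFilter = expr("instr(Country, 'S') > 0")

    df.filter(likeFilter).show()
    df.filter(instrFilter).show()

    spark.stop()
  }
}
```

Both matches are case-sensitive, so with the sample data each filter should keep only the USA row. The trade-off is that the SQL string has to be maintained by hand for every expression that Expression.sql does not render in parseable form.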

Upvotes: 1
