Reputation: 2181
I know there's a function called expr that turns a Spark SQL expression string into a Spark Column with that expression:
>>> from pyspark.sql import functions as F
>>> F.expr("length(name)")
Column<b'length(name)'>
Is there a function that does the opposite - turning a Column into its Spark SQL string? Something like:
>>> F.inverse_expr(F.length(F.col('name')))
'length(name)'
I found out that Column's __repr__ gives you an idea of what the column expression is (like Column<b'length(name)'>), but it doesn't seem to be usable programmatically without some hacky parsing and string-replacing.
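For reference, the hacky route I'd like to avoid looks roughly like this (just a sketch - it slices the repr string by hand, and the Column<b'...'> format is an implementation detail that differs between Spark versions):
>>> col = F.length(F.col('name'))
>>> text = str(col)  # same text as the __repr__ above
>>> text
"Column<b'length(name)'>"
>>> text[len("Column<b'"):-len("'>")]  # brittle string slicing
'length(name)'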
Upvotes: 1
Views: 551
Reputation: 595
I tried the accepted answer by @Som in Spark 2.4.2 and Spark 3.2.1, and it didn't work in either. The following approach worked for me in PySpark:
import re

import pyspark
from pyspark.sql import Column


def inverse_expr(c: Column) -> str:
    """Convert a column from `Column` type to an equivalent SQL column expression (string)"""
    from packaging import version

    sql_expression = c._jc.expr().sql()
    if version.parse(pyspark.__version__) < version.parse('3.2.0'):
        # prior to Spark 3.2.0, F.col('a.b') would be converted to `a.b` instead of the correct `a`.`b`;
        # this workaround fixes that
        sql_expression = re.sub(
            r'''(`[^"'\s]+\.[^"'\s]+?`)''',
            lambda x: x.group(0).replace('.', '`.`'),
            sql_expression,
            flags=re.MULTILINE
        )
    return sql_expression
>>> from pyspark.sql import functions as F
>>> inverse_expr(F.length(F.col('name')))
'length(`name`)'
>>> inverse_expr(F.length(F.lit('name')))
"length('name')"
>>> inverse_expr(F.length(F.col('table.name')))
'length(`table`.`name`)'
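As a quick sanity check (a minimal sketch - the SparkSession and toy DataFrame are only for illustration), the returned string parses back into a working column via F.expr:
>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.getOrCreate()
>>> df = spark.createDataFrame([('alice',), ('bob',)], ['name'])
>>> df.select(F.expr(inverse_expr(F.length(F.col('name')))).alias('name_len')).show()
+--------+
|name_len|
+--------+
|       5|
|       3|
+--------+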
Upvotes: 0
Reputation: 6323
In Scala, we can use Column#expr to get the SQL-type expression, as below -
length($"entities").expr.toString()
// length('entities)
In PySpark -
from pyspark.sql import functions as F

print(F.length("name")._jc.expr.container)
# length(name)
Upvotes: 2