pcv

Reputation: 2181

Is there an inverse function for pyspark's expr?

I know there's a function called expr that turns a Spark SQL string into a Column with that expression:

>>> from pyspark.sql import functions as F 
>>> F.expr("length(name)")
Column<b'length(name)'>

Is there a function that does the opposite, turning a Column into its Spark SQL string? Something like:

>>> F.inverse_expr(F.length(F.col('name')))
'length(name)'

I found out that Column's __repr__ gives you an idea of what the column expression is (like Column<b'length(name)'>), but it doesn't seem to be usable programmatically without some hacky parsing and string-replacing.

Upvotes: 1

Views: 551

Answers (2)

Zohar Meir

Reputation: 595

I tried the accepted answer by @Som in Spark 2.4.2 and Spark 3.2.1 and it didn't work.
The following approach worked for me in pyspark:

import re

import pyspark
from pyspark.sql import Column

def inverse_expr(c: Column) -> str:
    """Convert a column from `Column` type to an equivalent SQL column expression (string)"""
    from packaging import version
    sql_expression = c._jc.expr().sql()
    if version.parse(pyspark.__version__) < version.parse('3.2.0'):
        # prior to Spark 3.2.0 f.col('a.b') would be converted to `a.b` instead of the correct `a`.`b`
        # this workaround is used to fix this issue
        sql_expression = re.sub(
            r'''(`[^"'\s]+\.[^"'\s]+?`)''',
            lambda x: x.group(0).replace('.', '`.`'),
            sql_expression,
            flags=re.MULTILINE
        )
    return sql_expression

>>> from pyspark.sql import functions as F
>>> inverse_expr(F.length(F.col('name')))
'length(`name`)'
>>> inverse_expr(F.length(F.lit('name')))
"length('name')"
>>> inverse_expr(F.length(F.col('table.name')))
'length(`table`.`name`)'

Upvotes: 0

Som

Reputation: 6323

In Scala, we can use Column#expr to get the SQL-style expression, as below:

length($"entities").expr.toString()
// length('entities)

In PySpark:

print(F.length("name")._jc.expr.container)
# length(name)

Upvotes: 2
