Gokul Anbazhagan
Gokul Anbazhagan

Reputation: 1

Methods to find/check objects representing SQL expressions like functions or complex operations in SQLGLOT OR PYSPARK

I am building a Streamlit app to translate SQL queries into PySpark. The app fails when the SQL query includes functions or complex operations in the JOIN condition. How can I correctly handle these scenarios?

Example:

  1. SELECT tbl1.col1, tbl2.col2
  2. FROM tbl1
  3. JOIN tbl2
  4. ON tbl1.col1 = UPPER(tbl2.col3);

result:
5. tbl1 = spark.table("tbl1")
6. tbl2 = spark.table("tbl2")
7. tbl2_join_df = tbl1.join(tbl2, tbl1.col1 == tbl2.col1, "inner")

which is wrong earlier join operation is structured to check the column name appending with "=".`

Upvotes: 0

Views: 38

Answers (1)

user238607
user238607

Reputation: 2468

You can use Abstract Syntax Tree of the sql expression and then extract whatever expression you are interested in.

https://github.com/tobymao/sqlglot/blob/main/posts/ast_primer.md

https://medium.com/@pabbelt/why-you-should-use-sqlglot-to-manage-your-sql-codebase-82d841c0d450

Upvotes: 0

Related Questions