Reputation: 1
I am building a Streamlit app to translate SQL queries into PySpark. The app fails when the SQL query includes functions or complex operations in the JOIN condition. How can I correctly handle these scenarios?
Example:
result:
5. tbl1 = spark.table("tbl1")
6. tbl2 = spark.table("tbl2")
7. tbl2_join_df = tbl1.join(tbl2, tbl1.col1 == tbl2.col1, "inner")
which is wrong earlier join operation is structured to check the column name appending with "=".`
Upvotes: 0
Views: 38
Reputation: 2468
You can use Abstract Syntax Tree of the sql expression and then extract whatever expression you are interested in.
https://github.com/tobymao/sqlglot/blob/main/posts/ast_primer.md
https://medium.com/@pabbelt/why-you-should-use-sqlglot-to-manage-your-sql-codebase-82d841c0d450
Upvotes: 0