Whitewater
Whitewater

Reputation: 317

Adding elements from a list to spark.sql() statement

My list:

pylist = ['A', 'B', 'C', 'D']

My spark.sql() line:

query = spark.sql(
    """
    SELECT col1, col2, col3
    FROM database.table
    WHERE col3 IN ('A', 'B', 'C', 'D')
    """
)

I want to replace the list of elements in the spark.sql() statement with the Python list so that the last line in the SQL is:

...
AND col3 IN pylist

I am aware of {} and str.format but I don't understand if that's the correct option and how that works.

Upvotes: 4

Views: 3497

Answers (4)

Vinodh
Vinodh

Reputation: 1

If it's one element in the list use a conditional statement to fix the DELETE query's WHERE clause:

if len(filesToDelete) > 1:
    s = """
    DELETE from table_name WHERE s3_file_name in {}
    """.format(tuple(filesToDelete))
    print(s)
elif len(filesToDelete) == 1:
    s = """ DELETE from table_name WHERE s3_file_name = '{}'
    """.format(filesToDelete[0])
    print(s)
else:
    print("Log No Files to Delete")

Upvotes: 0

Whitewater
Whitewater

Reputation: 317

I think the solution is .format(tuple(pylist)):

pylist = ['A', 'B', 'C', 'D']

s = """
    SELECT col1, col2, col3
    FROM database.table
    WHERE col3 IN {}
    """.format(tuple(pylist))

query = spark.sql(s)

Upvotes: 4

Climbs_lika_Spyder
Climbs_lika_Spyder

Reputation: 6754

You can also use an f-string with {tuple(pylist)} as shown:

The statement now is:

pylist = ['A', 'B', 'C', 'D']

s = f"""
    SELECT col1, col2, col3
    FROM database.table
    WHERE col3 IN {tuple(pylist)}
    """

query = spark.sql(s)

Upvotes: 0

omuthu
omuthu

Reputation: 6333

Convert to tuple to get "()" instead of "[]" while converting to string

sql_str="SELECT col1, col2, col3 FROM database.table WHERE col3 IN " + str(tuple(pylist))

query = spark.sql(sql_str)

Upvotes: 0

Related Questions