Padfoot123

Reputation: 1117

Passing variables to hive query in pyspark sql

I am trying to execute a query against a Hive table using Spark SQL.

The following works fine:

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").enableHiveSupport().appName("test").getOrCreate()
df = spark.sql("select * from table_name where date='2021-05-16' and name='xxxx'")

But I want to pass date and name as variables rather than hardcoding them in the SQL.

Is there a way to pass date=current_date instead of hardcoding the value?

I am trying to pass the current date, built with time.strftime, as date. The name has to come from another variable, name='xxxx'.

Upvotes: 0

Views: 682

Answers (1)

Junhua.xie

Reputation: 174

Do you want to pass the variables in from outside the .py file?

If so, you can try this:

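import sys

# The first command-line argument is the date, e.g. 2021-09-17
day = sys.argv[1]
df = spark.sql("select * from table_name where date='%s'" % day)

Then submit the script with the date as an argument:

spark-submit --master yarn test.py 2021-09-17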
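
To cover the other half of the question (today's date from time.strftime plus a name held in a Python variable), here is a minimal sketch that builds the query string before handing it to spark.sql. The helper build_query is a hypothetical name, not part of PySpark, and the quote check is an assumption about what counts as a safe value; it rejects suspicious input instead of trying to escape it. The spark.sql call is left as a comment so the snippet runs without a Spark installation.

```python
import time


def build_query(day: str, name: str) -> str:
    """Build the question's SELECT with day and name inlined as string literals."""
    for value in (day, name):
        # Assumption: values are plain text. Refuse quotes and backslashes
        # rather than guess at SQL escaping rules.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsafe value for SQL literal: {value!r}")
    return f"select * from table_name where date='{day}' and name='{name}'"


today = time.strftime("%Y-%m-%d")  # current date in the same format as '2021-05-16'
name = "xxxx"                      # in practice this comes from another variable
query = build_query(today, name)
print(query)

# With the SparkSession from the question in scope:
# df = spark.sql(query)
```

If your PySpark version supports it (3.4 or later, as far as I know), spark.sql also accepts named parameter markers through an args dictionary, which avoids string formatting altogether; check the documentation for your version before relying on that.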

Upvotes: 1
