Anuj Mahajan

Reputation: 57

Using loop to create spark SQL queries

I am trying to create some Spark SQL queries for different tables which I have collected as a list. I want to create SQL queries for all the tables present in the Hive database. The Hive context has been initialized. Following is my approach:

tables = spark.sql("show tables in survey_db")

# registering dataframe as temp view with 2 columns - tableName and db name
tables.createOrReplaceTempView("table_list") 

# collecting my table names in a list
table_array= spark.sql("select collect_list(tableName) from table_list").collect()[0][0] 

# the resulting list of table names
table_array= [u'survey',u'market',u'customer'] 

I want to create Spark SQL queries for the table names stored in table_array, for example:

for i in table_array:
    spark.sql("select * from survey_db.'i'")

I can't use shell scripting, as I have to write a PySpark script for this. Please advise whether spark.sql queries can be created using a loop/map. Thanks, everyone.

Upvotes: 2

Views: 6273

Answers (1)

Felix K Jose

Reputation: 882

You can achieve the same as follows:

# build one query string per table name
sql_list = [f"select * from survey_db.{table}" for table in table_array]

# run each query and display its result
for sql in sql_list:
    df = spark.sql(sql)
    df.show()
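Since the table names are interpolated directly into SQL text, it can also help to wrap each name in backticks so that names containing reserved words or unusual characters still parse. A minimal sketch of the string-building step, assuming the same `table_array` as in the question (the commented-out line assumes an active `spark` session):

```python
table_array = [u'survey', u'market', u'customer']

# Build one query per table, quoting the name with backticks so that
# reserved words or unusual characters in a table name still parse.
sql_list = [f"select * from survey_db.`{table}`" for table in table_array]

# The results can then be kept keyed by table name, e.g.:
# dfs = {table: spark.sql(sql) for table, sql in zip(table_array, sql_list)}

print(sql_list[0])  # select * from survey_db.`survey`
```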

Upvotes: 2
