Reputation: 643
I am trying to create a dashboard that uses a query parameter to add dynamic column names to a query. Here is the query:
SELECT :column_names FROM table1
The result of the query looks like this:
array(col1, col2, col3)
["col1","col2","col3"]
I have tried exploding the array:
SELECT explode(:column_names) FROM table1
Result:
col
col1
col2
col3
I want to create a query like
SELECT col1, col2, col3 FROM table1
and get the actual records from the table in return. Is it possible to do this?
Upvotes: 0
Views: 64
Reputation: 3250
I have tried the approach below:
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Build a sample DataFrame to demonstrate dynamic column selection
spark = SparkSession.builder.appName("DynamicColumns").getOrCreate()
data = [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")]
columns = ["col1", "col2", "col3"]
df = spark.createDataFrame(data, columns)

# Column names to select dynamically (e.g. taken from a parameter)
column_names = ["col1", "col2", "col3"]

# Turn each name into a Column object and unpack the list into select()
selected_columns = [col(column) for column in column_names]
result_df = df.select(*selected_columns)
result_df.show()
Results:
+----+----+----+
|col1|col2|col3|
+----+----+----+
| 1| a| x|
| 2| b| y|
| 3| c| z|
+----+----+----+
In the code above, I defined a list of column names to select dynamically. Using a list comprehension, I created a col() object for each column name, which lets me select the desired columns from the DataFrame dynamically. Then, by passing that list to select() with the * operator, the query returns only the specified columns from the DataFrame.
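If the goal is to end up with an actual SQL statement like SELECT col1, col2, col3 FROM table1, a similar result can be reached by joining the column names into the query text. This is only a minimal sketch building on the code above; it assumes the DataFrame is registered as a temp view named table1, and the string construction is illustrative rather than a Databricks parameter feature.
# Sketch: build the SELECT statement as text from the same column_names list.
# Assumes the DataFrame above is registered as a temp view called "table1".
df.createOrReplaceTempView("table1")

column_list = ", ".join(column_names)        # "col1, col2, col3"
query = f"SELECT {column_list} FROM table1"  # SELECT col1, col2, col3 FROM table1
spark.sql(query).show()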
Upvotes: 0