Reputation: 69
I need to pass various combinations of column sets to my SQL query as a parameter, e.g.:
val result = sqlContext.sql(""" select col1,col2,col3,col4,col5,count(col6) from T1 GROUP BY col1,col2,col3,col4,col5 GROUPING SETS ((col1,col2),(col3,col4),(col4,col5)) """)
There are several combinations for which I need to find the aggregated value. Is there any way to pass these sets of columns as a parameter to the SQL query instead of hard-coding them manually?
Currently I have provided all the combinations in the SQL query, but if any new combination comes up I would need to change the query again. I am planning to keep all the combinations in a file, read them all, and pass them as a parameter to the SQL query. Is that possible?
Example table:
id  category  age  gender  cust_id
1   101       54   M       1111
1   101       54   M       2222
1   101       55   M       3333
1   102       55   F       4444
""" select id, category, age, gender, count(cust_id) from T1 GROUP BY id, category, age, gender
GROUPING SETS ((id,category),(id,age),(id,gender)) """
It should produce the result below:
group by (id, category) - count of cust_id
1 101 3
1 102 1
group by (id, age) - count of cust_id
1 54 2
1 55 2
group by (id, gender) - count of cust_id
1 M 3
1 F 1
This is just an example. I need to pass various different combinations (not all combinations) to GROUPING SETS in the same way, as a parameter, either in one go or separately.
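The "separately" option could look like this: generate one query string per column set and run each in turn. A minimal sketch (table and column names are taken from the example above; the loop over sqlContext.sql is left commented):

```scala
// One query per column set, so the aggregations run "separately".
val sets = List(Seq("id", "category"), Seq("id", "age"), Seq("id", "gender"))
val queries = sets.map { cols =>
  s"select ${cols.mkString(", ")}, count(cust_id) from T1 group by ${cols.mkString(", ")}"
}
// queries.foreach(q => sqlContext.sql(q).show)   // run each one against Spark
```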
Any help would be really appreciated.
Thanks a lot.
Upvotes: 0
Views: 1874
Reputation: 2020
You can build the SQL dynamically:
// original slices
var slices = List("(col1, col2)", "(col3, col4)", "(col4, col5)")
// adding new slice
slices = "(col1, col5)" :: slices
// building SQL dynamically
val q =
s"""
with t1 as
(select 1 col1, 2 col2, 3 col3,
4 col4, 5 col5, 6 col6)
select col1,col2,col3,col4,col5,count(col6)
from t1
group by col1,col2,col3,col4,col5
grouping sets ${slices.mkString("(", ",", ")")}
"""
// output
spark.sql(q).show
Result
scala> spark.sql(q).show
+----+----+----+----+----+-----------+
|col1|col2|col3|col4|col5|count(col6)|
+----+----+----+----+----+-----------+
| 1|null|null|null| 5| 1|
| 1| 2|null|null|null| 1|
|null|null| 3| 4|null| 1|
|null|null|null| 4| 5| 1|
+----+----+----+----+----+-----------+
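The slices list above could equally be loaded from a file, which covers the "keep all the combinations in a file" part of the question. A sketch, assuming a hypothetical file sets.txt with one comma-separated column set per line (the stand-in list below mimics its contents):

```scala
// In practice: val lines = scala.io.Source.fromFile("sets.txt").getLines().toList
val lines = List("col1,col2", "col3,col4")        // stand-in for the file's lines
val slices = lines.map(l => s"($l)")              // wrap each set in parentheses
val setsClause = slices.mkString("(", ",", ")")   // "((col1,col2),(col3,col4))"
val q = s"""
  select col1,col2,col3,col4,col5,count(col6)
  from t1
  group by col1,col2,col3,col4,col5
  grouping sets $setsClause
"""
// spark.sql(q).show   // a new combination now only needs a new line in the file
```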
Upvotes: 1
Reputation: 1
combination of column sets to my sql query as parameter
The SQL is executed by Spark, not by the source database; it won't reach MySQL at all.
I have provided all the combination
You don't need GROUPING SETS if you want all possible combinations. Just use CUBE:
SELECT ... FROM table GROUP BY CUBE (col1,col2,col3,col4,col5)
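Note that CUBE over n columns groups by every subset of those columns (2^n grouping sets), which is why the asker's "not all combinations" requirement still points to GROUPING SETS. A quick sketch enumerating the sets that CUBE(id, gender) implies:

```scala
// All subsets of the column list: the grouping sets implied by CUBE(id, gender)
val cols = List("id", "gender")
val combos = (0 to cols.length).flatMap(cols.combinations).toList
// 2^2 = 4 sets: (), (id), (gender), (id, gender)
```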
Upvotes: 0