Reputation: 573
I have a Table lets say :-
Name A B C D
------- --- --- --- ---
alpha 0 1 0 0.6
beta 0.6 0 0 0.1
gama 0 0 0 0.6
Now I want to populate values on Two columns(Result & Class)
based on A, B, C, D values.
The condition is if value in any of the field(A,B,C,D) is >.5 then Result
column should have "F" else it should have "P". Also the column whose valie is >.5 should be in Class
example("A,D")
For better understanding here is the result I want:-
Name A B C D Result Class
------- --- --- --- --- -------- -------
alpha 0 1 0 0.6 F B,D
beta 0.6 0 0 0.1 F A
gama 0 0 0 0.4 P NULL
I am New to BigQuery and need Help. What would be workaround.
This what I have done till yet
SELECT *, CASE WHEN (A > .5 OR B > .5 OR C > .5 OR D >.5)
THEN 'F'
ELSE 'P' END AS Result AND Class....//here i am stuck
FROM table1
Actually, I have no Idea how to Build this exact Script. I was able to achieve first part where I was able to Populate Result column with "F" and "P" but could not make Class to populate column names....
Upvotes: 1
Views: 1379
Reputation: 172993
Below is for BigQuery Standard SQL
Using javaScript UDF helps in many cases but should be avoid if problem can be solved with SQL as in below example
#standardSQL
SELECT *,
( SELECT IF(LOGICAL_OR(val > 0.5), 'F', 'P')
FROM UNNEST([A,B,C,D]) val
) AS Result,
( SELECT STRING_AGG(['A','B','C','D'][OFFSET(pos)])
FROM UNNEST([A,B,C,D]) val WITH OFFSET pos
WHERE val > 0.5
) AS Class
FROM `project.dataset.table`
You can test , play with above using sample data from y our question as in below example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'alpha' name, 0 A, 1 B, 0 C, 0.6 D UNION ALL
SELECT 'beta', 0.6, 0, 0, 0.1 UNION ALL
SELECT 'gamma', 0, 0, 0, 0.4
)
SELECT *,
( SELECT IF(LOGICAL_OR(val > 0.5), 'F', 'P')
FROM UNNEST([A,B,C,D]) val
) AS Result,
( SELECT STRING_AGG(['A','B','C','D'][OFFSET(pos)])
FROM UNNEST([A,B,C,D]) val WITH OFFSET pos
WHERE val > 0.5
) AS Class
FROM `project.dataset.table`
with output as
Row name A B C D Result Class
1 alpha 0.0 1 0 0.6 F B,D
2 beta 0.6 0 0 0.1 F A
3 gamma 0.0 0 0 0.4 P null
Upvotes: 2
Reputation: 4042
Since you are analysing each column, I assume you do not have a extensive quantity of columns. Therefore, I created a simple JavaScript User Defined Function (UDF) in order to check the row's value and return the column's name if the condition is met.
I have used the provided sample data to test the below query.
#javaScript UDF
CREATE TEMP FUNCTION class(A FLOAT64, B FLOAT64, C FLOAT64, D FLOAT64)
RETURNS String
LANGUAGE js AS """
var class_array=[];
if(A > 0.5){class_array.push("A");}
if(B > 0.5){class_array.push("B");}
if(C > 0.5){class_array.push("C");}
if(D > 0.5){class_array.push("D");}
return class_array;
""";
#sample data
WITH data as (
SELECT "alpha" as Name, 0 as A, 1 as B, 0 as C, 0.6 as D UNION ALL
SELECT "beta", 0.6, 0, 0, 0.1 UNION ALL
SELECT "gama", 0, 0, 0, 0.4
)
Select name, A,B,C,D,
CASE WHEN (A > .5 OR B > .5 OR C > .5 OR D >.5) THEN "F" ELSE "P" END AS Result,
IF(class(A,B,C,D) is null , null, class(A,B,C,D)) as Class from data
And the output,
Row name A B C D Result Class
1 alpha 0 1 0 0.6 F B,D
2 beta 0.6 0 0 0.1 F A
3 gama 0 0 0 0.4 P
As it is shown within the UDF, each row's value is analysed and if the condition is met, the column's name is manually added to an array of strings. In addition, pay attention that the JS UDF returns a String, not an array. It automatically converts the previously created Array to String.
Lastly, I should point that is not possible to retrieve the column name within a query in this context. Although, you can retrieve it, in other scenarios, using INFORMATION_SCHEMA.
Upvotes: 2