Reputation: 7733
I have a table like as shown below
subject_id Desc Name class BC FU PA VI
1 Fung FU 0 1 0 0
1 Para PA 0 0 1 0
1 Viru VI 0 0 0 1
1 Para PA 0 0 1 0
1 T5 Bacte BC 1 0 0 0
1 T6 Bacte BC 1 0 0 0
2 T5 Bacte BC 1 0 0 0
2 Fung FU 1 0 0 0
What I would like to do is create a new column "BC_FU" which will have 1 if the subject has 1 in BC column and FU column. But it shouldn't be looking for in the same row but across all the records of the same subject.
The same logic applies for BC_VI column (another new column) as well
For ex: If you look at subject_id = 1, we can see that he has 1 in BC column at row 5 and 6 whereas he has FU = 1 at row 1. Now we know that subject_id has both BC and FU in this records (from database)
This is what I tried but doesn't help. I am trying to write in BigQuery, so your help to correct or fix this in BigQuery will be helpful
select *,
CASE WHEN (MAX(BC) == 1 AND MAX(FU) == 1) THEN 1
ELSE 0 END AS BC_FU,
CASE WHEN (MAX(BC) == 1 AND MAX(VI) == 1) THEN 1
ELSE 0 END AS BC_VI,
FROM TABLE T
GROUP BY SUBJECT_ID
So, I would like to create an output whick looks like below
subject_id Desc Name class BC FU PA VI BC_FU BC_VI
1 Fungi FU 0 1 0 0 1 1
1 Para PA 0 0 1 0 1 1
1 Virus VI 0 0 0 1 1 1
1 Para PA 0 0 1 0 1 1
1 T5 Bacte BC 1 0 0 0 1 1
1 T6 Bacte BC 1 0 0 0 1 1
2 T5 Bacte BC 1 0 0 0 1 1
2 Virus VI 0 1 0 1 1 1
Upvotes: 1
Views: 554
Reputation: 805
I've read your question and let me add my solution as well.
SELECT t1.*,
CASE WHEN (MAX(BC) OVER (PARTITION BY subject_id) +
MAX(FU) OVER (PARTITION BY subject_id) ) == 2
THEN 1
ELSE 0
END AS BC_FU
CASE WHEN (MAX(BC) OVER(PARTITION BY subject_id) +
MAX(VI) OVER(PARTITION BY subject_id) ) == 2
THEN 1
ELSE 0
END AS BC_VI
FROM table as t1
ORDER BY subject_id
Here are some tips which I thought might be useful for you:)
OVER () is sometimes referred to as window function. If you type SELECT (aggregating function) OVER(PARTITION BY columnA)
you can use the aggregation function but not turning the result into single row at the same time. (Please ignore this if you already know)
As Mr. Tim already mentioned, it's good to remove unnecessary sub-query (a query inside the other) to increase the readability.
Be sure to add 'Else 0' in every CASE sentence so as not to make the possibility of causing 'NULL'.
Here, I chose to sum up the maximum value of BC and FU to check if it's 2 or not,
rather than not checking 'intersection' (e.g. max(BC) ==1 AND max(FU) ==1) like Mr.Tim's post.
That is because I thought you would probably add such column as 'BC_FU_VI' in the future,
when '2' would increase readability that this case sentence is actually trying to convert two columns into a single column.
Thank you.
Upvotes: 1
Reputation: 5932
You can do the following
select t1.*
,max(BC) over(partition by subject_id)
*max(FU) over(partition by subject_id) as BC_FU
,max(BC) over(partition by subject_id)
*max(VI) over(partition by subject_id) as BC_VI
from your_table t1
Upvotes: 2
Reputation: 522074
If I understand your requirement correctly, you should just be able to use analytic functions here:
SELECT *,
CASE WHEN MAX(BC) OVER (PARTITION BY subject_id) +
MAX(FU) OVER (PARTITION BY subject_id) = 2 THEN 1 ELSE 0 END AS BC_FU,
CASE WHEN MAX(BC) OVER (PARTITION BY subject_id) +
MAX(VI) OVER (PARTITION BY subject_id) = 2 THEN 1 ELSE 0 END AS BC_VI
FROM yourTable
ORDER BY subject_id;
This answer avoids the need for an unnecessary subquery.
Upvotes: 1
Reputation: 204854
select t1.*, tmp.BC_FU, tmp.BC_VI
from your_table t1
join
(
select subject_id,
CASE WHEN MAX(BC) + MAX(FU) = 2 THEN 1 ELSE 0 END AS BC_FU,
CASE WHEN MAX(BC) + MAX(VI) = 2 THEN 1 ELSE 0 END AS BC_VI
from your_table
group by subject_id
) tmp on t1.subject_id = tmp.subject_id
Upvotes: 1