user1330974
user1330974

Reputation: 2616

How to roll up based on a few criteria in SQL

I have a data table like this:

QuestionID    UserName    UserWeightingForQuestion    AnswerGivenForQuestion    Metric
1             A           1.50                        1                         ToBeCalculated
1             B           1.00                        2                         ToBeCalculated
1             C           1.80                        3                         ToBeCalculated
1             D           1.20                        1                         ToBeCalculated
1             E           1.40                        2                         ToBeCalculated
2             A           1.20                        2                         ToBeCalculated
2             B           1.20                        2                         ToBeCalculated
2             C           1.10                        4                         ToBeCalculated
2             D           1.20                        5                         ToBeCalculated
...

For each question group, I'd like to fill each cell under Metric column with a calculated value defined as shown below:

Metric_For_User_A_For_QuestionID_X = SUM(Weights_With_The_Answer_Similar_To_What_Is_Given_By_User_A_In_QuestionID_Group = X) / DISTINCT(All_WEeights_In_One_QuestionID_Group = X)

Specifically speaking,

Metric_For_User_A_For_QuestionID_1 = SUM(1.50+1.20)/(1.50+1.00+1.80+1.20+1.40)
Metric_For_User_B_For_QuestionID_1 = SUM(1.00+1.40)/(1.50+1.00+1.80+1.20+1.40)
Metric_For_User_C_For_QuestionID_1 = SUM(1.80)/(1.50+1.00+1.80+1.20+1.40)
Metric_For_User_D_For_QuestionID_1 = SUM(1.50+1.20)/(1.50+1.00+1.80+1.20+1.40)
Metric_For_User_E_For_QuestionID_1 = SUM(1.00+1.40)/(1.50+1.00+1.80+1.20+1.40)

For QuestionID group = 2, I'd like to repeat the process as above. For example,

Metric_For_User_A_For_QuestionID_2 = SUM(1.20+1.20)/(1.20+1.10)

I'm fairly new to SQL and I believe the OVER or some sort of aggregation function can be utilized to achieve this(?) If this kind of calculation is possible in SQL, could someone with SQL expertise suggest me a way to achieve what I'm trying to calculate.

The raw table has ~70m rows, and I am using SQL Server. Thank you very much in advance for your suggestions and answers!

Upvotes: 0

Views: 659

Answers (2)

McNets
McNets

Reputation: 10807

declare @quest table(QuestionID int
                     , UserName varchar(20)
                     , UserWeightingForQuestion decimal(10,2)
                     , AnswerGivenForQuestion int);
insert into @quest values
(1,'A',1.50,1),(1,'B',1.00,2),(1,'C',1.80,3),(1,'D',1.20,1),
(1,'E',1.40,2),(2,'A',1.20,2),(2,'B',1.20,2),(2,'C',1.10,4),(2,'D',1.20,5);

Baicaly you made two partitions, one by QuestionID and AnswerGivenForQuestion, and another by QuestionID.

WITH CALC AS
(
    SELECT Q2.QuestionID, Q2.UserName, 
           SUM(UserWeightingForQuestion) OVER (PARTITION BY QuestionID, AnswerGivenForQuestion) AS Weight,
           (SELECT SUM(DISTINCT Q1.UserWeightingForQuestion)
            FROM @quest Q1
            WHERE Q1.QuestionID = Q2.QuestionID) AS AllWeights
    FROM @quest Q2
)
SELECT QuestionID, UserName, Weight, AllWeights, 
       CAST(Weight / AllWeights AS DECIMAL(18,2)) as Metric
FROM CALC
ORDER BY QuestionID, UserName;

+------------+----------+--------+------------+--------+
| QuestionID | UserName | Weight | AllWeights | Metric |
+------------+----------+--------+------------+--------+
|      1     |     A    |  2,70  |    6,90    |  0,39  |
|      1     |     B    |  2,40  |    6,90    |  0,35  |
|      1     |     C    |  1,80  |    6,90    |  0,26  |
|      1     |     D    |  2,70  |    6,90    |  0,39  |
|      1     |     E    |  2,40  |    6,90    |  0,35  |
+------------+----------+--------+------------+--------+
|      2     |     A    |  2,40  |    2,30    |  1,04  |
|      2     |     B    |  2,40  |    2,30    |  1,04  |
|      2     |     C    |  1,10  |    2,30    |  0,48  |
|      2     |     D    |  1,20  |    2,30    |  0,52  |
+------------+----------+--------+------------+--------+

Upvotes: 1

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

You can use the SUM window function to do this.

select t.*,
sum(UserWeightingForQuestion) over(partition by questionID,AnswerGivenForQuestion)
/sum(UserWeightingForQuestion) over(partition by questionID) as metric
from tablename t
  • sum(UserWeightingForQuestion) over(partition by questionID) gets the sum of all UserWeightingForQuestion per questionID

  • sum(UserWeightingForQuestion) over(partition by questionID,AnswerGivenForQuestion) sums up the similar UserWeightingForQuestion per questionID

Edit: To sum up the distinct weights for each questionID in the denominator, use

select t.*,
sum(UserWeightingForQuestion) over(partition by questionID,AnswerGivenForQuestion)
/(select sum(distinct UserWeightingForQuestion) from tablename where t.questionID=questionID) as metric
from tablename t

Upvotes: 1

Related Questions