SheppardDigital
SheppardDigital

Reputation: 3265

Using GROUP BY to sum the values of columns in rows

I have a table that has the following fields;

id (int)
line_id (int)
cell_no (int)
timestamp (datetime)
total (int)
passed (int)
data_type (varchar)
created_at (datetime)
updated_at (datetime)

With the table, there are multiple rows which contain a cell_no of the same value. I want to query the table, merge all rows with the same cell_no value, sum the total of passed and total for each distinct cell_no, and then return the results sorted by passed / total in ascending order.

I can get this working fine in MySQL, but I can't get it working in MSSQL as it complains that some fields are not in the group by definition, but when I put them in the group by, the rows where the cell_no is the same are not merged.

Here's how I have the query in MySQL;

SELECT id, cell_no, data_type, sum(total) AS total, sum(passed) AS passed
FROM line_data AS ld
WHERE line_id = 1
      AND timestamp >= '2018-07-29 00:00:00'
      AND timestamp <= '2018-07-29 23:00:00'
GROUP BY cell_no
ORDER BY ((passed / total) * 100) ASC

How would I present this to MSSQL?

Upvotes: 0

Views: 518

Answers (3)

Igor Quirino
Igor Quirino

Reputation: 1195

You have two mistakes:

  • You can only select fields aggregated or specified on group by.
  • You have to aggregate/sum the totals on order by.

You have three options to fix them.

  1. Remove id column from select

    SELECT cell_no, SUM(total) AS total, SUM(passed) AS passed
    FROM line_data AS ld
    WHERE line_id = 1
      AND timestamp >= '2018-07-29 00:00:00'
      AND timestamp <= '2018-07-29 23:00:00'
    GROUP BY cell_no
    ORDER BY ((SUM(passed)/ SUM(total)) * 100) ASC
    
  2. Add id column to your group by:

    SELECT 
        id, cell_no, data_type, 
        SUM(total) AS total, SUM(passed) AS passed
    FROM line_data AS ld
    WHERE line_id = 1
      AND timestamp >= '2018-07-29 00:00:00'
      AND timestamp <= '2018-07-29 23:00:00'
    GROUP BY id, cell_no, data_type
    ORDER BY ((SUM(passed) / SUM(total)) * 100) ASC
    
  3. Aggregate id column (with max, min or ....)

    SELECT 
        MAX(id), cell_no, data_type, 
        SUM(total) AS total, SUM(passed) AS passed
    FROM line_data AS ld
    WHERE line_id = 1
      AND timestamp >= '2018-07-29 00:00:00'
      AND timestamp <= '2018-07-29 23:00:00'
    GROUP BY cell_no, data_type
    ORDER BY ((SUM(passed)/ SUM(total)) * 100) ASC
    

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 95101

Your problem is that your query is invalid SQL. Per cell_no there exist multiple id and data_type and you don't tell the DBMS which to pick.

MySQL silently converts your select clause to

SELECT
  ANY_VALUE(id),
  cell_no,
  ANY_VALUE(data_type),
  SUM(total) AS total,
  SUM(passed) AS passed

The function ANY_VALUE(id) means you don't care which of the cell_no's IDs to pick for the results.

ANY_VALUE is not available in SQL Server, but as you obviously don't care which value, you can replace it by MIN or MAX:

SELECT
  MIN(id),
  cell_no,
  MIN(data_type),
  SUM(total) AS total,
  SUM(passed) AS passed

UPDATE: Another issue may be your ORDER BY clause. 7 / 2 is 3.5 in MySQL (as should be expected), but only 3 in SQL Server, because SQL Server sees a division of two integers and wants the result to be an integer, too. You can convert one or both integers to decimal to get a decimal result:

ORDER BY (CAST(passed AS DECIMAL) / total) * 100 ASC

Upvotes: 2

Wyatt Shipman
Wyatt Shipman

Reputation: 1789

All fields in the Select statement that aren't being aggregated need to be in the Group by.

SELECT id, cell_no, data_type, sum(total) AS total, sum(passed) AS passed, (sum(passed)/sum(total)*100) as percent_passed FROM line_data AS ld WHERE line_id = 1 AND timestamp >= '2018-07-29 00:00:00' AND timestamp <= '2018-07-29 23:00:00' GROUP BY id, cell_no, data_type ORDER BY 6 ASC

Edited to include the percent passed in the select statement, and then order by 6 to order by column 6.

Upvotes: 0

Related Questions