learning fun
learning fun

Reputation: 97

Can't use SUM in SQL when the value is not integer

I am using below SQL in SNOWFLAKE and couldn't able to get the count of the instance. The end goal that i am trying to achieve is that i want to know that count of teams and instances that have the same the same Issue.

My SQL query:

SELECT 
    priority,
    QID,
    issue,
    solution,
    team,
    instance,
    COUNT(team) AS TEAM_COUNT,
    SUM(instance) AS Instance_count
FROM  
    ABCDTABLE
WHERE 
    priority != 'NONE'
GROUP_BY
    priority, QID, issue, solution, team, instance
ORDER_BY
    priority, QID, issue, solution, team, instance

I get this error:

Numeric value 'i-21232' is not recognized

Sample data:

Priority    QID           Issue                SOLUTION       TEAM    INSTANCE
HIGH        123   Wrong package version     Update Package.   Devops.  i-21232
HIGH        123   wrong package version     update package.   infra.   i-12341
MEDIUM      721.  SSH vulnerability.      Remove old version.  App     i-323232

Expected Output:

Priority  QID.           Issue              Solution      count(TEAM) COUNT(INSTANCE)
HIGH.     123.   Wrong Package Version.   Update package.       2                2

Thank You

Upvotes: 1

Views: 1203

Answers (4)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 25968

So if we take the pasted SQL and data. then fixed the GROUP BY and ORDER BY and change SUM(instance) to COUNT(instance)

WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
    SELECT * FROM VALUES
    ('HIGH',  123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
    ('HIGH',  123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
    ('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT 
    priority,
    QID,
    issue,
    solution,
    team,
    instance,
    COUNT(team) AS TEAM_COUNT,
    COUNT(instance) AS Instance_count
FROM  
    ABCDTABLE
WHERE 
    priority != 'NONE'
GROUP BY
    priority, QID, issue, solution, team, instance
ORDER BY
    priority, QID, issue, solution, team, instance

we get some perfectly valid SQL, which returns:

PRIORITY QID ISSUE SOLUTION TEAM INSTANCE TEAM_COUNT INSTANCE_COUNT
HIGH 123 Wrong package version Update Package. Devops. i-21232 1 1
HIGH 123 wrong package version update package. infra. i-12341 1 1
MEDIUM 721 SSH vulnerability. Remove old version. App i-323232 1 1

which is not what you want.

So like everybody else, removing TEAM & INSTANCE from the selection and groupings would give better results.

But you input data is clearly inconsistent. so to lowering will improve that, but then our group by should be swapped to the output column references 1,2,3,4

WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
    SELECT * FROM VALUES
    ('HIGH',  123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
    ('HIGH',  123, 'wrong package version', 'Update Package.', 'infra.', 'i-12341'),
    ('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT 
    lower(priority) as priority,
    QID,
    lower(issue) as issue,
    lower(solution) as solution,
    COUNT(team) AS TEAM_COUNT,
    COUNT(instance) AS Instance_count
FROM  ABCDTABLE
WHERE priority != 'NONE'
GROUP BY 1,2,3,4
ORDER BY 1,2,3,4

giving:

PRIORITY QID ISSUE SOLUTION TEAM_COUNT INSTANCE_COUNT
high 123 wrong package version update package. 2 2
medium 721 ssh vulnerability. remove old version. 1 1

And then as Chris notes, this has the row with a count of one. So you ether made a typo in your example data.

which can be achieved by adding a HAVING TEAM_COUNT > 1 after the GROUP BY clause. You can reuse a already named output value like I have, or count all rows like Chris notes with count(*)

Then there is one point you make in a comment:

the reason i want to use SUM for the instance is that i want to get the total count of the instances that are having the same issue. If i use count on instance, it doesn't show total count.

I translate that into you want the distinct count, which if we boost the data to:

SELECT * FROM VALUES
    ('HIGH',  123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
    ('HIGH',  123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
    ('HIGH',  123, 'wrong package version', 'update package.', 'other', 'i-12341'),
    ('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')

I read that as your want 2, not 3, thus COUNT(DISTINCT(instance)) can be used.

WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
    SELECT * FROM VALUES
    ('HIGH',  123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
    ('HIGH',  123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
    ('HIGH',  123, 'wrong package version', 'update package.', 'other', 'i-12341'),
    ('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT 
    lower(priority) as priority,
    QID,
    lower(issue) as issue,
    lower(solution) as solution,
    COUNT(team) AS TEAM_COUNT,
    COUNT(distinct instance) AS Instance_count
FROM  ABCDTABLE
WHERE priority != 'NONE'
GROUP BY 1,2,3,4
HAVING TEAM_COUNT > 1
ORDER BY 1,2,3,4

gives:

PRIORITY QID ISSUE SOLUTION TEAM_COUNT INSTANCE_COUNT
high 123 wrong package version update package. 3 2

Upvotes: 1

Cristian Scutaru
Cristian Scutaru

Reputation: 1507

You have the following problems:

  1. INSTANCE contains non-numeric values, so you cannot use it with SUM. It should be COUNT as well.
  2. You can either use a field as it is, or with an aggregate in the SELECT projection. Never both. So you cannot have both team and COUNT(team), or instance and COUNT(instance), in the SELECT clause. Remove team and instance, if you use COUNT(team) and COUNT(instance).
  3. Remove from GROUP BY fields you used as aggregates in SELECT. I mean team and instance.
  4. If you used aggregates in SELECT, your ORDER BY clause can use the same aggregates, not the fields alone. For instance, you can sort by COUNT(team) or COUNT(instance), not by team or instance.
  5. (later edit) It should be also GROUP BY and ORDER BY, with no underscore.
  6. (later edit) Your grouping must use a case insensitive issue and solution fields.
  7. (later edit) Added HAVING count(*) > 1, because you want groups with at least two rows.

Here is a possible correct version of your query:

SELECT 
    priority,
    QID,
    lower(issue),
    lower(solution),
    COUNT(team) AS TEAM_COUNT,
    COUNT(instance) AS Instance_count
FROM  
    ABCDTABLE
WHERE 
    priority != 'NONE'
GROUP BY
    1, 2, 3, 4
HAVING
    count(*) > 1
ORDER BY
    1, 2, 3, 4;

db-fiddle

Upvotes: 0

L.Newell
L.Newell

Reputation: 112

You may need to remove the TEAM and INSTANCE columns from the GROUP BY, and change the SUM(INSTANCE) to a COUNT(INSTANCE).

As stated above, you can't SUM data the contains non-numerical values.

E.g. does the below work for you?

SELECT 
    priority,
    QID,
    issue,
    solution,
    team,
    instance,
    COUNT(team) AS TEAM_COUNT,
    COUNT(instance) AS Instance_count
FROM  
    ABCDTABLE
WHERE 
    priority != 'NONE'
GROUP BY
    priority, QID, issue, solution
ORDER BY
    priority, QID, issue, solution, team, instance

Upvotes: 1

This query will give the desired result

SELECT 
    priority,
    QID,
    issue,
    solution,
    COUNT(team) AS TEAM_COUNT,
    COUNT(instance) AS Instance_count
FROM  
    ABCDTABLE
WHERE 
    priority != 'NONE'
GROUP_BY
    priority, QID, issue, solution
ORDER_BY
    priority, QID, issue, solution

Upvotes: 0

Related Questions