Neeraj Rana
Neeraj Rana

Reputation: 43

mysql - sum of json elements, group by and select all

I have the following table structure and data:

DumpTime ProcId NodeId Stats
2022-07-07 13:03:03 9 NODE_IDD02 {"First":1,"Second":5,"Third":2}
2022-07-07 13:17:35 8 NODE_IDD03 {"First":2,"Second":3,"Third":4}
2022-07-07 13:22:35 9 NODE_IDD02 {"First":0,"Second":3,"Third":0}
2022-07-07 13:03:03 0 NODE_IDD01 {"First":0,"Second":0,"Third":5}
2022-07-07 13:17:35 8 NODE_IDD03 {"First":4,"Second":1,"Third":1}
2022-07-07 13:22:35 0 NODE_IDD01 {"First":3,"Second":3,"Third":3}

I have to group the rows using ProcId and NodeId and add the respective values of the JSON elements in Stats (The DumpTime has no significance).

The output should be:

DumpTime ProcId NodeId Stats
2022-07-07 13:03:03 9 NODE_IDD02 {"First":1,"Second":8,"Third":2}
2022-07-07 13:17:35 8 NODE_IDD03 {"First":6,"Second":4,"Third":5}
2022-07-07 13:03:03 0 NODE_IDD01 {"First":3,"Second":3,"Third":8}

I am able to get the sum of respective json elements but it is the total sum and not grouped. I used sum(json_extract()) for it.

Also, output has to be in the specified form so that it can be unmarshaled into a struct. What should be the sql query to achieve the same?

Upvotes: 1

Views: 400

Answers (1)

Alexey
Alexey

Reputation: 2479

To get what you need you can use a query like this

SELECT 
    MIN(data.dump_time) AS DumpTime,
    data.proc_id AS ProcId,
    data.node_id AS NodeId,
    JSON_OBJECTAGG(q.attr, q.attr_total) AS Stats
FROM (
    SELECT 
        proc_id,
        attr,
        CAST(SUM(JSON_EXTRACT(stats, CONCAT('$.', attr))) AS SIGNED) AS attr_total 
    FROM  data
    CROSS JOIN JSON_TABLE(CAST(JSON_KEYS(stats) AS JSON), '$[*]' COLUMNS(attr VARCHAR(20) PATH '$')) t
    GROUP BY proc_id, attr
) q
JOIN data ON q.proc_id = data.proc_id
GROUP BY data.proc_id, data.node_id

You can check a working demo here


If all stats values have the same attributes ("First", "Second", "Third") then you can make it a little simplier like this

SELECT 
    MIN(data.dump_time) AS DumpTime,
    data.proc_id AS ProcId,
    data.node_id AS NodeId,
    JSON_OBJECTAGG(q.attr, q.attr_total) AS Stats
FROM (
    SELECT 
        proc_id,
        attr,
        CAST(SUM(JSON_EXTRACT(stats, CONCAT('$.', attr))) AS SIGNED)  AS attr_total 
    FROM  data
    CROSS JOIN (VALUES ROW("First"), ROW("Second"), ROW("Third")) t(attr)
    GROUP BY proc_id, attr
) q
JOIN data ON q.proc_id = data.proc_id
GROUP BY data.proc_id, data.node_id

You can check it here

Upvotes: 1

Related Questions