Reputation: 65550
I am working with Postgres 9.4. I have a JSONB field:
Column │ Type │ Modifiers
─────────────────┼──────────────────────┼────────────────────────────────────────────────────────────────────
id │ integer │ not null default
practice_id │ character varying(6) │ not null
date │ date │ not null
pct_id │ character varying(3) │
astro_pu_items │ double precision │ not null
astro_pu_cost │ double precision │ not null
star_pu │ jsonb │
I can query the raw values of the JSONB field just fine:
SELECT star_pu FROM mytable limit 1;
star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877}
Now I want SUM
the JSONB values across the entire table, but I don't know how to do this. Ideally I would get back a dictionary, where the keys are as above and the values are summed values.
I can do the following to SUM
one JSONB field explicitly:
SELECT date, SUM(total_list_size) as total_list_size,
SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items
FROM mytable GROUP BY date ORDER BY date
But how do I calculate the sums for all the attributes in the JSONB field - and preferably return the entire field as a dictionary? Ideally I'd get back something like:
star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ...
I guess I can get each field manually by SUMming each key explicitly, but the whole reason I have the JSONB field is that there are lots of keys and they may change.
It is safe to assume that the JSONB field only contains keys and values, i.e. has depth 1.
Upvotes: 10
Views: 15515
Reputation: 121
Here is implementation of a stored procedure to sum JSONb values:
CREATE OR REPLACE FUNCTION sum_jsonb_values(data JSONB)
RETURNS NUMERIC AS $$
DECLARE
key_value RECORD;
total_sum NUMERIC := 0;
BEGIN
IF data IS NULL THEN
RETURN NULL;
END IF;
FOR key_value IN
SELECT * FROM jsonb_each(data)
LOOP
total_sum := total_sum + (key_value.value::NUMERIC);
END LOOP;
RETURN total_sum;
END;
$$ LANGUAGE plpgsql
IMMUTABLE ;
Upvotes: 0
Reputation: 4186
If you need to sum the values in a JSONB column in PostgreSQL, you can use a custom aggregate function. I've written a function called that does just that.
create function custom_jsonb_add(a jsonb, b jsonb) returns jsonb as $$
with expoded as (select *
from jsonb_each(a) as t(key, value)
union all
select *
from jsonb_each(b) as t(key, value)),
folded as (select key, sum(value::numeric) as value
from expoded
group by key)
select jsonb_object_agg(key, value)
from folded
$$ language SQL immutable strict;
create aggregate custom_jsonb_add_agg(jsonb) (
sfunc = custom_jsonb_add,
stype = jsonb,
initcond = '{}'
);
Upvotes: 1
Reputation: 3432
I've written a Postgres extension that does exactly that. Once you have it installed you could do:
SELECT jsonb_deep_sum(star_pu) FROM mytable;
Benchmarks are in 4s for 2 million rows, @klin's answer takes 11s
Upvotes: 6
Reputation: 121694
The query should do the job:
select date, json_object_agg(key, val)
from (
select date, key, sum(value::numeric) val
from mytable t, jsonb_each_text(star_pu)
group by date, key
) s
group by date;
The resulting json values will be sorted alphabetically by keys (a side effect of json_object_agg ()
). I do not know whether this matters.
Upvotes: 13
Reputation: 31173
There might be a better way, but at least this works:
WITH
keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key)
SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums
Basically it explodes the jsonbs into rows, gets the names from them, sums them up, aggregates into arrays and creates a jsonb structure. Unfortunately there isn't a jsonb_object()
function so we have to make it into json and then cast to jsonb.
Upvotes: 1