Reputation: 132
I'm storing AdWords report data in Postgres. Each report is stored in a table named Reports, which has a jsonb column named 'data'. Each report has json stored in its 'data' field that looks that looks like this:
[
{
match_type: "exact",
search_query: "gm hubcaps",
conversions: 2,
cost: 1.24
},
{
match_type: "broad",
search_query: "gm auto parts",
conversions: 34,
cost: 21.33
},
{
match_type: "phrase",
search_query: "silverdo headlights",
conversions: 63,
cost: 244.05
}
]
What I want to do is query off these data hashes and sum up the total number of conversions for a given report. I've looked though the Postgresql docs and it looks like you can only really do calculations on hashes, not arrays of hashes like this. Is what I'm trying to do possible in postgres? Do I need to make a temp table out of this array and do calculations off that? Or can I use a stored procedure?
I'm using Postgresql 9.4
EDIT The reason I'm not just using a regular, normalized table is that this is just one example of how report data could be structured. In my project, reports have to allow arbitrary keys, because they are populated by users uploading CSV's with any columns they like. It's basically just a way to get around having arbitrarily many, user-created tables.
Upvotes: 2
Views: 1785
Reputation: 657202
What I want to do is query off these data hashes and sum up the conversions
The fastest way should be with jsonb_populate_recordset()
. But you need a registered row type for it.
CREATE TEMP TABLE report_data (
-- match_type text -- commented out, because we only need ..
-- , search_query text -- .. conversions for this query
conversions int
-- , cost numeric
);
A temp table is one way to register a row type ad-hoc. More explanation in this related answer:
Assuming a table report
with report_id
as PK for lack of inforamtion.
SELECT r.report_id, sum(d.conversions) AS sum_conversions
FROM report r
LEFT JOIN LATERAL jsonb_populate_recordset(null::report_data, r.data) d ON true
-- WHERE r.report_id = 12345 -- only for given report?
GROUP BY 1;
The LEFT JOIN
ensures you get a result, even if data
is NULL or empty or the JSON array is empty.
For a sum from a single row in the underlying table, this is faster:
SELECT d.sum_conversions
FROM report r
LEFT JOIN LATERAL (
SELECT sum(conversions) AS sum_conversions
FROM jsonb_populate_recordset(null::report_data, r.data)
) d ON true
WHERE r.report_id = 12345; -- enter report_id here
Alternative with jsonb_array_elements()
(no need for a registered row type):
SELECT d.sum_conversions
FROM report r
LEFT JOIN LATERAL (
SELECT sum((value->>'conversions')::int) AS sum_conversions
FROM jsonb_array_elements(r.data)
) d ON true
WHERE r.report_id = 12345; -- enter report_id here
Normally you would implement this as plain, normalized table. I don't see the benefit of JSON here (except that your application seems to require it, like you added).
Upvotes: 1
Reputation: 2638
You could use unnest
:
select sum(conv) from
(select d->'conversion' as conv from
(select unnest(data) as d from <your table>) all_data
) all_conv
Disclaimer: I don't have Pg 9.2 so I couldn't test it myself.
EDIT: this is assuming that the array you mentioned is a Postgresql array, i.e. that the data type of your data
column is character varying[]
. If you mean the data
is a json array, you should be able to use json_array_elements
instead of unnest
.
Upvotes: 1