Kurt Kline
Kurt Kline

Reputation: 2069

GROUP BY contents of json type column in Postgresql

I have a Postgresql table with a json column named "food".

Here is an example of some rows:

food
["cheese", "salmon", "eggs"]
["salmon", "cheese", "eggs"]
["broccoli", "ham", "milk"]
["salmon", "cheese", "eggs", "pizza"]

Current result:

food                                       count
["cheese", "salmon", "eggs"]              | 1
["salmon", "cheese", "eggs"]              | 1
["broccoli", "ham", "milk"]               | 1
["salmon", "cheese", "eggs", "pizza"]     | 1

Desired result:

food                                       count
["cheese", "salmon", "eggs"]              | 2
["broccoli", "ham", "milk"]               | 1
["salmon", "cheese", "eggs", "pizza"]     | 1

Is there a way to GROUP BY the contents of a json field without regard to the order of the elements? If two rows have the same contents, then I want them to be grouped together.

My plan was to GROUP BY json_array_elements(food), but for some reason this only returns the first element of each row.

Upvotes: 1

Views: 142

Answers (2)

Ancoron
Ancoron

Reputation: 2733

Actually similar to the answer of @Scoots, but no sorts, windows, aso:

SELECT (
    SELECT jsonb_agg(items order by items)
    FROM jsonb_array_elements(food) AS items
    ) AS food,
    count(*)
FROM test_json_grouping
GROUP BY 1;

...explained:

                                              QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1635.60..1890.60 rows=200 width=40)
   Group Key: (SubPlan 1)
   ->  Seq Scan on test_json_grouping  (cost=0.00..1629.25 rows=1270 width=32)
         SubPlan 1
           ->  Aggregate  (cost=1.25..1.26 rows=1 width=32)
                 ->  Function Scan on jsonb_array_elements items  (cost=0.00..1.00 rows=100 width=32)
(6 rows)

Result:

                 food                  | count 
---------------------------------------+-------
 ["cheese", "eggs", "salmon"]          |     2
 ["broccoli", "ham", "milk"]           |     1
 ["cheese", "eggs", "pizza", "salmon"] |     1
(3 rows)

Upvotes: 1

Scoots
Scoots

Reputation: 3102

Not directly - to Postgres they're different strings.

However what we can do is unpack these json strings through json_array_elements, then repack them with our own sorting applied with json_agg. They're then homogenized for your group by to work.

Here's a query illustrating exactly what I mean:

select
    __food.food::text,
    count(1)
from (
    select
        json_agg(_unpack.food order by _unpack.food::text asc) as food
    from (
        select
            row_number() over(),
            json_array_elements(food) as food
        from
            YOUR_SCHEMA.YOUR_FOOD_TABLE
    ) as _unpack
    group by
        _unpack.row_number
) as __food
group by
    __food.food::text

Upvotes: 0

Related Questions