GatesKennedy
GatesKennedy

Reputation: 725

PostgreSQL: COUNT the occurrence of unique array values nested within a JSONB column?

I'm using PostgreSQL 12.2

I have a table of songs that contain traits for filtering. The traits are stored in a jsonb column like..

INSERT INTO songs(title, traitsObj)
VALUES
('song1', '{ "type1":["trait3","trait7","trait2"], "type2":[<...moreTraits>] }'),
('song2', '{ "type1":["trait3","trait9","trait6"], "type2":[<...moreTraits>] }'),
('song3', '{ "type1":["trait4","trait3","trait5"], "type2":[<...moreTraits>] }')
;      

I need a query to return the count of how many times a unique trait exists in the array for each 'type' The result would be something like...

   trait   |  times_referenced
-------------------------------
  trait6  |  2355
  trait3  |  1493
  trait1  |  872
  trait5  |  724
  ...

I had a solution with union before restructuring the db but now have to update this query too. I'm hung up on un-nesting the array values into a temporary table to query but I was thinking something like...

WITH
   tTraits AS (
      SELECT <all values for 'trait1'> AS trait, title FROM songs
   )
SELECT trait, COUNT(*)
FROM tTraits
GROUP BY trait
ORDER BY times_referenced DESC;

Wouldn't be surprised if there's a cleaner way to do this without an intermediate table or WITH statement. Thank you!

Upvotes: 2

Views: 1248

Answers (2)

GatesKennedy
GatesKennedy

Reputation: 725

I found a solution with the help of @Barbaros_Ozhan (I know.. cant tag) mentioning jsonb_array_elements().

Here it is.

WITH tGenus
  AS (
    SELECT jsonb_array_elements(traitsobj->'genusType') AS species
    FROM tFamily
  )
SELECT species, COUNT(species) AS times_referenced
FROM tGenus
GROUP BY species
ORDER BY times_referenced DESC;

Results

        species        | times_referenced
-----------------------+------------------
 "Indie"               |             1260
 "Documentary"         |             1184
 "Drama"               |             1113
 ""                    |              773
 "Business"            |              758
 "Narrative"           |              651
 "LookBook Video"      |              648
 "Adventure"           |              278

Thank you Barbaros

Upvotes: 1

Barbaros &#214;zhan
Barbaros &#214;zhan

Reputation: 65278

You can distinctly list the song titles versus traits bu applying jsonb_array_elements() function, and then count them all :

SELECT traits, COUNT(traits) AS times_referenced
  FROM
   (
     SELECT DISTINCT title, j1.t AS traits
       FROM traits
      CROSS JOIN jsonb_array_elements(traitsObj->>"type1") AS j1(t)
      UNION ALL
     SELECT DISTINCT title, j2.t
       FROM traits
      CROSS JOIN jsonb_array_elements(traitsObj->>"type2") AS j2(t)
    )
 GROUP BY traits

Upvotes: 2

Related Questions