How to get the count of duplicate value for all columns of a table

Question

I have a table like this,

dept_no | employee_id 
1       | 001
1       | 002
2       | 003
2       | 004

I want to get values like this:

field_name | count_of_distinct_value
dept_no    | 2
employee_id| 4

I know how to get the count of distinct values for a single field, but I don't know how to get it for all columns at once. How can I do that?

Abelisto · Accepted Answer

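-- Turn each row into (key, value) pairs, one per column,
-- then count the distinct values per column name.
select key, count(distinct value)
from (select (jsonb_each(to_jsonb(t.*))).* from pg_class as t) as tt
group by key;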

It is definitely not the most efficient solution, but it works for any table. Just replace pg_class with the desired table name.
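For illustration, here is a sketch of the same idea applied to the sample data from the question. The table name employees is hypothetical (the question does not name the table), and the subquery is written as a LATERAL join, which should be equivalent here. Note that a SQL NULL becomes a JSON null inside the jsonb value, so it would be counted as one more distinct value.

```sql
-- Hypothetical table matching the question's sample data.
CREATE TEMP TABLE employees (dept_no int, employee_id text);
INSERT INTO employees VALUES (1, '001'), (1, '002'), (2, '003'), (2, '004');

-- One (key, value) row per column per table row, then count per column name.
SELECT kv.key                   AS field_name,
       count(DISTINCT kv.value) AS count_of_distinct_value
FROM employees AS t
CROSS JOIN LATERAL jsonb_each(to_jsonb(t)) AS kv(key, value)
GROUP BY kv.key;

-- For the four sample rows: dept_no -> 2, employee_id -> 4
-- (row order is not guaranteed without ORDER BY).
```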

PS: It pains me to propose this solution. Imagine a table with 100M rows and 100 columns: PostgreSQL then has to build and sort an intermediate data set of 10,000,000,000 rows.
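If the column list is known in advance, one possible alternative is to name the columns explicitly, which avoids expanding every row into one row per column. This is only a sketch using the hypothetical table name employees and the columns from the question; whether it is actually faster depends on the data and the planner.

```sql
-- Hypothetical table "employees"; one count(DISTINCT ...) per known column.
-- Returns a single row with one result column per table column.
SELECT count(DISTINCT dept_no)     AS dept_no,
       count(DISTINCT employee_id) AS employee_id
FROM employees;
```

The result comes back as one wide row rather than one row per field, so it would need to be unpivoted on the client side if the field_name | count layout is required.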

If you don't need exact numbers but only estimates, look at the pg_stats view, in particular its n_distinct column.
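A minimal sketch of reading those estimates, assuming a regular table named employees in the public schema (both names are hypothetical). The statistics are only populated after ANALYZE (or autovacuum's automatic analyze) has run on the table.

```sql
-- Refresh the planner statistics for the (hypothetical) table.
ANALYZE employees;

-- n_distinct > 0: estimated number of distinct values.
-- n_distinct < 0: negated ratio of distinct values to row count
--                 (-1 means every value is estimated to be unique).
SELECT attname AS field_name, n_distinct
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'employees';
```

To turn a negative n_distinct into an approximate count, multiply its absolute value by the table's row count estimate (for example reltuples from pg_class).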
