l0b0
l0b0

Reputation: 58808

How to select only *columns* with more than one distinct value?

SELECT DISTINCT … removes duplicate rows. Is there any way to remove duplicate columns, that is, columns where every value is the same? I'm looking through a legacy schema for the "distinguishing features" of some rows, so this would be super helpful. If it makes it any easier all of the columns are simple types like INT, TEXT and TIMESTAMP. Basically, given a table like this:

CREATE TEMPORARY TABLE column_test
(
    foo TEXT,
    bar INT,
    baz BOOLEAN,
    bat TIMESTAMP WITH TIME ZONE
);

INSERT INTO column_test (
    foo, bar, baz, bat
) VALUES
    ('lorem ipsum', 1, TRUE, '2000-01-01 UTC'),
    ('lorem ipsum', 2, TRUE, '2000-01-01 UTC'),
    ('other', 3, TRUE, '2000-01-01 UTC');

is it possible to write a query which will select only the foo and bar columns, since those are the only columns with more than one value? Something like this:

SELECT columns_with_more_than_one_value(*) FROM column_test;
foo           bar
'lorem ipsum' 1
'lorem ipsum' 2
'other'       3

I guess one way would be to SELECT *, transpose the result, remove any rows where all the values are the same, then transpose again, but that would be very complex and I don't know how I would keep the original column names in the result.

Upvotes: 0

Views: 230

Answers (1)

klin
klin

Reputation: 121604

Basically you cannot select unknown columns from a table. A query result must have a structure defined before it is executed. What you can do is to create a (temporary) view that contains expected columns. The function below does the job, extensively using dynamic SQL. The first argument of the function is a table name, the second - the name of the temporary view to be created.

create or replace function create_view_with_distinct_columns(text, text)
returns void language plpgsql as $$
declare
    col text;
    ct int;
    list text = '';
begin
    for col in 
        execute format('
            select attname
            from pg_attribute
            where attrelid = %s
            and attnum > 0',
            $1::regclass::oid)
    loop
        execute format('
            select count(distinct %I)
            from %I',
            col, $1)
        into ct;
        if ct > 1 then
            list:= format('%s%s,', list, col);
        end if;
    end loop;
    execute format('
        create temp view %I as
        select %s
        from %I',
        $2, left(list, -1), $1);
end $$;

Use:

select create_view_with_distinct_columns('column_test', 'column_view');

select * from column_view;

     foo     | bar
-------------+-----
 lorem ipsum |   1
 lorem ipsum |   2
 other       |   3
(3 rows)    

Db<>fiddle.

Upvotes: 1

Related Questions