Reputation: 23
I have a simple table of the form:
id | gender | a_feature (bool) | b_feature (bool) | ... | xyz_feature (bool) |
---|
and I want to sum over all feature columns dependent on gender.
metric | male | female |
---|---|---|
a_feature | 345 | 3423 |
b_feature | 65 | 143 |
... | ... | ... |
xyz_feature | 133 | 5536 |
Is there a simple way to do this, e.g. using the information_schema.
I found only the solution below, but this is very ugly:
select
'a_feature' as feature_name,
count(case a_feature and gender = 'male') as male,
count(case a_feature and gender = 'female') as female
from table
union
select
b_feature as feature_name,
count(case b_feature and gender = 'male') as male,
count(case b_feature and gender = 'female') as female
from table
.
.
.
select
xyz_feature as feature_name,
count(case xyz_feature and gender = 'male') as male,
count(case xyz_feature and gender = 'female') as female
from table
Upvotes: 1
Views: 2008
Reputation: 3
When you use this in Postgres, what do you mean by t (I think t is for table) in "from t cross join lateral" and what do you mean by v in "v(name, feature)" ?
select name,
sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from t cross join lateral
(values ('a_feature', a_feature),
('b_feature', b_feature),
. . .
) v(name, feature)
group by name;
Upvotes: 0
Reputation: 1269803
You can unpivot and aggregate. One method is:
select name,
sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from ((select 'a_feature' as name, a_feature as feature, gender
from t
) union all
(select 'b_feature' as name, b_feature, gender
from t
) union all
. . .
) f
group by name;
In Postgres, you would unpivot using a lateral join:
select name,
sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from t cross join lateral
(values ('a_feature', a_feature),
('b_feature', b_feature),
. . .
) v(name, feature)
group by name;
You can generate the list for values()
using information_schema.columns
if you are reluctant to type it all in.
EDIT:
You can construct the values
clause using something like this:
select string_agg('(''' || column_name || ''', column_name)', ', ')
from information_schema.columns
where table_name = ?
Upvotes: 2