Randomize
Randomize

Reputation: 9103

How to workaround unsupported percentile_cont in Postgres/Citus?

I have a query similar to this:

select
      coalesce(s.import_date, r.import_date) as import_date,
      coalesce(s.bedrooms, r.bedrooms) as bedrooms,
      coalesce(s.ptype, r.ptype) as property_type,
      s.s_price,
      s.s_transactions,
      ....
      r.r_rent,
      ....
from
(
    select

       sc.import_date,
       sc.bedrooms,
       sc.ptype,

       percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by sc.asking_price) filter(where sc.price > 0) as s_price,

       sum(1) filter(where sc.sold_price > 0) as s_transactions,

       ......

       from prices sc
       where sc.ptype = 'F' and  sc.bedrooms = 2 and st_Intersects('010300002.....'::geometry,sc.geom)
       and sc.import_date between '2012-01-01' and '2019-01-01'
       group by sc.import_date, sc.bedrooms, sc.property_type
) s
full join
(
    select
       rc.import_date,
       rc.bedrooms,
       rc.ptype,

       percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by rc.rent) filter(where rc.rent > 0) as r_rent,

      .....

       from rents rc
       where rc.ptype = 'F' and  rc.bedrooms = 2 and st_Intersects('010300002....'::geometry,rc.geom)
       and rc.import_date between '2012-01-01' and '2019-01-01'
       group by rc.import_date, rc.bedrooms, rc.property_type
) r
on r.import_date = s.import_date;

When I run it against my distributed tables on Citus/Postgres-11 I get:

ERROR: unsupported aggregate function percentile_cont

Is there any way to workaround this limitation?

Upvotes: 0

Views: 245

Answers (1)

Hanefi
Hanefi

Reputation: 114

AFAIK there is no easy workaround for this.

You can always pull all the data to the coordinator and calculate percentiles there. It is not advisable to do this in the same query though.

SELECT percentile_cont(array[0.25,0.5,0.75,0.9]) within group (order by r.order_col)
FROM
(
    SELECT order_col, ...
    FROM rents
    WHERE ...
) r
GROUP BY ...

This query will pull all the data returned by the inner subquery to the coordinator and calculate the percentiles in the coordinator.

Upvotes: 1

Related Questions