Reputation: 1258
I've run into an issue where PostgreSQL (13.1 on Windows, default config) starts preferring a generic plan after 5 executions, even though the generic plan's estimate is clearly wrong. The trouble is that the custom plan is up to 10 times faster than the generic plan, and at least 2.5 times faster in the "worst" case. I don't want to set plan_cache_mode = force_custom_plan globally, because the query is part of software run in parallel with connection pooling over several threads, alongside dozens of smaller and larger queries, and it would feel like a hacky workaround that might also be dangerous for other queries.
I think I've tracked it down to the generic plan's estimate being wrong for the x.id = ANY($1) parts of the WHERE clause, of which I've got three.
Query goes like this:
SELECT ... FROM ... WHERE ... x.id = ANY($1) AND y.id = ANY($2) AND z.id = ANY($3) ...
Of course, this is hugely simplified, but I know that 99% of the time I pass an int8[] of 50 entries as the first parameter (the query backs a paged view of records, where one page has 50 records). That is correctly estimated as 50 rows in the custom query plan, but as 10 rows in the generic query plan.
Switching to IN ($1, $2, ... $50) fixes this, but we're currently trying to move our IN clauses over to ANY, because it's more efficient over JDBC and we've been bitten by the parameter limit a few times with IN (which can't happen with ANY, since the whole array is a single parameter). IN would also make the number of parameters variable, so the query planner would see a different query quite often (only $1 is almost always an array of 50 values on our prod system; the other parameters may hold fewer or more values depending on lots of factors).
So far I've tried a few things, without much success. Note that I do know why a generic plan is used after the 5th execution: its cost estimate is lower than the average estimate of the 5 previous custom plans. What I don't know is how to fix that wrong estimate.
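Here's roughly how the behaviour can be reproduced in isolation (a sketch; the sample table and its contents are placeholders, not our real schema):
CREATE TABLE sample (id bigint PRIMARY KEY, payload text);
INSERT INTO sample SELECT i, 'row ' || i FROM generate_series(1, 100000) i;
ANALYZE sample;
PREPARE page(bigint[]) AS SELECT * FROM sample WHERE id = ANY($1);
-- Execute this six times: the first five are planned with the actual array
-- (rows=50); from the sixth on, the cached generic plan (rows=10) typically
-- takes over, because its estimated cost undercuts the average custom plan cost.
EXPLAIN (ANALYZE) EXECUTE page('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}');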
Here are the relevant rows of the custom plan:
->  Bitmap Heap Scan on sbuilding b  (cost=150.51..332.59 rows=50 width=29) (actual time=0.077..0.187 rows=50 loops=2)
      Recheck Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}'::bigint[]))
      Filter: (deletedat IS NULL)
      Heap Blocks: exact=50
      ->  Bitmap Index Scan on sbuilding_pkey  (cost=0.00..150.50 rows=50 width=0) (actual time=0.065..0.066 rows=50 loops=2)
            Index Cond: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50}'::bigint[]))
And the same part for the generic plan:
->  Index Scan using sbuilding_pkey on sbuilding b  (cost=0.28..82.84 rows=10 width=29) (actual time=0.049..0.229 rows=50 loops=2)
      Index Cond: (id = ANY ($2))
      Filter: (deletedat IS NULL)
I've replaced the actual ids with the numbers 1-50, but the real ids all exist, as they are retrieved from the database in a step before this one.
I'd prefer not to post the full query or query plans, as they contain a lot of sensitive information (hundreds of words and values I would have to replace), but I hope these excerpts are enough to show the issue, and that one of you will be able to assist.
Edit: I don't have any array extension like intarray installed explicitly. I found a mailing list thread about estimates with the intarray extension, so I thought I'd add this; I'm not sure whether it still applies. Here's the link: https://www.postgresql.org/message-id/20150317191531.GE10492%40momjian.us
Thanks a lot!
Edit2: I just retested with a super simple query, and it seems that the query planner always assumes 10 values for = ANY(:arrayParam) in generic plans. This seems to be the culprit in my case. Is there any way to "fix" that assumption? Preferably without installing any extensions that need config changes, like pg_hint_plan. Thx!
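Here's roughly what that retest looked like (a sketch reusing the placeholder sample table from above); forcing the generic plan makes the guess visible on the first execution:
SET plan_cache_mode = force_generic_plan;  -- session level, just for testing
PREPARE p(bigint[]) AS SELECT count(*) FROM sample WHERE id = ANY($1);
-- The index scan node reports rows=10 regardless of how many elements
-- the array actually contains.
EXPLAIN EXECUTE p('{1,2,3}');
RESET plan_cache_mode;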
Edit3: I found the following in the PostgreSQL source code, in selfuncs.c:
    /*
     * Arbitrarily assume 10 elements in the eventual array value (see
     * also estimate_array_length).  We don't risk an assumption of
     * disjoint probabilities here.
     */
    for (i = 0; i < 10; i++)
    {
        if (useOr)
            s1 = s1 + s2 - s1 * s2;
        else
            s1 = s1 * s2;
    }
/*
 * Estimate number of elements in the array yielded by an expression.
 *
 * It's important that this agree with scalararraysel.
 */
int
estimate_array_length(Node *arrayexpr)
{
    /* look through any binary-compatible relabeling of arrayexpr */
    arrayexpr = strip_array_coercion(arrayexpr);

    if (arrayexpr && IsA(arrayexpr, Const))
    {
        Datum       arraydatum = ((Const *) arrayexpr)->constvalue;
        bool        arrayisnull = ((Const *) arrayexpr)->constisnull;
        ArrayType  *arrayval;

        if (arrayisnull)
            return 0;
        arrayval = DatumGetArrayTypeP(arraydatum);
        return ArrayGetNItems(ARR_NDIM(arrayval), ARR_DIMS(arrayval));
    }
    else if (arrayexpr && IsA(arrayexpr, ArrayExpr) &&
             !((ArrayExpr *) arrayexpr)->multidims)
    {
        return list_length(((ArrayExpr *) arrayexpr)->elements);
    }
    else
    {
        /* default guess --- see also scalararraysel */
        return 10;
    }
}
I'll see where I can get from there, but I still hope someone has a solution that doesn't involve me going through the whole PostgreSQL source code ;)
Upvotes: 4
Views: 1492
Reputation: 1206
If you do not want to go for the solution of setting plan_cache_mode that Laurenz Albe suggested, I have this suggestion: add an additional condition to your prepared statement as a literal that changes with every execution, so that every statement text is unique. The epoch with milliseconds could be used, for instance:
SELECT ... FROM ... WHERE ... x.id = ANY($1) AND y.id = ANY($2) AND z.id = ANY($3) ... AND 1678193956123=1678193956123
SELECT ... FROM ... WHERE ... x.id = ANY($1) AND y.id = ANY($2) AND z.id = ANY($3) ... AND 1678193974456=1678193974456
SELECT ... FROM ... WHERE ... x.id = ANY($1) AND y.id = ANY($2) AND z.id = ANY($3) ... AND 1678193987789=1678193987789
Or you could use a hash of your input params etc.
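The trick works because each distinct statement text gets its own plan cache entry, so no single entry ever reaches the five executions after which a generic plan can take over. In practice the client would compute the changing value itself; as a sketch, a millisecond epoch can also be produced like this:
-- Millisecond epoch to splice into the statement text as a literal
-- (any value that is unique per execution works just as well).
SELECT (extract(epoch FROM clock_timestamp()) * 1000)::bigint;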
Upvotes: 0
Reputation: 247790
That can be done by changing plan_cache_mode for this query:
BEGIN;
SET LOCAL plan_cache_mode = force_custom_plan;
SELECT /* your query */
COMMIT;
That will make the optimizer always use a custom plan.
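Since SET LOCAL lasts only until the end of the transaction, the setting cannot leak to other queries running over the same pooled connection. A quick check (assuming the default setting, auto):
BEGIN;
SET LOCAL plan_cache_mode = force_custom_plan;
SHOW plan_cache_mode;  -- force_custom_plan
COMMIT;
SHOW plan_cache_mode;  -- back to auto: the override ended with the transaction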
Upvotes: 2