user1487861
user1487861

Reputation: 472

Position of the lowest value greater than x in ordered postgresql array (optimization)

Looking at the postgres function array_position(anyarray, anyelement [, int])

My problem is similar, but I'm looking for the position of the first value in an array that is greater than an element. I'm running this on small arrays, but really large tables.

This works:

CREATE OR REPLACE FUNCTION arr_pos_min(anyarray,real)
  RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'select array_position($1,(SELECT min(i) FROM unnest($1) i where i>$2))';

the array_position takes advantage of the fact that my array is ordered, but the second part doesn't. And I feel like the second part could potentially just return the position without having to re-query.

My arrays are only 100 elements long, but I have to run this millions of times and so looking for a performance pickup.

Suggestions appreciated.

Upvotes: 0

Views: 335

Answers (2)

user330315
user330315

Reputation:

This seems to be a bit faster

CREATE OR REPLACE FUNCTION arr_pos_min(p_input anyarray, p_to_check real)
  RETURNS int 
AS
$$
  select t.idx
  from unnest(p_input) with ordinality as t(i, idx) 
  where t.i > p_to_check
  order by t.idx
  limit 1
$$
LANGUAGE sql 
IMMUTABLE 
PARALLEL SAFE 
;

The above will use the fact that the values in the array are already sorted. Sorting by the array index is therefor quite fast. I am not sure if unnest() is guaranteed in this context to return the elements in the order they are stored in the array. If that was the case, you could remove the order by and make it even faster.

Upvotes: 1

Laurenz Albe
Laurenz Albe

Reputation: 246848

I don't think that there is a more efficient solution than yours, except if you write a dedicated C function for that.

Storing large arrays is often a good recipe for bad performance.

Upvotes: 0

Related Questions