Christopher Causer
Christopher Causer

Reputation: 1494

Postgresql: Comparing two arrays but including the order

I wish to compare two arrays in a Postgres query, returning true when the first array is embedded within the second array. The smaller array can occur at any point within the larger one. It's probably best shown with an example. For the following, *cmp* is the magical operator that I'm hoping to find.

{b,c}   *cmp* {a,b,b,c,d} -- true
{b,d}   *cmp* {a,b,b,c,d} -- false
{a,b,b} *cmp* {a,b,b,c,d} -- true
{a,b}   *cmp* {a,b,b,c,d} -- true
{a,b,c} *cmp* {a,b,b,c,d} -- false

I know of the <@ operator, which is a good start, but does not take into account the order of elements.

   {b,d} <@ {a,b,b,c,d} -- true, but I want false

I have in my code a workaround which is quite ugly (perl's DBD::Pg uses '?' as a placeholder)

array_values::text similar to '%({|,)' || ? || '(,|})%'

Seems to work, but I'd love to be able to use an index here. It will also fall over whenever quotes are used in the text representation, but fortunately that won't happen for my use case. Am I missing a trick?

EDIT

I probably should have made better examples. Here are some more

{bb,c}   *cmp* {a,b,bb,c,d} -- true
{b,c}    *cmp* {a,b,bb,c,d} -- false
{a,b,bb} *cmp* {a,b,bb,c,d} -- true
{a,b,b}  *cmp* {a,b,bb,c,d} -- false
{c,d}    *cmp* {a,b,bb,c,d} -- true

Upvotes: 3

Views: 655

Answers (2)

Jeremy
Jeremy

Reputation: 6723

You can do this without comparing text versions. I'm not sure if the performance will be better though. Basically, check using the @> operator as a fast-fail (hopefully) and then look for the first item of the array in the test array. Grab a slice starting from that position and see if it's the same as the test array.

CREATE TABLE test (a text[]);
INSERT INTO test VALUES ('{bb,c}'), ('{b,c}'), ('{a,b,bb}'), ('{a,b,b}'), ('{c,d}');
SELECT a, 
       '{a, b, bb, c, d, b}' @> a AND (
         SELECT bool_or(
               ('{a, b, bb, c, d, b}'::text[])[x:(x+array_length(a,1) - 1)] = a
               ) 
         FROM unnest(array_positions('{a, b, bb, c, d, b}', a[1])) as pos(x)
       )
FROM test;
    a     | ?column?
----------+----------
 {bb,c}   | t
 {b,c}    | f
 {a,b,bb} | t
 {a,b,b}  | f
 {c,d}    | t

I added an extra 'b' to the test array so array_positions would return more than one result for the second test.

Upvotes: 2

jjanes
jjanes

Reputation: 44202

Use @> to get the index, then recheck for the order using your current method.

array_values::text similar to '%({|,)' || $1 || '(,|})%' and array_values @> ('{'||$1||'}')::text[]

How fast this will be will depend on how many rows have all the right values but not in the right order.

I use $1 rather than ? so that you don't have to specify the same parameter twice from your perl.

Upvotes: 0

Related Questions