Ziqi Liu
Ziqi Liu

Reputation: 3171

postgres can we use array operation to perform IN or ANY operator?

Let's say we have a simple table

CREATE TABLE tag (
  "name" VARCHAR(255) NOT NULL,
  PRIMARY KEY (name)
);

and now I want to find those tags whose name in an array ['tag1', 'tag2'], I can do this in a traditional SQL IN or ANY operator, or I can use array operation in postgres,

approach1: array contain operator

select * from tag where ARRAY[name] <@ '{"tag1","tag2"}';

approach2: array overlap operator

select * from tag where ARRAY[name] && '{"tag1","tag2"}';

approach3: ANY operator

select * from tag where name = ANY('{"tag1","tag2"}');

approach4: IN operator

select * from tag where name IN ('tag1', 'tag2');

They all work in postgres, I'm just wondering is there any performance concern here? Or the postgres optimizer is smart enough to always find the best way to do it?

Upvotes: 0

Views: 347

Answers (1)

Blue Star
Blue Star

Reputation: 1952

I encourage you to explore questions like these on your own using the EXPLAIN keyword to see what postgres is doing behind the scenes. Let's compare the EXPLAIN plans of two of these queries:

postgres=# explain select * from tag where ARRAY[name] && '{"tag1","tag2"}';
                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on tag  (cost=0.00..11.88 rows=1 width=516)
   Filter: (ARRAY[name] && '{tag1,tag2}'::character varying[])
(2 rows)


postgres=# explain select * from tag where name = ANY('{"tag1","tag2"}');
                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on tag  (cost=4.31..9.65 rows=2 width=516)
   Recheck Cond: ((name)::text = ANY ('{tag1,tag2}'::text[]))
   ->  Bitmap Index Scan on tag_pkey  (cost=0.00..4.30 rows=2 width=0)
         Index Cond: ((name)::text = ANY ('{tag1,tag2}'::text[]))
(4 rows)

As you can see, the queries with ARRAY[name] are not able to using the index on the name column, because postgres isn't acting on the column directly, but rather on an array created from the column. However, the IN and ANY queries have the same plan, and make use of your index, so you should use one of those.

Upvotes: 2

Related Questions