Using join prevents index use on where clause?

Question

Is there a way to use both an index for join and a separate index for a different where clause? Whenever I add the join query- it causes the where index to be ignored.

I have Users table (80,000 records)

id	company_id
1	4
2	4
3	5

with btree index on company_id: PRIMARY KEY, btree (id) Index btree on company_id

Comments table (7,000,000 records)

id	user_id
11	2
22	2
33	3

with btree index on user_id:

PRIMARY KEY, btree (id)
Index btree on user_id

If I use a simple where - it uses the company index:


    SELECT users.id
    FROM users
    WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134,
      16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145,
      16146, 16147, 16333)
    
> Index Scan using index_users_on_company_id on users  (cost=0.29..213.88 rows=97 width=4)
  Index Cond: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[]))

But when I add, although it uses user_id index at comments, it stopped using the company_id index (Parallel Seq Scan instead of previously used Index Scan):

SELECT users.id
FROM users INNER JOIN comments ON comments.user_id = users.id
WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134, 16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145, 16146, 16147, 16333)

Gather  (cost=1000.43..9632.88 rows=1410 width=4)
  Workers Planned: 1
  ->  Nested Loop  (cost=0.43..8491.88 rows=829 width=4)
        ->  Parallel Seq Scan on users  (cost=0.00..2742.06 rows=57 width=4)
              Filter: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[]))
        ->  Index Only Scan using index_comments_on_user_id on comments  (cost=0.43..98.21 rows=266 width=4)
              Index Cond: (user_id = users.id)

To summarize:

Is there a way to use both the company_id index and the user_id index in a single query?

Basically substitute the Parallel Seq Scan on users with Index Scan using index_users_on_company_id on users at the second query?

UPDATE:

After setting max_parallel_workers_per_gather to 0, as @Anton Grig suggested, The same query, uses index instead of Parallel Seq Scan It does performs faster. I wonder if there's a less intrusive way..

Nested Loop  (cost=0.72..9998.67 rows=1410 width=4)
  ->  Index Scan using index_users_on_company_id on users  (cost=0.29..213.88 rows=97 width=4)
        Index Cond: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[]))
  ->  Index Only Scan using index_comments_on_user_id on comments  (cost=0.43..98.21 rows=266 width=4)
        Index Cond: (user_id = users.id)

UPDATE2:

I'll add the full query (uses left outer join) I'm running, not a minified version for question

SELECT COUNT(DISTINCT comments.tag_id) FILTER (WHERE comments.archived = FALSE ) AS tags_count
FROM users LEFT OUTER JOIN comments ON comments.user_id = users.id
WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134, 16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145, 16146, 16147, 16333)
GROUP BY users.id

Using join prevents index use on where clause?

Answers (1)

Related Questions