ma11hew28
ma11hew28

Reputation: 126457

PostgreSQL: Inner join on a specific value

  1. Is it better (mainly performance-wise) to do:

    SELECT u.email, s.id
    FROM users s, sessions s
    WHERE u.id = s.user_id
    AND u.id = 1
    

    or:

    SELECT u.email, s.id
    FROM users s, sessions s
    WHERE u.id = 1
    AND s.user_id = 1
    
  2. What about:

    DELETE FROM users u
    USING sessions s
    WHERE u.id = s.user_id
    AND u.id = 1
    AND s.id = 2
    RETURNING TRUE;
    

    or:

    DELETE FROM users u
    USING sessions s
    WHERE u.id = 1
    AND s.user_id = 1
    AND s.id = 2
    RETURNING TRUE;
    

Table Schema:

CREATE EXTENSION "uuid-ossp";

CREATE TABLE users (
    id bigserial PRIMARY KEY,
    email varchar(254) NOT NULL,
    created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE UNIQUE INDEX on users (lower(email));

CREATE TABLE sessions (
    user_id bigint PRIMARY KEY REFERENCES users ON DELETE CASCADE,
    id uuid NOT NULL DEFAULT uuid_generate_v4(),
    created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
);

Upvotes: 1

Views: 725

Answers (1)

khampson
khampson

Reputation: 15316

Whenever you specify a particular value instead of a potential range of values, the query that has the constant value should be faster (although with an index, it might not be hugely so).

You can do an EXPLAIN ANALYZE on both to see the estimates vs. actual.

I believe since the first part of the query is looking to see if the two columns are equal, it will need to compare against them all, and then further filter by the second part of the query.

If you know both values in advance, I think not only is it likely to perform better, but it also reads better in code to specify both explicitly.

Edit: Note: For a small amount of data, the query plans appear to pretty much be the same:

Nested Loop (cost=0.30..16.34 rows=1 width=532) -> Index Scan using users_pkey on users u (cost=0.14..8.16 rows=1 width=524) Index Cond: (id = 1) -> Index Scan using sessions_pkey on sessions s (cost=0.15..8.17 rows=1 width=24) Index Cond: (user_id = 1)

Nested Loop (cost=0.30..16.34 rows=1 width=532) -> Index Scan using users_pkey on users u (cost=0.14..8.16 rows=1 width=516) Index Cond: (id = 1) -> Index Scan using sessions_pkey on sessions s (cost=0.15..8.17 rows=1 width=16) Index Cond: (user_id = 1)

SQL Fiddle: http://sqlfiddle.com/#!15/c9ac7/4

However, you should try it with your dataset to confirm that the query planner chooses the same plans in both cases with your data.

Upvotes: 2

Related Questions