Reputation: 22958
In PostgreSQL 9.5 I have a table with 67000 records:
# \d words_nouns
Table "public.words_nouns"
Column | Type | Modifiers
---------+--------------------------+-----------
word | text | not null
hashed | text | not null
added | timestamp with time zone |
removed | timestamp with time zone |
Indexes:
"words_nouns_pkey" PRIMARY KEY, btree (word)
Check constraints:
"words_nouns_word_check" CHECK (word ~ '^[A-Z]{2,}$'::text)
And a similar table words_verbs
with 36000 records.
Is it a good idea to define the following custom function:
CREATE OR REPLACE FUNCTION words_get_added(
in_visited integer,
OUT out_json jsonb
) RETURNS jsonb AS
$func$
DECLARE
_added text[];
BEGIN
-- create array with words added to dictionary since in_visited timestamp
IF in_visited > 0 THEN
_added := (
SELECT ARRAY_AGG(hashed)
FROM words_nouns
WHERE EXTRACT(EPOCH FROM added) > in_visited
UNION
SELECT ARRAY_AGG(hashed)
FROM words_verbs
WHERE EXTRACT(EPOCH FROM added) > in_visited
);
IF CARDINALITY(_added) > 0 THEN
out_json := jsonb_build_object('added', _added);
END IF;
END IF;
END
$func$ LANGUAGE plpgsql;
or should I better convert in_visited
to a timestamp with timezone and compare to that:
CREATE OR REPLACE FUNCTION words_get_added(
in_visited integer,
OUT out_json jsonb
) RETURNS jsonb AS
$func$
DECLARE
_added text[];
BEGIN
-- create array with words added to dictionary since in_visited timestamp
IF in_visited > 0 THEN
_added := (
SELECT ARRAY_AGG(hashed)
FROM words_nouns
WHERE added > TO_TIMESTAMP(in_visited)
UNION
SELECT ARRAY_AGG(hashed)
FROM words_verbs
WHERE added > TO_TIMESTAMP(in_visited)
);
IF CARDINALITY(_added) > 0 THEN
out_json := jsonb_build_object('added', _added);
END IF;
END IF;
END
$func$ LANGUAGE plpgsql;
Below are 2 EXPLAIN
outputs, but I am not sure how to interpret them:
# EXPLAIN SELECT ARRAY_AGG(hashed)
FROM words_nouns
WHERE EXTRACT(EPOCH FROM added) > 0
UNION
SELECT ARRAY_AGG(hashed)
FROM words_verbs
WHERE EXTRACT(EPOCH FROM added) > 0;
QUERY PLAN
---------------------------------------------------------------------------------------------
Unique (cost=2707.03..2707.04 rows=2 width=32)
-> Sort (cost=2707.03..2707.03 rows=2 width=32)
Sort Key: (array_agg(words_nouns.hashed))
-> Append (cost=1740.53..2707.02 rows=2 width=32)
-> Aggregate (cost=1740.53..1740.54 rows=1 width=32)
-> Seq Scan on words_nouns (cost=0.00..1684.66 rows=22348 width=32)
Filter: (date_part('epoch'::text, added) > '0'::double precision)
-> Aggregate (cost=966.45..966.46 rows=1 width=32)
-> Seq Scan on words_verbs (cost=0.00..936.05 rows=12157 width=32)
Filter: (date_part('epoch'::text, added) > '0'::double precision)
(10 rows)
# EXPLAIN SELECT ARRAY_AGG(hashed)
FROM words_nouns
WHERE added > to_timestamp(0)
UNION
SELECT ARRAY_AGG(hashed)
FROM words_verbs
WHERE added > to_timestamp(0);
QUERY PLAN
------------------------------------------------------------------------------------------------
Unique (cost=2361.99..2362.00 rows=2 width=32)
-> Sort (cost=2361.99..2361.99 rows=2 width=32)
Sort Key: (array_agg(words_nouns.hashed))
-> Append (cost=1517.06..2361.98 rows=2 width=32)
-> Aggregate (cost=1517.06..1517.07 rows=1 width=32)
-> Seq Scan on words_nouns (cost=0.00..1517.05 rows=1 width=32)
Filter: (added > '1970-01-01 01:00:00+01'::timestamp with time zone)
-> Aggregate (cost=844.88..844.89 rows=1 width=32)
-> Seq Scan on words_verbs (cost=0.00..844.88 rows=1 width=32)
Filter: (added > '1970-01-01 01:00:00+01'::timestamp with time zone)
(10 rows)
The question is: which of the 2 stored functions has better performance or do they not differ?
Upvotes: 1
Views: 665
Reputation: 656251
The key factor for performance is an index that matches your query. Typically, you'd have an index on the plain column added
, which should be matched with an input parameter of the same type, where the index is applicable.
For the task at hand, combine a plain index on added
with your second function (added > TO_TIMESTAMP(in_visited)
) - or a variation on the theme. The functional transformation happens before the value is compared to the column added
, thus the expression is "sargable".
For top read performance you might have a multicolumn index on (added, hashed)
, and keep your table vacuumed to allow index-only scans ...
Related:
Upvotes: 2