Dave Jarvis
Dave Jarvis

Reputation: 31171

Date arithmetic using integer values

Problem

String concatenation is slowing down a query:

date(extract(YEAR FROM m.taken)||'-1-1') d1,
date(extract(YEAR FROM m.taken)||'-1-31') d2

This is realized in code as part of a string, which follows (where the p_ variables are integers, provided as input by end users):

date(extract(YEAR FROM m.taken)||''-'||p_month1||'-'||p_day1||''') d1,
date(extract(YEAR FROM m.taken)||''-'||p_month2||'-'||p_day2||''') d2

This part of the query runs in 3.2 seconds with the dates, and 1.5 seconds without, leading me to believe there is ample room for improvement.

The query's total run time is under 10 seconds; am looking to bring the entire query down to about 2 or 3 seconds. A hardware upgrade has already happened. ;-)

Version

PostgreSQL 8.4.4.

Question

What is a better way to create the date (presumably without concatenation)?

Update

This looks promising: PGTYPESdate_mdyjul

Many thanks!

Upvotes: 2

Views: 428

Answers (3)

rfusca
rfusca

Reputation: 7705

Wow. I'm surprised, but using the functions from this page - specifically the one to build a date value from three integers - which really do nothing more expose the internal C date functions, really is a lot faster. Benchmarking for me show that creating the dates that way was much faster.

First one is the implementation of the "dateserial" function:

postgres=# select to_date(a,1,3) 
postgres-# from generate_series(100,1000000) as v(a);

Time: 1365.851 ms

postgres=# select (a::text||'-01-03')::date from 
postgres-# generate_series(100,1000000) as v(a);

Time: 3454.224 ms

Full Solution

Edit dateserial.c:

#include "postgres.h"
#include "utils/date.h"
#include "utils/nabstime.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum dateserial(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1 (dateserial);

Datum
dateserial(PG_FUNCTION_ARGS) {
  int32 p_year = PG_GETARG_INT32(0);
  int32 p_month = PG_GETARG_INT32(1);
  int32 p_day = PG_GETARG_INT32(2);

  PG_RETURN_DATEADT( date2j( p_year, p_month, p_day ) - POSTGRES_EPOCH_JDATE );
}

Edit Makefile:

MODULES = dateserial
PGXS := $(shell pg_config --pgxs)
include $(PGXS)

Edit inst.sh (optional):

#!/bin/bash

make clean && make && strip *.so && make install && /etc/init.d/postgresql-8.4 restart

Run bash inst.sh.

Create a SQL function dateserial:

CREATE OR REPLACE FUNCTION dateserial(integer, integer, integer)
  RETURNS date AS
'$libdir/dateserial', 'dateserial'
  LANGUAGE 'c' IMMUTABLE STRICT
  COST 1;
ALTER FUNCTION dateserial(integer, integer, integer) OWNER TO postgres;

Test the function:

SELECT dateserial( 2007::int, 5, 5 )

Upvotes: 1

leonbloy
leonbloy

Reputation: 75916

Sadly, I dont think there is other way to build a date without texts concatenation.

Yes, frankly, I dislike the aproach Postgresql have here. It seems that most date manipulation must be made by extracting date fields as integers, casting them as text, appending them to more texts to create a textual representation of a date, and then telling postgres to parse that text as date... This smells bad to me, I instictively feel that building a date by parsing a string should be only done from textual inputs. But, I think, postgresql ties too strongly the data types handling with their textual representations. And so, for example, if I want to build a date from three integer values (D,M,Y) I MUST (if I'm not mistaken) build a string and make PG parse it. I feel so unclean doing this...

Rant aside, I doubt that this can slow down much your performance.

Upvotes: 2

John
John

Reputation: 15286

Another alternative would be to create a function index on the concatenation. This works in more general cases where there isn't a better data type available.

Upvotes: 0

Related Questions