lukevancleve
lukevancleve

Reputation: 35

Compact constraint for range types in PostgreSQL

I have a field that needs to always have one valid row for each date. From the documentation I see how to prevent overlap via EXCLUDE USING gist. How can I ensure that the data will always be compact i.e. no holes in it?

CREATE TABLE observations (
    vintage daterange,
    value numeric,
    EXCLUDE USING gist (vintage WITH &&)
);

How can I ensure that there is a value for every date? In the actual application there is a first observation, the latest observation has a null as the second field in the daterange.

Upvotes: 3

Views: 388

Answers (1)

Patrick
Patrick

Reputation: 32244

Assuming that you enter data in chronological order, your best bet is probably an INSERT trigger. This because you need not only insert new data, but you also need to close the date range on the most recent entry before the current one.

Solution with daterange

CREATE FUNCTION trf_bi_observations() RETURNS trigger AS $$
DECLARE
  recent_date date;
BEGIN
  -- Find the most recently entered date
  SELECT max(lower(vintage)) INTO recent_date
  FROM observations;

  -- Make sure the new date range makes sense: it has to be after the most recently
  -- entered date range and it has to have an open upper range.
  IF lower(NEW.vintage) <= recent_date OR NOT upper_inf(NEW.vintage) THEN
    RAISE NOTICE 'New vintage must be more recent than latest vintage and have an open upper bound';
    RETURN NULL; -- Fail the insert
  END IF;

  -- Update the most recent record to close the upper bound of vintage
  UPDATE observations
  SET vintage = daterange(recent_date, lower(NEW.vintage)) -- upper bound exclusive
  WHERE lower(vintage) = recent_date;

  -- All done, make the INSERT happen
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_bi_observations
BEFORE INSERT ON observations
FOR EACH ROW EXECUTE PROCEDURE trf_bi_observations();

Note that with this INSERT trigger you do not need the EXCLUDE constraint anymore because the logic of the trigger function doesn't allow overlaps. You do have to consider updates and deletes, however. If you want to disallow deletes on this table, you can simply revoke the privilege or have a BEFORE DELETE trigger that does nothing but RETURN NULL (which will make the DELETE operation fail). You must allow updates (or the insert trigger above will fail), but you can restrict that to changes in the value column and closing the upper bound of the most recent row:

-- OLD upper bound must be open, else disallow change to vintage column
IF upper_inf(OLD.vintage) THEN
  -- Only allow changes in upper bound
  NEW.vintage = daterange(lower(OLD.vintage), upper(NEW.vintage));
ELSE
  NEW.vintage = OLD.vintage -- silently ignore changes
END IF;

Solution without daterange

Given your requirements - no overlaps, no gaps - you can greatly simplify your data model by simply working with a date for the "vintage" column, instead of a daterange. The range can always be constructed from the date of the current row to the date of the following role. This implicitly satisfies both requirements and the trigger function becomes much simpler:

CREATE FUNCTION trf_bi_observations2() RETURNS trigger AS $$
BEGIN
  -- Make sure the new date makes sense: it has to be after the most recently
  -- entered date.
  PERFORM * FROM observations WHERE vintage >= NEW.vintage;
  IF FOUND THEN
    RAISE NOTICE 'New vintage must be more recent than latest vintage';
    RETURN NULL; -- Fail the insert
  END IF;

  -- All done, make the INSERT happen
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_bi_observations2
BEFORE INSERT ON observations
FOR EACH ROW EXECUTE PROCEDURE trf_bi_observations2();

Now you don't have to update the most recent row with the open upper bound, so you can disallow any updates on column "vintage". You can also decide to allow both inserts and updates with any date, such that the date ranges of individual rows change with such events. Up to you. In either case, working with a simple date value is much easier and faster than working with a daterange.

You can still retrieve the data with "vintage" in daterange format with the right SELECT statement:

SELECT daterange(vintage, lead(vintage) OVER (ORDER BY vintage ASC)) AS vintage, value
FROM observations;

The lead() window function will give you the date of the "vintage" column for the next row in the window frame, which is the next date due to the ordering.

Upvotes: 2

Related Questions