David S
David S

Reputation: 13871

regexp_matches returns NULL inside plpgsql function

Given a filename of:

xxxx/2013-02/csv/Sales_1302040000-1302050000.zip

Can someone explain why regexp_matches returns null in this function:

CREATE OR REPLACE FUNCTION get_import_batch_date(filename text) 
RETURNS DATE AS
$BODY$    
DECLARE
    matches text[];
    result date;
BEGIN

    matches := regexp_matches(filename, E'Sales_(\\d{2})(\\d{2})(\\d{2})');    
    IF matches IS NOT NULL THEN
        result := format('%s-%s-%s', 2000 + matches[1]::int, matches[2], matches[3])::DATE;
        RETURN result;
    END IF;

    RAISE WARNING 'Unable to determine batch date from %', filename;

    RETURN NULL;

END;
$BODY$
  LANGUAGE plpgsql IMMUTABLE;

yet, works in the following anonymous function:

DO language plpgsql $$
DECLARE
    filename text := 'xxxx/2013-02/csv/Sales_1302040000-1302050000.zip';
    matches text[];
    result date;
BEGIN

    matches := regexp_matches(filename, E'Sales_(\\d{2})(\\d{2})(\\d{2})');    
    IF matches IS NOT NULL THEN
        result := format('%s-%s-%s', 2000 + matches[1]::int, matches[2], matches[3])::DATE;
        raise notice '%', result;
    END IF;

END;
$$;      

And the regexp_matches seems to work correctly in this query, but again, the function fails and returns null

SELECT
    regexp_matches('xxxx/2013-02/csv/Sales_1302040000-1302050000.zip', E'Sales_(\\d{2})(\\d{2})(\\d{2})'),
    get_import_batch_date('xxxx/2013-02/csv/Sales_1302040000-1302050000.zip');

Is there a bug in my code that I'm just not seeing (very possible and the most common answer) Or is there something I'm failing to do here?

I'm using PostgreSQL 9.1.6

Just a final note: given this filename, I want the function to return a date value of 2013-02-04

Upvotes: 1

Views: 1913

Answers (3)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 657932

Update:

The problem turned out to be a confusion over pgScript in pgAdmin. @David pressed F6 in the query tool of pgAdmin to run pgScript instead of F5 for running an SQL script. See the comments below.
The function itself is fine.

Simplified function

I can't reproduce your error (tested on Postgres 9.1.6, didn't return NULL), but I can offer you a much simpler version of your function that probably won't fail:

CREATE OR REPLACE FUNCTION get_import_batch_date(filename text, OUT result date)
  AS
$func$    
BEGIN
   result := ('20' || substring(filename, E'Sales_(\\d{6})'))::date;

   IF result IS NULL THEN
      RAISE WARNING 'Unable to determine batch date from %', filename;
   END IF;
END
$func$ LANGUAGE plpgsql IMMUTABLE;
  • Use an OUT parameter to simplify things.

  • No need for the rather complex regexp_matches() expression and the array conversion it entails. A simple substring() call does the job. Prepend 20 and you convert to date tight away. The format matches ISO 8601 date format which is valid in any locale. Your original version relies on that as well, just with added hyphens (-), which are optional.

     `'20130204'::date` works just as well as `'2013-02-04'::date`
    

  • No RETURN needed, the value of the OUT parameter result is returned automatically.

Upvotes: 2

David S
David S

Reputation: 13871

Ok! I finally figured it out. I'm not sure WHY this happens, or what is going on, but I can at least fix it. The answer I'm posting here is actually based on Erwin's answer. His code (as usual) is way better than mine, but this works if anyone else ever has this EXTREMELY frustrating problem in the future.

Basically, I was playing around with it again tonight and it finally caught my eye what was going on. If I take this code:

CREATE OR REPLACE FUNCTION get_import_batch_date(in filename text, out result date) AS
$BODY$
DECLARE
BEGIN
   result := substring(filename, E'Sales_(\\d{6})')::date;
   IF result IS NULL THEN
      RAISE WARNING 'Unable to determine batch date from %', filename;
   END IF;   
END
$BODY$
  LANGUAGE plpgsql IMMUTABLE
  COST 100;

...and hit F6 to "run script", you get the following message back:

[QUERY    ] CREATE OR REPLACE FUNCTION get_import_batch_date(in filename text, out result date) AS
            $BODY$
            DECLARE
            BEGIN
               result := substring(filename, E'Sales_(\d{6})')::date;
               IF result IS NULL THEN
                  RAISE WARNING 'Unable to determine batch date from %', filename;
               END IF;   
            END
            $BODY$
              LANGUAGE plpgsql IMMUTABLE
              COST 100

Can you spot the critical issue? I couldn't last night, but did tonight. It is stripping out one of the "\" on the substring function.

This will cause the match to fail and NULL to be returned.

If you hit F5 or click the "Run" button the function, then it works fine. (which is probably what people were doing or maybe what SQLFiddle is doing (total guess here).

To get F6 to work for me, I had to change the line to:

   result := substring(filename, E'Sales_(\\\d{6})')::date;

So, that's what works for me. This feels like a bug somewhere. But, I don't know where. Maybe @Erwin can shed some light on this.

Upvotes: 0

Steve Chambers
Steve Chambers

Reputation: 39434

Works here too: http://sqlfiddle.com/#!1/d084b/1

Are you sure that was exactly the filename passed into get_import_batch_date?

Upvotes: 1

Related Questions