Reputation: 646
Ok, here is the layout:
I have a bunch of uuid
data that is in varchar
format. I know uuid
is its own type. This is how I got the data. So to verify this which ones are uuid
, I take the uuid
in type varchar
and insert it into a table where the column is uuid
. If the insert fails, then it is not a uuid
type. My basic question is how to delete the bad uuid
if the insert fails. Or, how do I delete out of one table if an insert fails in another table.
My first set of data:
drop table if exists temp1;
drop sequence if exists temp1_id_seq;
CREATE temp table temp1 (id serial, some_value varchar);
INSERT INTO temp1(some_value)
SELECT split_part(name,':',2) FROM branding_resource WHERE name LIKE '%curric%';
create temp table temp2 (id serial, other_value uuid);
CREATE OR REPLACE function verify_uuid() returns varchar AS $$
DECLARE uu RECORD;
BEGIN
FOR uu IN select * from temp1
LOOP
EXECUTE 'INSERT INTO temp2 values ('||uu.id||','''|| uu.some_value||''')';
END LOOP;
END;
$$
LANGUAGE 'plpgsql' ;
select verify_uuid();
When I run this, I get the error
ERROR: invalid input syntax for uuid:
which is what I expect. There are some bad uuids in my data set.
My research led me to Trapping Errors - Exceptions with UPDATE
/INSERT
in the docs.
Narrowing down to the important part:
BEGIN
FOR uu IN select * from temp1
LOOP
begin
EXECUTE 'INSERT INTO temp2 values ('||uu.id||','''|| uu.some_value||''')';
return;
exception when ??? then delete from temp1 where some_value = uu.some_value;
end;
END LOOP;
END;
I do not know what to put instead of ???
. I think it relates to the ERROR: invalid input syntax for uuid:
, but I am not sure. I am actually not even sure if this is the right way to go about this?
Upvotes: 2
Views: 981
Reputation: 657022
@Craig shows a way to identify the SQLSTATE
.
You an also use pgAdmin, which shows the SQLSTATE
by default:
SELECT some_value::uuid FROM temp1
> ERROR: invalid input syntax for uuid: "-a0eebc999c0b4ef8bb6d6bb9bd380a11"
> SQL state: 22P02
I am going to address the bigger question:
I am actually not even sure if this is the right way to go about this?
Your basic approach is the right way: the 'parking in new york' method
(quoting Merlin Moncure in this thread on pgsql-general). But the procedure is needlessly expensive. Probably much faster:
You should be able to weed out the lion's share of violating strings with a much cheaper regexp test.
Postgres accepts a couple of different formats for UUID in text
representation, but as far as I can tell, this character class should covers all valid characters:
'[^A-Fa-f0-9{}-]'
You can probably narrow it down further for your particular brand of UUID representation (Only lower case? No curly braces? No hyphen?).
CREATE TEMP TABLE temp1 (id serial, some_value text);
INSERT INTO temp1 (some_value)
SELECT split_part(name,':',2)
FROM branding_resource
WHERE name LIKE '%curric%'
AND split_part(name,':',2) !~ '[^A-Fa-f0-9{}-]';
"Does not contain illegal characters."
Instead of filling another table, it should be much cheaper to just delete (the now few!) violating rows:
CREATE OR REPLACE function f_kill_bad_uuid()
RETURNS void AS
$func$
DECLARE
rec record;
BEGIN
FOR rec IN
SELECT * FROM temp1
LOOP
BEGIN
PERFORM rec.some_value::uuid; -- no dynamic SQL needed
-- do not RETURN! Keep looping.
RAISE NOTICE 'Good: %', rec.some_value; -- only for demo
EXCEPTION WHEN invalid_text_representation THEN
RAISE NOTICE 'Bad: %', rec.some_value; -- only for demo
DELETE FROM temp1 WHERE some_value = rec.some_value;
END;
END LOOP;
END
$func$ LANGUAGE plpgsql;
No need for dynamic SQL. Just cast. Use PERFORM
, since we are not interested in the result. We just want to see if the cast goes through or not.
Not return value. You could count and return the number of excluded rows ...
For a one-time operation you could also use a DO
statement.
And do not quote the language name
. It's an identifier, not a string.'plpgsql'
Upvotes: 1
Reputation: 324521
You can get the SQLSTATE code from psql
using VERBOSE
mode, e.g:
regress=> \set VERBOSITY verbose
regress=> SELECT 'fred'::uuid;
ERROR: 22P02: invalid input syntax for uuid: "fred"
LINE 1: SELECT 'fred'::uuid;
^
LOCATION: string_to_uuid, uuid.c:129
Here we can see that the SQLSTATE
is 22P02
. You can use that directly in the exception clause, but it's generally more readable to look it up in the manual to find the text representation. Here, we see that 22P02
is invalid_text_representation
.
So you can write exception when invalid_text_representation then
...
Upvotes: 2