Reputation: 1156
I have a query, say this one
SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual; select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual; select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;
and it works fine - splits my row
select * from dual minus select * from dual; select * from dual minus select * from dual;
into two
select * from dual minus select * from dual
select * from dual minus select * from dual
Everything is fine until I add some line brakes, like this
SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;
and here it turns to hell: sql treats ;
inside a string like an actual end of the query, ORA-01756 and stuff...
And everything is fine again if I add a random symbol after the ;
, this way
SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual;%
select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual;%
select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;
Please explain this behavior and suggest a workaround.
UPD: tried this in different IDE (SQL developer instead of PL/SQL developer). No errors. Maybe it's all about encoding...
UPD2: SQLPlus works the same way PL/SQL developer does in this case. SQL developer seems to be a bit 'smarter'. Still, no idea why.
Upvotes: 2
Views: 235
Reputation: 36807
I created the open source project plsql_lexer to solve these kinds of problems.
Splitting can get tricky for complex SQL statements. And after the statements are split you will probably want to know what to do with them, and how to report on them. The procedures STATEMENT_CLASSIFIER.CLASSIFY and STATEMENT_FEEDBACK.GET_FEEDBACK_MESSAGE can help with that task.
Sample code
Here are some samples, starting with your example and adding a few other cases. Each sample splits the string into two statements.
declare
procedure print_split_strings(p_statements nclob) is
v_split_statements nclob_table;
begin
v_split_statements := statement_splitter.split(p_statements);
for i in 1 .. v_split_statements.count loop
dbms_output.put_line('Statement '||i||': '||v_split_statements(i));
end loop;
end;
begin
--This is a simple case.
print_split_strings('select * from dual minus select * from dual; select * from dual minus select * from dual;');
--Ignore semicolons in comments.
print_split_strings('select * from dual a;/* a comment ; */ select * from dual b;');
--Ignore semicolons in strings.
print_split_strings(q'{select '''' || q'!'!' from dual a;select * from dual b;}');
--Ignore semicolons in matching BEGIN/ENDs in PLSQL_DECLARATIONS:
print_split_strings('with function f return number is begin return 1; end; function g return number is begin return 2; end; select f from dual;select 1 from dual;');
end;
/
Statement 1: select * from dual minus select * from dual;
Statement 2: select * from dual minus select * from dual;
Statement 1: select * from dual a;
Statement 2: /* a comment ; */ select * from dual b;
Statement 1: select '''' || q'!'!' from dual a;
Statement 2: select * from dual b;
Statement 1: with function f return number is begin return 1; end; function g return number is begin return 2; end; select f from dual;
Statement 2: select 1 from dual;
Upvotes: 1
Reputation: 15473
Try:
SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}'
, '[^;[:cntrl:]]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}', '[^;[:cntrl:]]+', 1, LEVEL)) IS NOT NULL;
Output:
select * from dual minus select * from dual
select * from dual minus select * from dual
The reason is the regexp_substr looks for the next pattern you specify, which in your case was originally [^;]+ and this pattern would find the ; and the next char would be the linefeed. The simple solution if you wanted to break up the lines like this is to also exclude control chars in the regex search via [:cntrl:]
Upvotes: 1