miracle_the_V
miracle_the_V

Reputation: 1156

Oracle sql terminator screening (+strange behavior)

I have a query, say this one

SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual; select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual; select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;

and it works fine - splits my row

select * from dual minus select * from dual; select * from dual minus select * from dual;

into two

select * from dual minus select * from dual
select * from dual minus select * from dual

Everything is fine until I add some line brakes, like this

SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual; 
select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual; 
select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;

and here it turns to hell: sql treats ; inside a string like an actual end of the query, ORA-01756 and stuff...


And everything is fine again if I add a random symbol after the ;, this way

SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual;%
select * from dual minus select * from dual;}'
, '[^;]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual;%
select * from dual minus select * from dual;}', '[^;]+', 1, LEVEL)) IS NOT NULL;

Please explain this behavior and suggest a workaround.

UPD: tried this in different IDE (SQL developer instead of PL/SQL developer). No errors. Maybe it's all about encoding...
UPD2: SQLPlus works the same way PL/SQL developer does in this case. SQL developer seems to be a bit 'smarter'. Still, no idea why.

Upvotes: 2

Views: 235

Answers (2)

Jon Heller
Jon Heller

Reputation: 36807

I created the open source project plsql_lexer to solve these kinds of problems.

Splitting can get tricky for complex SQL statements. And after the statements are split you will probably want to know what to do with them, and how to report on them. The procedures STATEMENT_CLASSIFIER.CLASSIFY and STATEMENT_FEEDBACK.GET_FEEDBACK_MESSAGE can help with that task.

Sample code

Here are some samples, starting with your example and adding a few other cases. Each sample splits the string into two statements.

declare
    procedure print_split_strings(p_statements nclob) is
        v_split_statements nclob_table;
    begin
        v_split_statements := statement_splitter.split(p_statements);
        for i in 1 .. v_split_statements.count loop
            dbms_output.put_line('Statement '||i||': '||v_split_statements(i));
        end loop;
    end;
begin
    --This is a simple case.
    print_split_strings('select * from dual minus select * from dual; select * from dual minus select * from dual;');
    --Ignore semicolons in comments.
    print_split_strings('select * from dual a;/* a comment ; */ select * from dual b;');
    --Ignore semicolons in strings.
    print_split_strings(q'{select '''' || q'!'!' from dual a;select * from dual b;}');
    --Ignore semicolons in matching BEGIN/ENDs in PLSQL_DECLARATIONS:
    print_split_strings('with function f return number is begin return 1; end; function g return number is begin return 2; end; select f from dual;select 1 from dual;');
end;
/

Statement 1: select * from dual minus select * from dual;
Statement 2:  select * from dual minus select * from dual;
Statement 1: select * from dual a;
Statement 2: /* a comment ; */ select * from dual b;
Statement 1: select '''' || q'!'!' from dual a;
Statement 2: select * from dual b;
Statement 1: with function f return number is begin return 1; end; function g return number is begin return 2; end; select f from dual;
Statement 2: select 1 from dual;

Upvotes: 1

tbone
tbone

Reputation: 15473

Try:

SELECT to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}'
, '[^;[:cntrl:]]+', 1, LEVEL)) FROM dual
CONNECT BY to_char(regexp_substr(q'{select * from dual minus select * from dual;
select * from dual minus select * from dual;}', '[^;[:cntrl:]]+', 1, LEVEL)) IS NOT NULL;

Output:

select * from dual minus select * from dual
select * from dual minus select * from dual

The reason is the regexp_substr looks for the next pattern you specify, which in your case was originally [^;]+ and this pattern would find the ; and the next char would be the linefeed. The simple solution if you wanted to break up the lines like this is to also exclude control chars in the regex search via [:cntrl:]

Upvotes: 1

Related Questions