pOrinG
pOrinG

Reputation: 935

Validate Oracle Column Names

In one scenario we are dynamically creating sql to create temp tables on-fly. There is no issue with table_name as it is decided by us however the column-names are provided by sources not in our control.

Usually we would check the column names using below query:

select  ..
where NOT REGEXP_LIKE (Column_Name_String,'^([a-zA-Z])[a-zA-Z0-9_]*$') 
OR Column_Name_String is NULL
OR Length(Column_Name_String) > 30

However is there any build in function which can do a more extensive check. Also any input on the above query is welcome as well.

Thanks in advance.


Final query based on below answers:

select  ..
where NOT REGEXP_LIKE (Column_Name_String,'^([a-zA-Z])[a-zA-Z0-9_]{0,29}$') 
OR Column_Name_String is NULL
OR Upper(Column_Name_String) in (select Upper(RESERVED_WORDS.Keyword) from V$RESERVED_WORDS RESERVED_WORDS)

Particularly not happy with character's like $ in column name either hence won't be using..

dbms_assert.simple_sql_name('VALID_NAME')

Instead with regexp I can decide my own set of character's to allow.

Upvotes: 5

Views: 2813

Answers (2)

APC
APC

Reputation: 146239

" is there any build in function which can do a more extensive check."

Oracle has the DBMS_ASSERT.SIMPLE_SQL_NAME() function. This returns the passed name if it meets the Oracle naming rules ...

 select dbms_assert.simple_sql_name('VALID_NAME') from dual;

... and hurls ORA-44003 if the name is invalid.

Valid names permit any characters if the name is double-quoted (yuck, but then so is creating "temp tables on-fly"). Also the function doesn't check the length of the name, so you will still need to validate that yourself.

Find out more in the docs.

Also here is a SQL Fiddle.


"creating a table with comment column is not possible as its a invalid identifier"

Fair point. DBMS_ASSERT is primarily aimed at preventing SQL injection. So it verifies that a value conforms to Oracle's naming rules, not that the value is a valid Oracle name. To catch things like comment you will also need to check the value against V$RESERVED_WORDS, probably where reserved != 'Y'. As this is a V$ view select on it is not granted by default; if you don't have access you'll need to ask your friendly DBA to help out.

" For validating column names I believe I should check with the entire list"

Up to you. The distinction is that some keywords can legitimately be used as identifiers. For instance TYPE only became a reserved word in Oracle version 8 when they introduced the object-relational stuff. But there were a lot of tables and views in existing systems which used 'TYPE' as a column name (not least the Oracle data dictionary). If Oracle had made TYPE a properly reserved word it would have broken all those systems. So the list of reserved words which cannot be used as identifiers is a sub-set of all the Oracle keywords.


Opinions on the general task:

"we are getting data from external sources (files) and the job of the program/script is to push that data to oracle tables."

There are two parts to this task.

The first is that you should have agreed a standard format for these files with the third parties. There should be no need for discovery of the files' structure or content. (Or if there is such a need because the files are randomly sourced from a carousel of third parties probably you should not be using a relational database but something else: Endeca? Python Pandas library?)

The second is the creating tables on the fly. If you have an agreed file structure then you should be loading into standard tables, using either SQL*Loader or external tables according to your circumstances. If you're on 12c maybe SQL*Loader Express Mode could be of interest.

Upvotes: 4

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521339

This answer does not necessarily offer either a performance or logical improvement, but you can actually validate the column names using a single regex:

SELECT ...
WHERE NOT
    REGEXP_LIKE (COALESCE(Column_Name_String, ''), '^([a-zA-Z])[a-zA-Z0-9_]{0,29}$')

This works because:

  • It uses the same pattern to match columns, i.e. starting with a letter and afterwards using only alphanumeric characters and underscore
  • NULL column names are mapped to empty string, which fails the regex
  • We use a length quantifier {0,29} to check the column length directly in the regex

Upvotes: 5

Related Questions