user188619
user188619

Reputation:

How do I figure out if a column size is in bytes or characters from the column metadata?

I am working on some software that has to create dummy entries in various databases (Oracle, DB2, SQLServer) temporarily. Every column in the the row is filled with random data.

The code uses java.sql.DataBaseMetaData class in java to get the COLUMN_SIZE attribute to figure out how large of a random string to store in the VARCHAR2 and other string column types.

DatabaseMetaData metadata = connection.getMetaData();
while (resultSet.next()) {
  ResultSet resultSet = metadata.getColumns(...);
  int size = resultSet.getInt("COLUMN_SIZE");
}

The problem is, at least in Oracle, I can't seem to figure out if a column length is being returned in bytes or characters. Also, depending on the encoding, characters are a different number of bytes. Combine all this, and I am getting some SQL errors because the strings that are trying to get inserted are too big. Am I missing something obvious?

Upvotes: 4

Views: 4386

Answers (2)

varad
varad

Reputation: 1626

The CHAR_OCTET_LENGTH works for me. Tested on Oracle 11g, ojdbc11: 21.6.0.0.1

The CHAR_OCTET_LENGTH returns the number of bytes the column can hold. Which is what you need to figure out if your data fits into the column or not.

How to figure our how the column is defined:

  • If CHAR_OCTET_LENGTH == COLUMN_SIZE then the column is defined as Varchar2(COLUMN_SIZE).

  • If CHAR_OCTET_LENGTH > COLUMN_SIZE then the column is defined as Varchar2(COLUMN_SIZE CHAR)

  • The only exceptions is a length of 4000: hard limit of CHAR_OCTET_LENGTH is 4000 therefore you cannot distinguish varchar2(4000) and varchar2(4000 CHAR).

Upvotes: 0

Dave Costa
Dave Costa

Reputation: 48111

I don't see any way to determine this through the JDBC metadata. There is another column called CHAR_OCTET_LENGTH but in my experiment it did not help distinguish between byte and char semantics.

The way to find this out in the Oracle data dictionary is to look at ALL_TAB_COLUMNS.CHAR_USED:

    CHAR_USED   VARCHAR2(1)     
        B | C.
        B indicates that the column uses BYTE length semantics.
        C indicates that the column uses CHAR length semantics.
        NULL indicates the datatype is not any of the following:

        * CHAR
        * VARCHAR2
        * NCHAR
        * NVARCHAR2

So you could check whether your are connected to Oracle and, if so, do a separate query against this view to check the size semantics.

Upvotes: 5

Related Questions