Reputation: 7199
Short version of the question: Is there a way of counting how many bytes it will take to store the characters of a string in a VARCHAR(n) column in an Oracle database?
Longer version: The following Oracle SQL script will fail at the third statement. It will try to insert 10 characters into a VARCHAR(10) column; however, one of these characters is an 'a' with an acute accent.
CREATE TABLE TESTTABLE (NAME VARCHAR(10) NULL);
INSERT INTO TESTTABLE (NAME) VALUES ('abcdefghij');
--1 rows inserted.
INSERT INTO TESTTABLE (NAME) VALUES ('ábcdefghij');
--ORA-12899: value too large for column "ACME"."TESTTABLE"."NAME" (actual: 11, maximum: 10)
My C# application stores strings in an Oracle database, and I can't just change the column type to NVARCHAR2(10), which would work. The application is expected to trim any larger string to a limit of 10 characters, so Oracle won't complain about its length. But a String.Length-based trim is a very naive strategy: it would blindly leave "ábcdefghij" intact with all its 10 CLR characters.
How can I detect that 'á' will take 2 bytes in the database row, so I can trim the string to "ábcdefghi" before issuing the INSERT statement?
EDIT: this question is similar to Best way to shorten UTF8 string based on byte length
Upvotes: 1
Views: 514
Reputation: 231661
By default, VARCHAR2(10) allocates 10 bytes of storage, which may or may not equate to 10 characters depending on the database character set and the data. You can, however, declare the column so that it will always store 10 characters.
When you are using a variable-width character set, you almost always want to declare the column using character-length semantics: VARCHAR2(10 CHAR). Then you can always store 10 characters, regardless of whether some of them require more than 1 byte of storage.
If you're stuck declaring columns using byte-length semantics for some reason, you could use the LENGTHB or VSIZE function to return the length of the string in bytes.
select lengthb('ábcdefghij'), vsize('ábcdefghij')
from dual;
will return 11 for both columns of the result.
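Since the question is about a C# application, one way to use this is to let the database itself report the byte length. A rough sketch, assuming the managed ODP.NET driver (Oracle.ManagedDataAccess.Client); the connection string and the helper name are only illustrative:
using System;
using Oracle.ManagedDataAccess.Client;

// Illustrative helper: asks Oracle for the byte length of a value via LENGTHB.
static int ByteLengthInDatabase(string connectionString, string value)
{
    using (var connection = new OracleConnection(connectionString))
    using (var command = new OracleCommand("SELECT LENGTHB(:val) FROM DUAL", connection))
    {
        command.Parameters.Add(new OracleParameter("val", value));
        connection.Open();
        return Convert.ToInt32(command.ExecuteScalar());
    }
}
For "ábcdefghij" this would come back as 11, matching the query above.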
Upvotes: 4
Reputation: 23727
The Oracle function length(string) returns the number of characters; lengthb(string) returns the number of bytes.
Upvotes: 5
Reputation: 2817
You can get the length of the string in bytes like this:
using System.Text;
// Encode the string as UTF-8 and count the resulting bytes.
UTF8Encoding encoding = new UTF8Encoding();
byte[] utf8Bytes = encoding.GetBytes("ábcdefghij");
int byteLength = utf8Bytes.Length;
In fact, for your example, it returns 11.
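If you also need to trim the string so it fits the byte limit without cutting a multi-byte character in half, a minimal sketch could look like this (assuming the column effectively stores UTF-8, e.g. an AL32UTF8 database character set; the TrimToByteLength helper name is just illustrative):
using System.Text;

// Keep appending whole characters while the UTF-8 byte count stays within the limit.
// Note: characters outside the BMP (surrogate pairs) would need extra handling.
static string TrimToByteLength(string value, int maxBytes)
{
    Encoding encoding = Encoding.UTF8;
    int byteCount = 0;
    int charCount = 0;
    foreach (char c in value)
    {
        int charBytes = encoding.GetByteCount(new[] { c });
        if (byteCount + charBytes > maxBytes)
            break;
        byteCount += charBytes;
        charCount++;
    }
    return value.Substring(0, charCount);
}
With a limit of 10 bytes this returns "ábcdefghi" for the example string, which then fits the VARCHAR(10) column.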
Upvotes: 1
Reputation: 106530
This depends on the encoding you have Oracle set to use. You convert the string instance into a byte array using the corresponding System.Text.Encoding instance, such as System.Text.Encoding.UTF8. (You're looking for the GetBytes method.)
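For example, a rough sketch of mapping the database character set to a .NET Encoding and then checking the byte count (the two character sets shown are just common examples, and the helper names are illustrative):
using System;
using System.Text;

// Illustrative mapping from a couple of common Oracle character sets
// to the matching .NET encodings.
static Encoding EncodingForCharacterSet(string oracleCharacterSet)
{
    switch (oracleCharacterSet)
    {
        case "AL32UTF8": return Encoding.UTF8;
        case "WE8ISO8859P1": return Encoding.GetEncoding("ISO-8859-1");
        default: throw new NotSupportedException(oracleCharacterSet);
    }
}

// Byte length of the value as that character set would store it.
static int StoredByteLength(string value, string oracleCharacterSet)
{
    return EncodingForCharacterSet(oracleCharacterSet).GetByteCount(value);
}
StoredByteLength("ábcdefghij", "AL32UTF8") gives 11, while "WE8ISO8859P1" gives 10, which is why picking the Encoding that matches the database matters.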
Upvotes: 4