Bastl

Reputation: 3006

DB2 UTF-8 encoding: Umlaut to CHAR(1)?

What does "CHAR(1)" in a UTF-8 encoded DB2 database mean?

Can I insert a special character (e.g. one that takes 2 octets in UTF-8) into a column of CHAR(1)?

Or does CHAR(1) in UTF-8 always mean that it has capacity for one byte / octet, so that inserting an umlaut into it will fail?

I read through this interesting developerWorks article, but it goes too deep for my simple question...

Upvotes: 1

Views: 2488

Answers (1)

data_henrik

Reputation: 17176

It depends. :)

DB2 introduced code units to help with designing string-typed columns whose length is based on the number of characters rather than the number of bytes. The CREATE TABLE statement documentation has an overview of data types and also explains CHAR and VARCHAR. If the number of characters is used, DB2 assumes the worst case, 4 bytes/octets per character, for length computations.
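To make the bytes-versus-characters distinction concrete, here is a small Python sketch (not DB2 code). It counts the same strings in UTF-8 octets and in UTF-32 code units, and shows the 4-octet worst case. The helper names are made up for illustration.

```python
# Illustration only: this models how string lengths are counted,
# it does not call or reproduce anything inside DB2.

def utf8_octets(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

def codeunits32(text: str) -> int:
    """Number of UTF-32 code units, i.e. one per Unicode code point.
    Python's len() on a str counts code points."""
    return len(text)

def worst_case_octets(n_chars: int) -> int:
    """Upper bound for n_chars characters in UTF-8 (4 octets each)."""
    return 4 * n_chars

# "a", a-umlaut, the euro sign, and an emoji need 1, 2, 3 and 4 octets.
for sample in ["a", "\u00e4", "\u20ac", "\U0001F600"]:
    print(repr(sample), utf8_octets(sample), codeunits32(sample))
```

Each sample is a single character (one code unit in UTF-32), yet the octet count ranges from 1 to 4, which is why a character-based length has to budget 4 octets per character.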

The database configuration parameter string_units determines whether, by default, lengths are counted in characters (CODEUNITS32) or in bytes (SYSTEM).

Coming back to your question: If you did not specify anything, inserting a special character that needs 2 octets into a CHAR(1) will likely fail. If CODEUNITS32 was specified, then it will succeed.
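As a rough model of that outcome, the following Python sketch checks whether a value fits into CHAR(n) under either setting. It assumes that SYSTEM means byte counting in a UTF-8 database; `fits_char` is a hypothetical helper, not a DB2 function.

```python
# Illustration only: a simplified model of the length check described above.
# Assumption: with string units SYSTEM, a UTF-8 database counts octets.

def fits_char(value: str, length: int, string_units: str = "SYSTEM") -> bool:
    """Return True if value fits into CHAR(length) under the given units."""
    if string_units == "CODEUNITS32":
        # One code unit per Unicode code point.
        return len(value) <= length
    if string_units == "SYSTEM":
        # Length is measured in UTF-8 octets.
        return len(value.encode("utf-8")) <= length
    raise ValueError(f"unsupported string units: {string_units}")

umlaut = "\u00e4"  # a-umlaut, 2 octets in UTF-8

print(fits_char(umlaut, 1))                 # byte-based CHAR(1)
print(fits_char(umlaut, 1, "CODEUNITS32"))  # character-based CHAR(1)
```

Under this model the umlaut does not fit into a byte-based CHAR(1) but does fit once the column length is counted in CODEUNITS32.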

Upvotes: 5
