Payel Senapati

Reputation: 1356

Column declared as NVARCHAR gets created as VARCHAR in MySQL. Both VARCHAR and NVARCHAR declarations can store non-Latin characters

I am unable to create a column of the NVARCHAR data type in MySQL.

I have the following query -

CREATE TABLE table1 ( column1 NVARCHAR(10) );

This is supposed to create column1 with data type NVARCHAR(10). But the query -

DESCRIBE table1;

gives me the output -

+---------+-------------+------+-----+---------+-------+
| Field   | Type        | Null | Key | Default | Extra |
+---------+-------------+------+-----+---------+-------+
| column1 | varchar(10) | YES  |     | NULL    |       |
+---------+-------------+------+-----+---------+-------+

Thus column1 gets created as VARCHAR(10) instead of NVARCHAR(10).

Now, only the NVARCHAR data type is supposed to store non-Latin characters.

But the query -

INSERT INTO table1 VALUES ("भारत");

runs successfully without any error. Here "भारत" is a Hindi word in Devanagari script, pronounced "Bharat", which translates to "India".

The query -

SELECT * FROM table1;

displays the row as expected -

+--------------+
| column1      |
+--------------+
| भारत         |
+--------------+

I guess that MySQL may treat VARCHAR internally as NVARCHAR, but I can't find any documentation stating so.

The following is a link to the MySQL developer documentation -

https://dev.mysql.com/doc/refman/8.0/en/charset-national.html

Here it says that NVARCHAR is fully supported.

To find out whether non-Latin characters can be stored in a column defined as VARCHAR, I ran the following queries -

CREATE TABLE table2 ( column2 VARCHAR(10) );
DESCRIBE table2;

This gives me the output -

+---------+-------------+------+-----+---------+-------+
| Field   | Type        | Null | Key | Default | Extra |
+---------+-------------+------+-----+---------+-------+
| column2 | varchar(10) | YES  |     | NULL    |       |
+---------+-------------+------+-----+---------+-------+

Here column2 gets created as VARCHAR(10), as expected.

The query -

INSERT INTO table2 VALUES ("भारत");

runs without any error, and the query -

SELECT * FROM table2;

gives the expected output -

+--------------+
| column2      |
+--------------+
| भारत         |
+--------------+

Thus even if I declare column2 as VARCHAR(10), I can successfully store non-Latin characters (here, Devanagari characters of the Hindi language).

The most logical conclusion is that, regardless of whether a column is declared as VARCHAR or NVARCHAR, MySQL always stores it internally as NVARCHAR. But I can't find any documentation on this.

The following Stack Overflow question comes closest to mine -

Issue Converting varchar to nvarchar mysql

But no answer has been provided to that question.

I am using Ubuntu 20.04 and MySQL version 8.0.26.

Upvotes: 2

Views: 1829

Answers (2)

Rick James

Reputation: 142298

Get in the habit of using SHOW CREATE TABLE instead of DESCRIBE. It would have answered your question.

mysql> CREATE TABLE nv ( column1 NVARCHAR(10) );
Query OK, 0 rows affected, 1 warning (0.05 sec)

mysql> show warnings;
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                                           |
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 3720 | NATIONAL/NCHAR/NVARCHAR implies the character set UTF8MB3, which will be replaced by UTF8MB4 in a future release. Please consider using CHAR(x) CHARACTER SET UTF8MB4 in order to be unambiguous. |
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SHOW CREATE TABLE nv\G
*************************** 1. row ***************************
       Table: nv
Create Table: CREATE TABLE `nv` (
  `column1` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci
1 row in set (0.00 sec)

The warning hints at an important problem, should you ever try to store Emoji or some Chinese characters in the column: for those, utf8mb4 is needed.

So, you should say

CREATE TABLE nv ( column1 VARCHAR(10) CHARACTER SET utf8mb4 );

That is, don't use NVARCHAR; use VARCHAR and specify the appropriate character set.

utf8 happens to be OK for Devanagari, as in your example.
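
A rough way to see why: MySQL's utf8 (utf8mb3) stores at most 3 bytes per character, while utf8mb4 stores up to 4. The Python sketch below is only an illustration and runs outside MySQL. The helper names max_utf8_bytes and fits_utf8mb3 are made up for this example and are not part of any MySQL API.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes needed by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes (the utf8mb3 limit)."""
    return max_utf8_bytes(text) <= 3


# Sample strings: the Hindi word from the question, a BMP Chinese word,
# an emoji, and a supplementary-plane CJK ideograph (U+20000).
samples = {
    "Devanagari": "भारत",
    "Chinese (BMP)": "中国",
    "Emoji": "\U0001F600",
    "Chinese (supplementary)": "\U00020000",
}

for label, text in samples.items():
    print(label, max_utf8_bytes(text), fits_utf8mb3(text))
```

Anything for which fits_utf8mb3 returns False would need a utf8mb4 column.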

Upvotes: 1

nbk

Reputation: 49375

What you can store in a column is determined by its character set and collation.

Both columns use a UTF-8 character set (the default, utf8mb4, uses up to 4 bytes per character), so both can store Hindi, Chinese, or Kiswahili text,

but

CREATE TABLE table1 ( column1 NVARCHAR(10),column2 VARCHAR(10) );

is actually treated slightly differently:

CREATE TABLE `table1` (
  `column1` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `column2` varchar(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

In the sample database, the default is

DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

But the national varchar uses the predefined character set, as the standard defines:

CHARACTER SET utf8 COLLATE utf8_general_ci

For your Hindi word "भारत" it makes no difference, but for some characters there can be "problems".
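
As far as I understand, the characters that cause such "problems" are the ones outside the Basic Multilingual Plane (code points above U+FFFF), because they need 4 bytes in UTF-8. Here is a small Python sketch, run outside MySQL, that lists those characters in a string before you insert it. The function name supplementary_chars is hypothetical.

```python
import unicodedata


def supplementary_chars(text):
    """List (code point, Unicode name) for every character above U+FFFF."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 0xFFFF
    ]


# The Hindi word alone, then the same word followed by an emoji (U+1F600).
print(supplementary_chars("भारत"))
print(supplementary_chars("भारत \U0001F600"))
```

An empty list means the string should fit in a utf8 (utf8mb3) column; any entry would need utf8mb4.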

Upvotes: 1
