Reputation: 339
I am needing to better understand user text input into win applications / asp.net applications, and what the appropriate field sizes should be for the data stored in SQLServer.
If everything were ASCII, it seems like this would be simple (1 byte for each char), but I guess I really don't understand what is going on when a user puts text into an input field. If the input is in UniCode then there are (generally) 2 bytes per character (?) and if I know a text input can not be cany longer than 5 characters, then should the SQL column be varchar(10)??? How do I know if an input should be in ANSI or Unicode??
Hopefully this makes sense. This is something that I have never fully understood in terms of how a web page or a win app determines how the data is encoded.
Upvotes: 0
Views: 115
Reputation: 6791
SQL columns are not set by the byte size but the character size. A column of varchar(10)
will accept 10 ccharacters. if you going to be taking Unicode input it is best to set nvarchar(10)
this will still take 10 characters but it will allow all Unicode input to that column. The same goes for ntext text nchar char
. A good MSDN page to understand SQL data types can be found at http://technet.microsoft.com/en-us/library/ms187752.aspx. As for what is going into your text boxes on the ASP.NET site, anything can be inputed into that text box, it is up to you via code to enforce the rules of what you want inputed.
Upvotes: 2
Reputation: 280644
When you create a column, you specify the number of characters you need to store, regardless of whether it is Unicode or not. Need up to 5 characters? Then it's either VARCHAR(5)
or NVARCHAR(5)
, depending on whether you actually need Unicode or not - that's a business discussion, not a technical one. The 2 bytes has nothing to do with the column definition - that's about storage size. So a VARCHAR(5)
will take 5 bytes if fully populated, and an NVARCHAR(5)
will take 10 bytes if fully populated. You don't have to worry about those implementation details when defining the column; however you should be sure that Unicode is required before making the choice, because doubling the space requirement for no reason is wasteful.
(Ignoring arguments about whether such a column should be CHAR/NCHAR
, null byte overhead, etc.)
Upvotes: 4