Reputation: 10665
I am setting up a very small MySQL database that stores, first name, last name, email and phone number and am struggling to find the 'perfect' datatype for each field. I know there is no such thing as a perfect answer, but there must be some sort of common convention for commonly used fields such as these. For instance, I have determined that an unformatted US phone number is too big to be stored as an unsigned int, it must be at least a bigint.
Because I am sure other people would probably find this useful, I dont want to restrict my question to just the fields I mentioned above.
What datatypes are appropriate for common database fields? Fields like phone number, email and address?
Upvotes: 122
Views: 130321
Reputation: 20881
Use: INT(11)
.
MySQL indexes will be able to parse through an int list fastest.
Use: BINARY(x)
, or BLOB(x)
.
You can store security tokens, etc., as hex directly in BINARY(x) or BLOB(x). To retrieve from binary
-type, use SELECT HEX(field)...
or SELECT ... WHERE field = UNHEX("ABCD....")
.
Use: DATETIME
, DATE
, or TIME
.
Always use DATETIME
if you need to store both date and time (instead of a pair of fields), as a DATETIME
indexing is more amenable to date-comparisons in MySQL.
Use: BIT(1)
(MySQL 8-only.) Otherwise, use BOOLEAN(1)
.
BOOLEAN
is actually just an alias of TINYINT(1)
, which actually stores 0 to 255 (not exactly a true/false, is it?).
Use: INT(11)
.
VARCHAR or other types of fields won't work with the SUM()
, etc., functions.
Use: TEXT.
Max limit is 65,535.
Use: MEDIUMTEXT.
Max limit is 16,777,215.
Use: LONGTEXT.
Max limit is 4,294,967,295.
Use : VARCHAR(255)
.
UTF-8 characters can take up three characters per visible character, and some cultures do not distinguish firstname and lastname. Additionally, cultures may have disagreements about which name is first and which name is last. You should name these fields Person.GivenName
and Person.FamilyName
.
Use : VARCHAR(256)
.
The definition of an e-mail path is set in RFC821 in 1982. The maximum limit of an e-mail was set by RFC2821 in 2001, and these limits were kept unchanged by RFC5321 in 2008. (See the section: 4.5.3.1. Size Limits and Minimums.) RFC3696, published 2004, mistakenly cites the email address limit as 320
characters, but this was an "info-only" RFC that explicitly "defines no standards" according to its intro, so disregard it.
Use: VARCHAR(255)
.
You never know when the phone number will be in the form of "1800...", or "1-800", or "1-(800)", or if it will end with "ext. 42", or "ask for susan".
Use: VARCHAR(10)
.
You'll get data like 12345
or 12345-6789
. Use validation to cleanse this input.
Use: VARCHAR(2000)
.
Official standards support URL's much longer than this, but few modern browsers support URL's over 2,000 characters. See this SO answer: What is the maximum length of a URL in different browsers?
Use: DECIMAL(11,2)
.
It goes up to 11.
Upvotes: 5
Reputation: 2568
Here are some common datatypes I use (I am not much of a pro though):
| Column | Data type | Note
| ---------------- | ------------- | -------------------------------------
| id | INTEGER | AUTO_INCREMENT, UNSIGNED |
| uuid | CHAR(36) | or CHAR(16) binary |
| title | VARCHAR(255) | |
| full name | VARCHAR(70) | |
| gender | TINYINT | UNSIGNED |
| description | TINYTEXT | often may not be enough, use TEXT
instead
| post body | TEXT | |
| email | VARCHAR(255) | |
| url | VARCHAR(2083) | MySQL version < 5.0.3 - use TEXT |
| salt | CHAR(x) | randomly generated string, usually of
fixed length (x)
| digest (md5) | CHAR(32) | |
| phone number | VARCHAR(20) | |
| US zip code | CHAR(5) | Use CHAR(10) if you store extended
codes
| US/Canada p.code | CHAR(6) | |
| file path | VARCHAR(255) | |
| 5-star rating | DECIMAL(3,2) | UNSIGNED |
| price | DECIMAL(10,2) | UNSIGNED |
| date (creation) | DATE/DATETIME | usually displayed as initial date of
a post |
| date (tracking) | TIMESTAMP | can be used for tracking changes in a
post |
| tags, categories | TINYTEXT | comma separated values * |
| status | TINYINT(1) | 1 – published, 0 – unpublished, … You
can also use ENUM for human-readable
values
| json data | JSON | or LONGTEXT
Upvotes: 57
Reputation: 9136
Someone's going to post a much better answer than this, but just wanted to make the point that personally I would never store a phone number in any kind of integer field, mainly because:
In general though, I seem to almost exclusively use:
Of course there are exceptions, but I find that covers most eventualities.
Upvotes: 81
Reputation: 11
I am doing about the same thing, and here's what I did.
I used separate tables for name, address, email, and numbers, each with a NameID column that is a foreign key on everything except the Name table, on which it is the primary clustered key. I used MainName and FirstName instead of LastName and FirstName to allow for business entries as well as personal entries, but you may not have a need for that.
The NameID column gets to be a smallint in all the tables because I'm fairly certain I won't make more than 32000 entries. Almost everything else is varchar(n) ranging from 20 to 200, depending on what you wanna store (Birthdays, comments, emails, really long names). That is really dependent on what kind of stuff you're storing.
The Numbers table is where I deviate from that. I set it up to have five columns labeled NameID, Phone#, CountryCode, Extension, and PhoneType. I already discussed NameID. Phone# is varchar(12) with a check constraint looking something like this: CHECK (Phone# like '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'). This ensures that only what I want makes it into the database and the data stays very consistent. The extension and country codes I called nullable smallints, but those could be varchar if you wanted to. PhoneType is varchar(20) and is not nullable.
Hope this helps!
Upvotes: 1
Reputation: 546055
Since you're going to be dealing with data of a variable length (names, email addresses), then you'd be wanting to use VARCHAR. The amount of space taken up by a VARCHAR field is [field length]
+ 1 bytes, up to max length 255, so I wouldn't worry too much about trying to find a perfect size. Take a look at what you'd imagine might be the longest length might be, then double it and set that as your VARCHAR limit. That said...:
I generally set email fields to be VARCHAR(100) - i haven't come up with a problem from that yet. Names I set to VARCHAR(50).
As the others have said, phone numbers and zip/postal codes are not actually numeric values, they're strings containing the digits 0-9 (and sometimes more!), and therefore you should treat them as a string. VARCHAR(20) should be well sufficient.
Note that if you were to store phone numbers as integers, many systems will assume that a number starting with 0 is an octal (base 8) number! Therefore, the perfectly valid phone number "0731602412" would get put into your database as the decimal number "124192010"!!
Upvotes: 9
Reputation: 30555
In my experience, first name/last name fields should be at least 48 characters -- there are names from some countries such as Malaysia or India that are very long in their full form.
Phone numbers and postcodes you should always treat as text, not numbers. The normal reason given is that there are postcodes that begin with 0, and in some countries, phone numbers can also begin with 0. But the real reason is that they aren't numbers -- they're identifiers that happen to be made up of numerical digits (and that's ignoring countries like Canada that have letters in their postcodes). So store them in a text field.
In MySQL you can use VARCHAR fields for this type of information. Whilst it sounds lazy, it means you don't have to be too concerned about the right minimum size.
Upvotes: 16