Reputation: 1213
In my MySQL InnoDB database, I have dirty zip code data that I want to clean up.
The clean zip code data is when I have all 5 digits for a zip code (e.g. "90210").
But for some reason, I noticed in my database that for zipcodes that start with a "0", the 0 has been dropped.
So "Holtsville, New York" with zipcode "00544
" is stored in my database as "544
"
and
"Dedham, MA" with zipcode "02026
" is stored in my database as "2026
".
What SQL can I run to front pad "0" to any zipcode that is not 5 digits in length? Meaning, if the zipcode is 3 digits in length, front pad "00". If the zipcode is 4 digits in length, front pad just "0".
UPDATE:
I just changed the zipcode to be datatype VARCHAR(5)
Upvotes: 99
Views: 133757
Reputation: 131
I know this is well after the OP. One way you can go with that keeps the table storing the zipcode data as an unsigned INT but displayed with zeros is as follows.
select LPAD(cast(zipcode_int as char), 5, '0') as zipcode from table;
While this preserves the original data as INT and can save some space in storage you will be having the server perform the INT to CHAR conversion for you. This can be thrown into a view and the person who needs this data can be directed there vs the table itself.
Upvotes: 5
Reputation: 28172
Store your zipcodes as CHAR(5) instead of a numeric type, or have your application pad it with zeroes when you load it from the DB. A way to do it with PHP using sprintf()
:
echo sprintf("%05d", 205); // prints 00205
echo sprintf("%05d", 1492); // prints 01492
Or you could have MySQL pad it for you with LPAD()
:
SELECT LPAD(zip, 5, '0') as zipcode FROM table;
Here's a way to update and pad all rows:
ALTER TABLE `table` CHANGE `zip` `zip` CHAR(5); #changes type
UPDATE table SET `zip`=LPAD(`zip`, 5, '0'); #pads everything
Upvotes: 228
Reputation: 9372
You need to decide the length of the zip code (which I believe should be 5 characters long). Then you need to tell MySQL to zero-fill the numbers.
Let's suppose your table is called mytable
and the field in question is zipcode
, type smallint
. You need to issue the following query:
ALTER TABLE mytable CHANGE `zipcode` `zipcode`
MEDIUMINT( 5 ) UNSIGNED ZEROFILL NOT NULL;
The advantage of this method is that it leaves your data intact, there's no need to use triggers during data insertion / updates, there's no need to use functions when you SELECT
the data and that you can always remove the extra zeros or increase the field length should you change your mind.
Upvotes: 21
Reputation: 13644
you should use UNSIGNED ZEROFILL
in your table structure.
Upvotes: 3
Reputation: 307
LPAD works with VARCHAR2 as it does not put spaces for left over bytes. LPAD changes leftover/null bytes to zeros on LHS SO datatype should be VARCHAR2
Upvotes: 0
Reputation: 4399
CHAR(5)
or
MEDIUMINT (5) UNSIGNED ZEROFILL
The first takes 5 bytes per zip code.
The second takes only 3 bytes per zip code. The ZEROFILL option is necessary for zip codes with leading zeros.
Upvotes: 3
Reputation: 6669
It would still make sense to create your zip code field as a zerofilled unsigned integer field.
CREATE TABLE xxx (
zipcode INT(5) ZEROFILL UNSIGNED,
...
)
That way mysql takes care of the padding for you.
Upvotes: 3
Reputation: 15679
Ok, so you've switched the column from Number to VARCHAR(5). Now you need to update the zipcode field to be left-padded. The SQL to do that would be:
UPDATE MyTable
SET ZipCode = LPAD( ZipCode, 5, '0' );
This will pad all values in the ZipCode column to 5 characters, adding '0's on the left.
Of course, now that you've got all of your old data fixed, you need to make sure that your any new data is also zero-padded. There are several schools of thought on the correct way to do that:
Handle it in the application's business logic. Advantages: database-independent solution, doesn't involve learning more about the database. Disadvantages: needs to be handled everywhere that writes to the database, in all applications.
Handle it with a stored procedure. Advantages: Stored procedures enforce business rules for all clients. Disadvantages: Stored procedures are more complicated than simple INSERT/UPDATE statements, and not as portable across databases. A bare INSERT/UPDATE can still insert non-zero-padded data.
Handle it with a trigger. Advantages: Will work for Stored Procedures and bare INSERT/UPDATE statements. Disadvantages: Least portable solution. Slowest solution. Triggers can be hard to get right.
In this case, I would handle it at the application level (if at all), and not the database level. After all, not all countries use a 5-digit Zipcode (not even the US -- our zipcodes are actually Zip+4+2: nnnnn-nnnn-nn) and some allow letters as well as digits. Better NOT to try and force a data format and to accept the occasional data error, than to prevent someone from entering the correct value, even though it's format isn't quite what you expected.
Upvotes: 12