Reputation: 2805
I'm creating a link shortening service and I'm using base64 encoding/decoding of an incremented ID field to create my URLs. A URL with the ID "6" would be: http://mysite.com/Ng==
I also need to allow users to create a custom URL name, like http://mysite.com/music
Here's my (possibly faulty) approach so far. Help in fixing it would be appreciated.
When someone creates a new link:
When someone creates a new link and passes a custom short URL:
Is there a better encoding method that will let me turn any number into a short string, and any string into a number, so I can always look up short URLs (whether custom or autogenerated) by turning the name into a number and querying for a link with an ID equal to that number?
Upvotes: 13
Views: 12424
Reputation: 66681
First and foremost, make sure you have uniqueness constraints in place on the ID and short_url_code columns.
When someone creates a new link:
1. Get a new ID from the database (for performance reasons you should really REALLY use autoincrement or SEQUENCE, depending on what your RDBMS offers; otherwise go ahead and select MAX(ID)+1)
2. Generate the short URL code (the last part of http://website.com/[short url name]) from ID using base64_encode or any other custom or standard encoding scheme
3. Insert into the links table: ID, short_url_code, destination_url
4. If the insert fails because of a constraint violation, go back to step 1 to try a new ID; you may have had a violation because:
   - the ID has already been used (this should never happen with autoincrement or SEQUENCE, and may happen quite often otherwise), and/or
   - short_url_code has already been used as a custom URL (this will happen very seldom unless someone is trying to cause trouble on your site)
5. If the insert succeeded, commit and return the short URL to the user
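The steps above can be sketched roughly as follows. This is only a minimal illustration, assuming SQLite via Python's sqlite3 module; the `link_seq` counter table, the function names, and the `max_attempts` limit are all hypothetical stand-ins (SQLite has no SEQUENCE), not part of the original setup. The optional `custom_code` argument covers the custom-URL case, where a collision on the code is reported back instead of retried.

```python
import base64
import sqlite3


def open_db():
    """Create an in-memory database with the uniqueness constraints in place."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE links (
            id              INTEGER PRIMARY KEY,
            short_url_code  TEXT NOT NULL UNIQUE,
            destination_url TEXT NOT NULL
        );
        -- Hypothetical one-row counter standing in for a SEQUENCE.
        CREATE TABLE link_seq (next_id INTEGER NOT NULL);
        INSERT INTO link_seq (next_id) VALUES (1);
        """
    )
    return conn


def next_id(conn):
    """Hand out the next ID and advance the counter (committed right away,
    so a failed insert later does not give the same ID back)."""
    (value,) = conn.execute("SELECT next_id FROM link_seq").fetchone()
    conn.execute("UPDATE link_seq SET next_id = next_id + 1")
    conn.commit()
    return value


def encode_id(link_id):
    """Base64 of the decimal digits, as in the question ("6" -> "Ng==")."""
    return base64.b64encode(str(link_id).encode("ascii")).decode("ascii")


def create_link(conn, destination_url, custom_code=None, max_attempts=5):
    for _ in range(max_attempts):
        link_id = next_id(conn)                                   # step 1
        code = custom_code if custom_code is not None else encode_id(link_id)  # step 2
        try:
            conn.execute(                                         # step 3
                "INSERT INTO links (id, short_url_code, destination_url) "
                "VALUES (?, ?, ?)",
                (link_id, code, destination_url),
            )
            conn.commit()                                         # step 5
            return code
        except sqlite3.IntegrityError:                            # step 4
            conn.rollback()
            taken = conn.execute(
                "SELECT 1 FROM links WHERE short_url_code = ?", (code,)
            ).fetchone()
            if custom_code is not None and taken:
                raise ValueError("short URL already in use, pick another")
            # Otherwise loop again with a fresh ID.
    raise RuntimeError("could not allocate a short URL")
```

Calling `create_link(conn, url)` returns an autogenerated code, and `create_link(conn, url, custom_code="music")` either stores the custom name or raises `ValueError` if it is taken.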
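When someone creates a new link and passes a custom short URL:
1. Get a new ID from the database, same as step 1 above
2. Instead of generating the short URL code from ID as in step 2 above, use the custom short_url_code provided by the user
3. Insert into the links table, same as step 3 above
4. If the insert fails because of a constraint violation on:
   - ID: go back to step 1 to try a new ID
   - short_url_code: return an error to the user asking them to pick a different custom URL, as the short URL they provided has already been used
Upvotes: 12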
Reputation: 67019
Base64 can be used to make short URLs, but it can also make the URL longer. For instance, the base64_encode of the number 1 is 'MQ==', which is 4 times the size. Base64 turns every 3 input bytes into 4 characters and pads the output with '=' to a multiple of 4 characters, which is not ideal for short URLs.
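To make the size difference concrete, here is a small sketch that compares base64 of the decimal digits with a base62 encoding of the integer itself. Base62 is my own choice of comparison, not something the question or the other answer uses, and the alphabet order and function names below are assumptions.

```python
import base64
import string

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def base62_encode(n):
    """Turn a non-negative integer into a short string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(s):
    """Turn a base62 string back into the integer it encodes."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)  # raises ValueError on bad characters
    return n


def base64_of_decimal(n):
    """Base64 of the decimal digits, which is what base64_encode(1) does."""
    return base64.b64encode(str(n).encode("ascii")).decode("ascii")


for n in (1, 6, 1_000_000):
    print(n, base64_of_decimal(n), base62_encode(n))
```

Note that this only maps numbers to strings and back; an arbitrary custom name like "music" decodes to some number, but nothing guarantees that number is a free ID, so custom names still need their own uniqueness check.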
If size is the most important factor, then you may be able to produce the shortest URLs by relying on internationalization.
Percent-encoding the UTF-8 bytes can make a URI rather long (up to 9 ASCII characters for a single Unicode character in the Basic Multilingual Plane, and 12 for characters beyond it), but the intention is that browsers only need to display the decoded form, and many protocols can send UTF-8 without the %HH escaping.
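As a quick illustration of that expansion, the sketch below percent-encodes a single character with Python's urllib.parse. The sample characters and the helper name are arbitrary choices of mine.

```python
from urllib.parse import quote, unquote


def escaped_length(ch):
    """Length of a single character once its UTF-8 bytes are %HH-escaped."""
    return len(quote(ch, safe=""))


# One visible character, but three UTF-8 bytes, each escaped as %HH.
short_name = "音"
escaped = quote(short_name, safe="")
print(short_name, "->", escaped, f"({len(escaped)} ASCII characters)")

# The escaped form decodes back to the original character.
assert unquote(escaped) == short_name

# Characters outside the Basic Multilingual Plane take four bytes.
print(escaped_length("😀"))
```

So the URL looks one character long when displayed, while the escaped form on the wire is several times longer.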
Keep in mind that browsers work quite well with UTF-8, and Twitter will have no trouble with these URLs.
Upvotes: 2