Reputation: 25
I have a textarea element on a page, and its content is saved to my database when the save button I created is clicked. I want to create short URLs with hash ids like "MySite.com/laHquq", using the unique id (primary key) of the table row that stores the textarea content together with http://www.hashids.org/, which will "Generate short hashes from numbers (like YouTube and Bitly)." The goal is to obfuscate that primary key.
After the information is saved to my database, a script will place the hash id (the obfuscated primary key) after the forward slash at the end of my domain name, so the address bar will show "MySite.com/laHquq". This simply indicates that the saved textarea content can be seen again by visiting the site with that hash id after the forward slash.
I will also have a script with a self-invoking function that reads the URL from the address bar each time the page loads, checks for a hash id after the forward slash, and then uses that hash id to find the right record in the db and display it in the textarea on the page. Will using hashids (http://www.hashids.org/) help prevent hash collisions?
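For illustration, the page-load step described above could be sketched as follows. This is a minimal sketch, not a definitive implementation: `extractHashId` is a hypothetical helper name, and the assumption that a hash id is a single alphanumeric path segment is mine.

```javascript
// Hypothetical helper: pull a candidate hash id out of a URL path.
// Returns null when the path is not exactly one alphanumeric segment.
function extractHashId(pathname) {
  // Accept "/laHquq" or "/laHquq/"; reject "/" and nested paths.
  const match = /^\/([A-Za-z0-9]+)\/?$/.exec(pathname);
  return match ? match[1] : null;
}

// In a browser this would typically be called with window.location.pathname.
console.log(extractHashId("/laHquq")); // logs laHquq
console.log(extractHashId("/"));       // logs null
```

The returned string would then be sent to the server, which decodes it back to the primary key before querying. Writing the id into the address bar after a save can be done in the browser with `history.pushState(null, "", "/" + hashId)`.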
Upvotes: 2
Views: 1639
Reputation: 1
Found a collision.
'main' => [
    'salt' => 'KorvpalliSuuruneTennisePall666',
    'length' => '8',
    'alphabet' => 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890',
],
Hashids::encode(250) returns DoGxYxgJ
Hashids::encode(294) returns DoGxYxgJ
Upvotes: -1
Reputation: 21
Showing empirical results with PHP here. We tested with a minimum length of 5 characters, using the salt and alphabet shown below:
$hashids = new Hashids\Hashids('this is my salt', 5, 'BCDFGHJKLMNPQRSTVWXYZ0123456789');
We ran a process 24x7 for a week, looping to fill rows in a MySQL DB and hashing each PK starting from 1, in a table like this:
create table hashids (
    id int NOT NULL AUTO_INCREMENT primary key,
    hash varchar(255)
);
We put a UNIQUE index on the hash column, watched for ON DUPLICATE KEY during the script, and ran a SELECT DISTINCT at the end of the process as a sanity check.
We stopped the process at:
select count(*) from hashids;
+-----------+
| count(*) |
+-----------+
| 355325777 |
+-----------+
Then we decided to start from close to the upper limit of signed BIGINT:
ID: 9223372036854775000-> HASH: RQ0ZPNPPPZ6Q7RNV
ID: 9223372036854775329-> HASH: YN2K8Y888K7NW6VY
ID: 9223372036854775654-> HASH: 2MQ0474440VM8QMY
ID: 9223372036854775777-> HASH: 7L25R7RRR5ZL820W
ID: 9223372036854775805-> HASH: 020WV7VVVWX250YM
ID: 9223372036854775807-> HASH: QVMZYRYYYZXVLM0W
In both cases, after running for days and filling 15 GB of ids, Hashids held up.
We found no collisions so far.
This test is well beyond the limits of our application, so we considered Hashids safe for us to use. Of course, as in maths, an empirical result does not prove the law.
Also bear in mind that Hashids has an upper limit, at least with PHP, before you reach the signed/unsigned BIGINT MySQL DB limit or PHP_INT_MAX.
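The same kind of brute-force uniqueness check can be sketched in memory for small ranges. This is only an illustration under stated assumptions: `findCollision` and `standInEncode` are hypothetical names, and `standInEncode` is a plain base-36 stand-in, not Hashids. To test the real library, pass its encode function in place of the stand-in.

```javascript
// Stand-in encoder (NOT Hashids): base-36 text, left-padded to 5 characters.
function standInEncode(id) {
  return id.toString(36).padStart(5, "0");
}

// Encode every id in [start, end] and return the first pair of ids that
// share an output, or null when every output in the range is unique.
function findCollision(encode, start, end) {
  const seen = new Map(); // encoded string -> id that produced it
  for (let id = start; id <= end; id++) {
    const code = encode(id);
    if (seen.has(code)) {
      return { first: seen.get(code), second: id, code: code };
    }
    seen.set(code, id);
  }
  return null;
}

console.log(findCollision(standInEncode, 1, 100000)); // logs null
```

An in-memory Map plays the role of the UNIQUE index here. For ranges in the hundreds of millions, as in the test above, a database table remains the more practical choice.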
Upvotes: 0
Reputation: 113994
From the documentation, it looks like you'll never face collisions with hashids. That's because it's not a hash. It's a cipher, an encryption algorithm: a really weak one, but good enough to generate ids that look like hashes.
One key clue is that there's a decrypt function. Real hashes, the ones that can collide, cannot be decrypted into a single value because there are multiple values (usually infinitely many) that generate the same hash.
In some ways it's similar to base64 encoding but with a character set chosen to be URL friendly (no + or /).
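To make the reversibility argument concrete, here is a minimal sketch of a reversible id encoding. It is not the Hashids algorithm (Hashids also shuffles its alphabet using the salt); it is a plain base-62 conversion, and `encodeId` and `decodeId` are hypothetical names. The point is only that when every output decodes back to exactly one input, two different inputs cannot share an output.

```javascript
// URL-friendly alphabet: letters and digits only, no "+" or "/".
const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
const BASE = ALPHABET.length; // 62

// Convert a non-negative safe integer to base-62 text.
function encodeId(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  let out = "";
  do {
    out = ALPHABET[id % BASE] + out;
    id = Math.floor(id / BASE);
  } while (id > 0);
  return out;
}

// Convert base-62 text back to the integer it was produced from.
function decodeId(text) {
  let id = 0;
  for (const ch of text) {
    const digit = ALPHABET.indexOf(ch);
    if (digit < 0) throw new Error("unexpected character: " + ch);
    id = id * BASE + digit;
  }
  return id;
}

// Round trip: decoding an encoded id gives the original id back.
console.log(decodeId(encodeId(250)) === 250); // logs true
console.log(encodeId(250) !== encodeId(294)); // logs true
```

Unlike Hashids, this sketch does nothing to obscure the id, since consecutive numbers produce consecutive-looking strings.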
Upvotes: 5