jorm

Reputation: 131

Why do websites generate random alphanumeric strings for urls instead of using row ids?

Why do many sites (YouTube is a good example) generate a string of random numbers and letters instead of using, for example, the row id?

Usually it's something like this:

bla?v=wli4l73Chc0

instead of something like:

bla?id=83934

Is it just to keep the URL short if you have many rows? Or are there other benefits? I can imagine that bla?id=23934234234 doesn't look so nice.

Thanks and cheers

Upvotes: 13

Views: 5695

Answers (7)

Strelok

Reputation: 51461

They are actually not random strings. Normally they are numbers (usually row IDs) encoded in Base-36 (obviously not always the case, but many sites use it).

Why do they use it? Because a Base-36 encoded number string is shorter than the original.

For example: 1234567890 in Base-36 is kf12oi, which is 6 characters instead of 10 (40% shorter).

See this Wikipedia article. Check the "Uses in practice" section to see who is using it.
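
To make the conversion concrete, here is a minimal Python sketch of Base-36 encoding and decoding. The function names `base36_encode` and `base36_decode` are my own, and this only illustrates the arithmetic; it is not a claim about how any particular site implements it.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase  # 0-9 then a-z: 36 symbols


def base36_encode(number: int) -> str:
    """Encode a non-negative integer as a Base-36 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    digits = []
    while number:
        # Peel off the least significant Base-36 digit each round.
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base36_decode(text: str) -> int:
    """Decode a Base-36 string back to an integer."""
    return int(text, 36)


print(base36_encode(1234567890))  # kf12oi
print(base36_decode("kf12oi"))    # 1234567890
```

Note that this is an encoding, not a secret: anyone who guesses the base can decode the string back to the row ID.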

Upvotes: 9

Don Roby

Reputation: 41137

I upvoted Rob's answer, but I'll also elaborate a bit on one of the risks.

If you publish a link like the one to this question ("Why do websites generate random alphanumeric strings for urls instead of using row ids?"), where 2581510 is a database id, someone trying to hack your site is going to try connecting to https://stackoverflow.com/questions/2581511.

With Stack Overflow, this may not be a database id, and the questions on Stack Overflow are not supposed to be private, so it's not a big deal even if it is.

But if this were a site where restricting data access to owners of the data were important, this potentially risks letting people see data they shouldn't.

There are of course things you can and should do to make it refuse to show the data if they don't own it, but it's still better to make the URL not identify a database id. It's better, as Rob noted, to have a hash into some much larger domain, or a session-based index into a set of data already identified as appropriate to show the user and available only within a logged-in session.
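
As a rough illustration of both points (an unguessable public identifier plus an ownership check), here is a small Python sketch. The `documents` dict stands in for a database table, and `create_document` and `fetch_document` are hypothetical names chosen for the example. I use a random token from the standard `secrets` module rather than a hash, which is one way to get an identifier drawn from a very large domain.

```python
import secrets

# Hypothetical in-memory store standing in for a database table:
# public token -> (owner, payload).
documents = {}


def create_document(owner: str, payload: str) -> str:
    """Store a document under an unguessable public token and return the token."""
    token = secrets.token_urlsafe(16)  # 16 random bytes, 22 URL-safe characters
    documents[token] = (owner, payload)
    return token


def fetch_document(token: str, current_user: str) -> str:
    """Return the payload only if the token exists and the caller owns it."""
    record = documents.get(token)
    if record is None or record[0] != current_user:
        # Same error for "missing" and "not yours", so a probe learns nothing.
        raise PermissionError("not found or not permitted")
    return record[1]
```

With 128 random bits per token, trying "the next id" no longer works, and the ownership check still guards against a token that leaks.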

Upvotes: 4

Dillie-O

Reputation: 29745

Some environments also use this to establish state variables for the session. For example, if you have an ASP.Net app that is using cookieless sessions, you'll find a similar code in the URL.

Upvotes: 0

Earlz

Reputation: 63865

I'm honestly not sure why they wouldn't use the unique ID (or ObjectID, or whatever the database provides). Have you considered that, rather than representing the ID in base 10, they may represent it in a higher base (such as 64, or whatever the URL character set allows) so that the ID is more compact in the query string? (Read: wli4l73Chc0 is some number in a base other than 10.)
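
For what it's worth, here is a quick Python sketch of that idea using the URL-safe base64 alphabet from the standard library. The helper names `encode_id` and `decode_id` are made up for the example, and I am not claiming this is the scheme any real site uses.

```python
import base64


def encode_id(number: int) -> str:
    """Encode a non-negative integer as unpadded URL-safe base64."""
    length = max(1, (number.bit_length() + 7) // 8)  # bytes needed, at least one
    raw = number.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_id(text: str) -> int:
    """Reverse encode_id by restoring the padding and reading the bytes back."""
    padding = "=" * (-len(text) % 4)
    raw = base64.urlsafe_b64decode(text + padding)
    return int.from_bytes(raw, "big")


compact = encode_id(23934234234)
print(len(compact))        # 7
print(decode_id(compact))  # 23934234234
```

The 11-digit id from the question shrinks to 7 characters. Going the other way, an 11-character string over a 64-symbol alphabet has room for up to 66 bits.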

Upvotes: 4

Rob Lachlan

Reputation: 14469

Having raw row ids, or other unmodified database parameters in urls, is bad security practice. Far better to have hashes into some large domain.

Upvotes: 1

Yaroslav

Reputation: 2736

In a distributed environment, it is simpler to generate random numbers for identifiers than sequential numbers, because sequential numbers require every node to coordinate on a shared counter.
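
A small Python sketch of what that looks like, assuming 64-bit identifiers. `new_id` is a hypothetical helper, and the two sets simply simulate two nodes that never talk to each other.

```python
import secrets


def new_id(bits: int = 64) -> int:
    """Generate a random identifier without consulting any other node."""
    return secrets.randbits(bits)


# Two hypothetical nodes each mint ids independently, with no shared counter.
node_a = {new_id() for _ in range(1000)}
node_b = {new_id() for _ in range(1000)}

# Collisions are possible in principle but extremely unlikely at this size.
print(len(node_a | node_b))
```

Random ids make collisions improbable, not impossible, so a uniqueness constraint in the database is still a sensible backstop.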

Upvotes: 6

Alex

Reputation: 3652

I would guess it's to obfuscate information and to increase the amount of information that can be passed via that parameter.

Upvotes: 3
