Dolph

Reputation: 50710

Appropriate character encoding / collation to store URLs?

My web application stores URL segments in a database. These URL segments are based on user-submitted content.

What collation should I use for character strings that will appear in URLs?

My assumption is ascii_general_ci (?), based on this question: "Which characters make a URL invalid?"

Upvotes: 6

Views: 3590

Answers (2)

Ben

Reputation: 1660

I would argue that case sensitivity matters, since you don't want duplicate content served from both /home and /Home. These are two separate pages, but a MySQL query against a column with a _ci collation (SELECT * FROM page WHERE url = '/Home') would return the page regardless of case.
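To make the difference concrete, here is a minimal sketch that uses SQLite through Python's built-in sqlite3 module as a stand-in for MySQL, so it runs without a server. SQLite's NOCASE and BINARY collations play the role that a _ci collation and a binary collation (for example ascii_general_ci and ascii_bin) would play in MySQL. The table names page_ci and page_cs and the lookup helper are made up for illustration.

```python
import sqlite3

# In-memory database; SQLite is used here only as a runnable stand-in for MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical tables: one case-insensitive URL column, one case-sensitive.
conn.execute("CREATE TABLE page_ci (url TEXT COLLATE NOCASE)")
conn.execute("CREATE TABLE page_cs (url TEXT COLLATE BINARY)")
for table in ("page_ci", "page_cs"):
    conn.execute(f"INSERT INTO {table} (url) VALUES ('/home')")


def lookup(table: str, url: str) -> list[str]:
    """Return every stored URL that compares equal to `url` under the column's collation."""
    rows = conn.execute(f"SELECT url FROM {table} WHERE url = ?", (url,))
    return [row[0] for row in rows]


ci_hits = lookup("page_ci", "/Home")  # the case-insensitive column matches '/home'
cs_hits = lookup("page_cs", "/Home")  # the case-sensitive column does not
print(ci_hits, cs_hits)
```

With the case-insensitive column, a request for /Home finds the row stored as /home, which is the duplicate-content situation described above. With the binary column the two spellings stay distinct.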

Upvotes: 1

Pekka

Reputation: 449783

It doesn't really matter, as far as I can see. The characters valid in a URL are represented in every character set I know of. I wouldn't use different collations between tables and columns, though: you'll get "illegal mix of collations" errors on any attempt to join them or perform any other kind of cross-column or cross-table operation (see my recent problem here).
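As a rough illustration of why the character set is rarely the limiting factor: once user-submitted text is percent-encoded for use in a URL, only ASCII characters remain, and those are representable in essentially any character set. The sketch below assumes the segments are percent-encoded before being stored, which the question does not state; to_url_segment is a hypothetical helper name.

```python
from urllib.parse import quote


def to_url_segment(text: str) -> str:
    """Percent-encode user text so it is safe as a single URL path segment.

    Hypothetical helper: safe="" also encodes "/", so the result
    cannot introduce an extra path separator.
    """
    return quote(text, safe="")


segment = to_url_segment("Café menu")
print(segment)            # Caf%C3%A9%20menu
print(segment.isascii())  # True
```

If the application stores the raw, unencoded text instead, the column needs a character set that can hold whatever users submit, and the ASCII assumption no longer applies.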

Correct me if I'm wrong of course.

Upvotes: 3
