Vadim Samokhin

Reputation: 3446

URL-encoding in the browser address bar

When I type non-alphanumeric characters into the browser address bar, they get URL-encoded. For example, https://www.php.net/manual-lookup.php?pattern=привет turns into https://www.php.net/manual-lookup.php?pattern=%EF%F0%E8%E2%E5%F2.

The question is: what do the two hex digits after each percent sign mean?

Upvotes: 1

Views: 2403

Answers (1)

bmargulies

Reputation: 99993

They are the bytes of the Windows-1251 (CP1251) encoding of the Cyrillic text, one percent-escape per byte. Since there are only six of them, they can't be UTF-8: each of these Cyrillic letters takes two bytes in UTF-8, so the six characters would need 12 bytes.

The code chart for CP1251 can be found here: http://en.wikipedia.org/wiki/Windows-1251.

Just as %20 stands for a space (its byte value is 0x20), each of the Cyrillic characters has a numeric byte value in CP1251, expressible as two hex digits after the percent sign.
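To check this yourself, here is a small sketch in Python (my choice of language, not something from the question) that percent-encodes the same word with both encodings using the standard library's urllib.parse. The variable names are just for illustration.

```python
from urllib.parse import quote, unquote

word = "привет"

# Percent-encode the word using two different byte encodings.
cp1251_form = quote(word, encoding="cp1251")
utf8_form = quote(word, encoding="utf-8")

print(cp1251_form)  # %EF%F0%E8%E2%E5%F2
print(utf8_form)    # %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82

# Byte counts: one byte per letter in CP1251, two per letter in UTF-8.
cp1251_len = len(word.encode("cp1251"))
utf8_len = len(word.encode("utf-8"))
print(cp1251_len, utf8_len)  # 6 12

# Decoding the string from the address bar as CP1251 recovers the word.
decoded = unquote("%EF%F0%E8%E2%E5%F2", encoding="cp1251")
print(decoded)  # привет
```

The CP1251 result matches the string from the question, while the UTF-8 form has twice as many escapes.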

Upvotes: 2
