Reputation: 429
I want to know how web servers convert percent-encoded UTF-8 characters in a URL to Unicode. How do they handle problems such as double URL encoding and non-shortest-form UTF-8 sequences, such as those explained here?
For example, from: http://www.example.com/dir1/index.html?name=%D8%A7%D9%84%D8%A7%D8%B3%D9%85%D8%A7
to: http://www.example.com/dir1/index.html?name=الاسما
I wrote a C++ program that does this conversion, but in general I want to know how web servers like Apache or nginx do it.
Upvotes: 1
Views: 200
Reputation: 533
Do you mean something like this?
Source: Encode/Decode URLs in C++
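#include <cctype>
#include <iostream>
#include <string>
using std::string;
using std::cout;

// Decode %XX escapes into raw bytes. The result is still a UTF-8 byte string;
// no conversion to Unicode code points happens here.
string urlDecode(const string &src) {
    string ret;
    ret.reserve(src.length());
    for (string::size_type i = 0; i < src.length(); i++) {
        // Decode only when '%' is followed by two hex digits.
        if (src[i] == '%' && i + 2 < src.length()
            && std::isxdigit(static_cast<unsigned char>(src[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(src[i + 2]))) {
            ret += static_cast<char>(std::stoi(src.substr(i + 1, 2), nullptr, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            ret += src[i];  // copy ordinary characters and malformed escapes unchanged
        }
    }
    return ret;
}

int main()
{
    string s = "http://www.example.com/dir1/index.html?name=%D8%A7%D9%84%D8%A7%D8%B3%D9%85%D8%A7";
    cout << urlDecode(s) << '\n';
}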
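The decoder above only turns %XX escapes into bytes; it does not deal with the non-shortest-form sequences mentioned in the question. One way to handle them is to validate the decoded bytes and reject anything that is not well-formed UTF-8. The following is a minimal sketch of that idea, not a description of what Apache or nginx actually do; the function name isValidUtf8 is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from any server's source): returns true only if
// `bytes` is well-formed UTF-8. It rejects non-shortest (overlong) forms,
// UTF-16 surrogates, and values above U+10FFFF.
bool isValidUtf8(const std::string &bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len;        // total bytes in this sequence
        std::uint32_t cp;       // code point being assembled
        std::uint32_t minimum;  // smallest code point allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        minimum = 0x0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; minimum = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; minimum = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; minimum = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (len > n - i) return false;  // sequence is truncated

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < minimum) return false;                  // non-shortest form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
        if (cp > 0x10FFFF) return false;                 // beyond Unicode
        i += len;
    }
    return true;
}
```

With this check, the bytes D8 A7 (the letter ا, U+0627) are accepted, while C0 AF, an overlong encoding of '/', is rejected. For double encoding, a reasonable assumption is to decode exactly once and treat any remaining %XX in the result as literal text rather than decoding again.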
Upvotes: 1