Petr Shypila

Reputation: 1509

Escape url parameters for cURL

I have a URL like this:

http://localhost:3000/get_agencies?zipcodecity=&zipcode=30048&city=kraków&

As you can see, the city parameter is kraków. When I pass this URL to curl, it reaches the server encoded incorrectly:

curl = curl_easy_init();
// Some code here
curl_easy_setopt(curl, CURLOPT_URL, url);

On the server side I get city=kraków. I tried curl_easy_escape(curl, url, strlen(url)); but it encodes the entire URL, including the scheme and the delimiters. How can I escape only the parameter values of the query string?

Upvotes: 1

Views: 3162

Answers (2)

hanshenrik

Reputation: 21513

(sorry, either you significantly edited your original question, or i read it wrong the first time, let me try again)

well, i guess you can kind of repair it by guessing where each parameter name and value starts and ends, based on the = and & characters. it's NOT foolproof: it won't be enough if & or ? is wrongly encoded, or if you encounter a unicode character whose encoding contains the same bytes as one of those characters (edit: this last part is fixable by switching to a unicode-aware string search function). apart from those 2 scenarios, something like this should work:

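// re-escapes the query string of a url, guessing field boundaries from '=' and '&'.
// error checking for curl_easy_escape is deliberately omitted (see the notes below).
std::string patchInappropriatelyEncodedURL(CURL *curl, std::string url){
    size_t pos=url.find("?");
    size_t pos2;
    if(pos==url.npos){
        // no query string, nothing to patch
        return url;
    }
    // keep everything up to and including '?' untouched
    std::string ret=url.substr(0,pos+1);
    std::string tmpstr;
    char *escapedstr;
    url=url.substr(pos+1,url.npos);
    std::string type="=";
    do{
        pos=url.find("=");
        pos2=url.find("&");
        if(pos == url.npos && pos2 == url.npos){
            // no delimiters left; the remainder is escaped after the loop
            break;
        }
        // pick whichever delimiter comes first (npos compares greater than any index)
        if(pos<pos2){
            type="=";
        }else{
            type="&";
            pos=pos2;
        }
        // escape the token before the delimiter, then re-append the delimiter unescaped
        tmpstr=url.substr(0,pos);
        url=url.substr(pos+1,url.npos);
        escapedstr=curl_easy_escape(curl,tmpstr.c_str(),tmpstr.length());
        ret.append(escapedstr);
        ret.append(type);
        curl_free(escapedstr);
    }while(true);
    // escape whatever follows the last delimiter (possibly an empty string)
    escapedstr=curl_easy_escape(curl,url.c_str(),url.length());
    ret.append(escapedstr);
    curl_free(escapedstr);
    return ret;
}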
  • note that this function is based on guessing, and is not by any means foolproof. i suppose the guessing could be improved with a dictionary for your target language or something, but your time would probably be better spent on fixing the bug that causes your program to receive malformed urls in the first place.

  • i deliberately omitted error checking because i'm lazy. curl_easy_escape can fail (out of memory), and when it does, it returns NULL. you should check for that before the code enters production.

  • c++ has no finally{} block, so those curl_free calls should really live in an RAII owner (for example a std::unique_ptr with curl_free as its deleter), else you may encounter memory leaks if the string functions throw exceptions (substr and append may throw std::bad_alloc), but again, i'm too lazy to fix it.
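a minimal sketch of the cleanup point from the notes above, so the escaped buffer is freed even when a string operation throws. to keep it buildable without libcurl, demo_alloc and demo_free are hypothetical stand-ins (not libcurl functions) for curl_easy_escape and curl_free; OwnedCString and appendEscaped are also names made up for this example. with libcurl you would presumably pass curl_free as the deleter instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-ins (assumptions, not libcurl API): demo_alloc plays the role of
// curl_easy_escape, demo_free plays the role of curl_free.
static int g_live_buffers = 0;  // number of buffers allocated but not yet freed

char *demo_alloc(const std::string &s) {
    char *p = static_cast<char *>(std::malloc(s.size() + 1));
    if (p == nullptr) return nullptr;        // mirrors the NULL-on-failure contract
    std::memcpy(p, s.c_str(), s.size() + 1); // copy including the terminator
    ++g_live_buffers;
    return p;
}

void demo_free(void *p) {
    if (p != nullptr) {
        --g_live_buffers;
        std::free(p);
    }
}

// Owning pointer whose destructor calls the free function on every exit path,
// including stack unwinding after an exception.
using OwnedCString = std::unique_ptr<char, void (*)(void *)>;

// Appends the "escaped" form of part to ret, with the NULL check the
// original function skipped.
std::string appendEscaped(std::string ret, const std::string &part) {
    OwnedCString escaped(demo_alloc(part), demo_free);
    if (!escaped) {
        throw std::runtime_error("escape failed (out of memory?)");
    }
    ret.append(escaped.get());  // may throw std::bad_alloc; buffer is still freed
    return ret;
}
```

the design choice here is just to let the destructor do the work: there is no explicit free call left to forget, and the null check turns a failed allocation into an exception instead of a crash inside append.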

Upvotes: 1

hanshenrik

Reputation: 21513

this is why we have curl_easy_escape.

char *escaped_string=curl_easy_escape(ch,"kraków",0);

(however, when the string is known at compile time, you could hardcode the encoded version instead of encoding it at runtime, in this case, the hardcoded version is krak%C3%B3w - your browser's javascript console can be used to figure that out, just write encodeURIComponent("kraków"); to see what the urlencoded version looks like)

gotchas:

  • when the 3rd parameter is 0, curl uses strlen() to determine the length. this is safe with utf8 text, but not with binary data: strlen() stops at the first null byte, so if you're encoding binary data, make sure to specify the length manually. (other than that, curl_easy_escape and urlencoded data are binary safe)

  • don't forget to curl_free(escaped_string); when you're done with it, else you'll end up with memory leaks.
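to make the "escape only the value" idea concrete, here is a rough sketch that builds one query parameter with the name left alone and the value percent-encoded. percentEncode is a hypothetical stand-in written for this example so it builds without libcurl; it encodes every byte outside the RFC 3986 unreserved set (letters, digits, - . _ ~), which is an assumption about how closely it mirrors curl_easy_escape. queryParam is likewise an illustrative name. taking a std::string also sidesteps the strlen() gotcha, since the length is carried explicitly.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for curl_easy_escape: percent-encodes every byte
// except the RFC 3986 unreserved characters (A-Z a-z 0-9 - . _ ~).
std::string percentEncode(const std::string &in) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                                (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                                c == '_' || c == '~';
        if (unreserved) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);    // high nibble
            out.push_back(hex[c & 0x0F]);  // low nibble
        }
    }
    return out;
}

// Builds "name=value" with only the value encoded, so the '=' delimiter
// (and any '&' added by the caller between parameters) stays literal.
std::string queryParam(const std::string &name, const std::string &value) {
    return name + "=" + percentEncode(value);
}
```

with libcurl itself the equivalent would be to call curl_easy_escape on the value alone, append the result after "city=", and curl_free the returned pointer.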

Upvotes: 1
