kasandell

Reputation: 87

C++ CURL not retrieving webpage properly

I have the following three methods in my class -

void WebCrawler::crawl()
{
    urlQueue.push("http://www.google.com/");
    if(!urlQueue.empty())
    {
        std::string url = urlQueue.front();
        urlQueue.pop();
        pastURLs.push_back(url);
        if(pastURLs.size()>4000000)
        {
            pastURLs.erase(pastURLs.begin());
        }
        std::string data=getData(url);
        auto newPair= std::pair<std::string, std::string>(url, data);
        dataQueue.push(newPair);
    }

}

std::string WebCrawler::getData(std::string URL)
{
    std::string readBuffer = "";
    CURL *curl = curl_easy_init();

    if(curl)
    {
        CURLcode res;
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &WebCrawler::WiteCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
        curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());
        res = curl_easy_perform(curl);
        curl_easy_cleanup(curl);
    }
    return readBuffer;
}

and

size_t WebCrawler::WiteCallback(char* buf, size_t size, size_t nmemb, void* up)
{

    ((std::string*)up)->append((char*)buf, size * nmemb);
    return size * nmemb;
}

When I take these methods out of my class and run them as functions, my code executes properly and returns the webpage contents. However, as soon as I put these methods into my class they begin to behave differently. When my WriteCallback is called, the program fails and says it could not allocate 45457340335435776 bytes of data. I'm a little baffled as to what is causing this change and any help would be much appreciated.

Upvotes: 0

Views: 57

Answers (1)

Kijewski

Reputation: 26043

WebCrawler::WiteCallback is a non-static method, which means that the pointer to the object (this) needs to be passed. Depending on the ABI this can be an implicit parameter, a register that is not used for normal argument passing, or anything else. For your ABI it looks like the object is passed as the leftmost parameter ("(WebCrawler *this, char* buf, size_t size, size_t nmemb, void* up)").
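To make the type difference concrete, here is a minimal compile-time sketch. Probe and PlainFn are hypothetical names used only for illustration; PlainFn has the same shape as the write callback in the question. The mismatch in the question most likely goes unnoticed at compile time because curl_easy_setopt takes its value through a variadic parameter, which is not type-checked.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical class with one non-static and one static function that share
// the same visible parameter list.
struct Probe
{
    std::size_t member(char*, std::size_t, std::size_t, void*) { return 0; }
    static std::size_t stat(char*, std::size_t, std::size_t, void*) { return 0; }
};

// The plain function pointer shape a C library expects for the callback.
using PlainFn = std::size_t (*)(char*, std::size_t, std::size_t, void*);

// A static member function is an ordinary function: its address has exactly
// the plain function pointer type.
static_assert(std::is_same<decltype(&Probe::stat), PlainFn>::value,
              "static member function matches the C callback type");

// A non-static member function yields a pointer-to-member-function, which is
// a different kind of type and does not convert to a plain function pointer.
static_assert(std::is_member_function_pointer<decltype(&Probe::member)>::value,
              "non-static member function is a pointer-to-member");
static_assert(!std::is_convertible<decltype(&Probe::member), PlainFn>::value,
              "pointer-to-member does not convert to the C callback type");
```

If the callback had been passed through a typed parameter instead of a variadic one, the third assertion shows that the compiler would have rejected it.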

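You must not pass a pointer to a non-static member function where libcurl expects a plain function pointer: libcurl calls it without the object, so every argument ends up shifted by one position. Either make WebCrawler::WiteCallback static or use a trampoline: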

size_t WebCrawler::WriteCallbackTramp(char* buf, size_t size,
                                      size_t nmemb, void* up)
{
    return ((WebCrawler*) up)->WriteCallback(buf, size, nmemb);
}

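where WebCrawler contains a member for the buffer, WriteCallbackTramp is itself declared static, and CURLOPT_WRITEDATA is set to this instead of &readBuffer.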

Making the method static is the better solution.
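Here is a minimal sketch of both options side by side. It does not link against libcurl: fakePerform is a hypothetical stand-in that, like curl_easy_perform, invokes a plain function pointer with an opaque user pointer. Fetcher, writeStatic, writeTramp and buffer_ are likewise illustrative names, not part of the original code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the C library: it accepts a plain function
// pointer plus an opaque user pointer (the roles played by
// CURLOPT_WRITEFUNCTION and CURLOPT_WRITEDATA) and delivers two chunks.
using WriteFn = std::size_t (*)(char*, std::size_t, std::size_t, void*);

inline void fakePerform(WriteFn cb, void* userData)
{
    char first[] = "hello, ";
    char second[] = "world";
    cb(first, 1, std::strlen(first), userData);
    cb(second, 1, std::strlen(second), userData);
}

class Fetcher
{
public:
    // Option 1: static callback; the user pointer is the destination string.
    static std::size_t writeStatic(char* buf, std::size_t size,
                                   std::size_t nmemb, void* up)
    {
        static_cast<std::string*>(up)->append(buf, size * nmemb);
        return size * nmemb;
    }

    // Option 2: static trampoline; the user pointer is the object, and the
    // call is forwarded to a regular non-static member function.
    static std::size_t writeTramp(char* buf, std::size_t size,
                                  std::size_t nmemb, void* up)
    {
        return static_cast<Fetcher*>(up)->write(buf, size, nmemb);
    }

    std::string fetchStatic()
    {
        std::string out;
        fakePerform(&Fetcher::writeStatic, &out);
        return out;
    }

    std::string fetchTramp()
    {
        buffer_.clear();
        fakePerform(&Fetcher::writeTramp, this);
        return buffer_;
    }

private:
    // Non-static worker that appends to the member buffer.
    std::size_t write(char* buf, std::size_t size, std::size_t nmemb)
    {
        buffer_.append(buf, size * nmemb);
        return size * nmemb;
    }

    std::string buffer_;
};
```

With real libcurl the wiring would be the same idea: pass &Fetcher::writeStatic with a pointer to the string as CURLOPT_WRITEDATA, or pass &Fetcher::writeTramp with this as CURLOPT_WRITEDATA.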

Cf. Wikipedia: Calling convention

Upvotes: 2
