Sean Clark

Reputation: 1456

PHP cURL is holding onto session cookies

So this is pretty strange. I'm making a cURL login script and I need my cookies to be exactly what I send to cURL. But it seems cURL is holding on in memory to old cookies, even AFTER the cookie file is completely truncated.

In the output below, notice that I'm trying to hit amazon.co.uk but my previous cookies from amazon.com are still at the top of the cookie file. That is enough to make Amazon not keep me logged in, so I need fresh cookies.

The first cookie output you see is the starting point, and it is what I get every time I refresh the page, even though the final cookie output is different: it contains only the amazon.co.uk cookies.

The reason the other cookies are "working" is that on the login page (where you see "need clean session") I'm calling curl_setopt($ch, CURLOPT_COOKIESESSION, true); which properly sends ONLY the cookies from the file.

But AFTER that point, when I'm not calling that anymore, it goes back to these stored cookies. I can't call COOKIESESSION every time, because I need it to remember the cookies from request to request.

So basically, I just need a way to clear out cURL's memory of the cookies, or something along those lines. And before you say "delete the file", you can see from the output that I am clearing out the cookie file.

string(125706) "
string(78) "https://sellercentral.amazon.co.uk/gp/fba/core/data/collections/shipments.html"
string(12) "need to init"
string(1704) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.com TRUE    /   FALSE   1429875358  session-id-time 1429858800l
.amazon.com TRUE    /   FALSE   1429875358  session-id  181-0028548-4275101
.amazon.com TRUE    /   FALSE   1429875532  ubid-main   191-9297218-7050950
.amazon.com TRUE    /   FALSE   1429875358  session-token   cL5vcznqgzk2RwhZIFZjSepKiznVnNcdv1Uh/FiLV8i0QuxpPEEx5D94imjktXu69QOdfQuQX8chNhvB8sR9KI4ZgJBWWlMnFOepyO6/+wtH9GOtH+1WMZQKHp8fqGJlpMtT8XMwKUx+hnuYRPnheq54s5Q1fQX5HJ4wS3KE4UVHAady2H4ugSsIi+O33zL1d3eWN4TnbX4nxiHqIqFs4Q8GGCYVEwOrbcB1KH3FCohbrwQPXNN7igf6jQXI++h0N0dJTv781sU=
.amazon.com TRUE    /   FALSE   2059990558  x-main  "i6iLU3A?45qEpvgw@NNzGTsxqqOvwryX"
.amazon.com TRUE    /   TRUE    2059990557  at-main 5|7HZSLL/JbN/aGiGYXo/uxjxNFyLucyEmxBCKkR4QoU06R5NF4I1eNekoJpsyE2hkx5FrSI3dP5DuaolT5D48jdz6NLwDmYdKzovka+5DJTHuRuVmzBVVkW2g40uhZlRlaHJmewKWCjmoyi+azkQswRDRmfyAICX+hBrRfUwJRwQqeOhQGc6dujYHDBiv8nxcQFciY9G+7au3zYAGof+CepYeiWk4xuQmBLobVAci10frgDxdgV7OdJOSVaHz2UtykTQ+F4V4hNzFwclsv9ranLMSM5KH9tys
.amazon.com TRUE    /   TRUE    0   sess-at-main    "GR5GAfuX5U+vC3ayUz3LIUs7+o414SBlsEA1rVMyvvA="
.amazon.com TRUE    /   FALSE   2059990557  lc-main en_US
.amazon.co.uk   TRUE    /   FALSE   1429875575  session-id-time 1429858800l
.amazon.co.uk   TRUE    /   FALSE   1429875575  session-id  276-1602919-0207204
.amazon.co.uk   TRUE    /   FALSE   1429876234  ubid-acbuk  277-6716334-7531852
.amazon.co.uk   TRUE    /   FALSE   1429875881  session-token   gVQymTdZsxCD0I/aObEZCLmujDKZGjQ48lGc34xaW6i45XVIonC1YK014YrFqVvNG2qurp1xmGrtCHcuVQx2tSQ7LlYpr+srdgyKvj/pCcW6CxR0azqQsU9wYW3BxXqZnQDQnqVmYaGpY0eB19BOTShppMKGnPhzMkgy/UFVuoeGsngx0tz8iWFMy6qTZFqibPoMvFmpsdsL8GhbVn6sy++vUUBeQhVgyzktWEfjRXdzZw32t/SOCA==
"
string(21) "unexpected login page"
string(78) "https://sellercentral.amazon.co.uk/gp/fba/core/data/collections/shipments.html"
string(12) "need to init"
string(0) ""
string(25) "no cookies, need to login"
string(51) "https://sellercentral.amazon.co.uk/gp/homepage.html"
string(0) ""
string(28) "on login, need clean session"
string(44) "https://sellercentral.amazon.co.uk/ap/widget"
string(270) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.co.uk   TRUE    /   FALSE   1429876317  session-id-time 1429858800l
.amazon.co.uk   TRUE    /   FALSE   1429876317  session-id  278-1385775-5645645
"
string(112) "https://sellercentral.amazon.co.uk/gp/utilities/set-rainier-prefs.html?ie=UTF8&url=&marketplaceID=A1F83G8C2ARO7P"
string(1212) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

.amazon.co.uk   TRUE    /   FALSE   1429876318  session-id-time 1429858800l
.amazon.co.uk   TRUE    /   FALSE   1429876318  session-id  278-1385775-5645645
.amazon.co.uk   TRUE    /   FALSE   1429876318  ubid-acbuk  279-4986453-7111520
.amazon.co.uk   TRUE    /   FALSE   1429876318  session-token   YpneIOOGKiqQ8x/E/soTTmUAym3tXUWGtjXKYWnAONOkcHENmQxMDD3zTWjgtLN9b/em0xBTPoYMpECUcR38rZlf2Vu1a2TOBNsi2hpTjageCvIM9noPlEq0TBrgdOEfGl354j0+dIfTHM4ObUF2nzY2UBubZoi3X77MBcpLel+rjjCFeTCwhmNFbru5dyalIRn1UyVAdsB3PIEk+saDDbf2HRMUFP7hdaCaBhKwb5tpyvpA1xrk2XJXm2dre2FE1MKsgWFwt1c=
.amazon.co.uk   TRUE    /   FALSE   2059991518  x-acbuk 3IkDIKmc71d9lKFefDy7ATw1QKYl8545
.amazon.co.uk   TRUE    /   TRUE    2059991518  at-acbuk    "5|/QlP2Fp+YlPLm1O0znctkujc6sMDGnEGxbqVjtrNehg2P98QG1vCFOkKxChCaUJzPmQSS4C/87WM0XC30721BVwFLpKRa9FIS9sUtlZJh8m07RHhC2vBspsYjZ710LfM/cHCHKXdBmXlHZ8CLNO55ff4oYRI5NnaFKu8dx2xSBdwAzYydTqlQhrOKE0RAolHBJgIVngWDlw42kDY79FOciZP7ray/qSR/eceAPfJfzIV0t/vKC/vWpNlOQBs/FTmvWmEMZtSoAUWlgPeIiUw+g=="
.amazon.co.uk   TRUE    /   TRUE    0   sess-at-acbuk   "9EziH1irfB0flBfODA2zw+lVgvo4OmENH4XM3kxEnpg="
.amazon.co.uk   TRUE    /   FALSE   2059991518  lc-acbuk    en_US

Upvotes: 0

Views: 1770

Answers (1)

Misunderstood

Reputation: 5665

Update

When there is trouble, I often set FOLLOWLOCATION to false:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);

With FOLLOWLOCATION off you can see what each redirect is doing. If you need the cookies that are set in the response header of a redirected URL, FOLLOWLOCATION must be set to false.

When the curl URL takes you to a redirect, curl_getinfo will return the redirect location URL.

$status = intval(curl_getinfo($ch,CURLINFO_HTTP_CODE));
if ($status > 299 && $status < 400){
  $url= curl_getinfo($ch,CURLINFO_REDIRECT_URL );
}
// update cookies, do not clear `cookies()`;

When it gets tough I use these options to get both the request and the response headers. The response header will be returned in the curl_exec() data. The request header will be returned by curl_getinfo().

curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_HEADER, true);


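$data = curl_exec($ch);
if (curl_errno($ch)){
    $data .= 'Retrieve Base Page Error: ' . curl_error($ch);
}
else {
  // curl_getinfo() includes the sent request header ('request_header')
  // because CURLINFO_HEADER_OUT is set
  $info = rawurldecode(var_export(curl_getinfo($ch),true));
  // CURLOPT_HEADER puts the response header in front of the body
  $skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
  $responseHeader = substr($data,0,$skip);
  $body = substr($data,$skip);
  // build a flat file name from host and path ('/' would need existing directories)
  $filename = parse_url($url, PHP_URL_HOST);
  $filename .= str_replace('/', '_', (string) parse_url($url, PHP_URL_PATH)) . '.txt';
  $fp = fopen($filename,'w');
  // save request info, response header and HTML together
  fwrite($fp,"$info\n$data");
  fclose($fp);

  $data = $body;
}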

Both the headers and the HTML are stored in the file, so you can view the HTTP headers, the HTML, and the JavaScript. Sometimes cookies are set by JavaScript via document.cookie, the page is redirected with window.location, or an HTML form's submit button is clicked by JS. In these cases it may be necessary to scrape the cookies and/or the redirect location from the curl data.
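
As a rough illustration of that scraping step, here is a minimal sketch. The helper names scrapeJsCookies() and scrapeJsRedirect() and the sample markup are made up for this example. The regular expressions only catch the simplest case, where a quoted string literal is assigned directly to document.cookie or window.location, so treat this as a starting point rather than a general JavaScript parser.

```php
<?php
// Hypothetical helpers for illustration; not part of any library.

// Collect name => value pairs from simple `document.cookie = "name=value; ..."` assignments.
function scrapeJsCookies(string $html): array
{
    $cookies = [];
    // Only matches a quoted string literal assigned directly to document.cookie.
    if (preg_match_all('/document\.cookie\s*=\s*(["\'])(.*?)\1/', $html, $matches)) {
        foreach ($matches[2] as $literal) {
            // Keep only the leading name=value pair; drop path, expires, etc.
            $pair = trim(explode(';', $literal, 2)[0]);
            $eq = strpos($pair, '=');
            if ($eq !== false && $eq > 0) {
                $cookies[substr($pair, 0, $eq)] = substr($pair, $eq + 1);
            }
        }
    }
    return $cookies;
}

// Return the target of the first `window.location = "..."` (or .href) assignment, or null.
function scrapeJsRedirect(string $html): ?string
{
    if (preg_match('/window\.location(?:\.href)?\s*=\s*(["\'])(.*?)\1/', $html, $m)) {
        return $m[2];
    }
    return null;
}

// Made-up sample standing in for the body returned by curl_exec().
$sample = '<script>document.cookie = "csm-hit=abc123; path=/";'
        . ' window.location = "https://www.example.com/next";</script>';
$jsCookies = scrapeJsCookies($sample);   // ['csm-hit' => 'abc123']
$jsRedirect = scrapeJsRedirect($sample); // 'https://www.example.com/next'
```

Note that these helpers return the value without the leading '=', so prepend it if you merge the result into a $cookies array that stores values with the '=' attached.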


Then I use the Firefox Inspector or Chrome Developer Tools.

I go to the Network tab.

In Firefox I go to Settings and turn on "Enable Persistent logs".
In Chrome I click "Preserve log" on the Network tab.

Then I use the browser to go wherever I want curl to go.

Now I can see every request and response, including redirects, and compare them with the saved headers.


When you need the header to look exactly like the saved Browser headers:

Create an array to hold the request header lines.
Fill it with exactly what is in the request header captured from the browser.
Example:

$request = array();
$request[] = "Host: www.example.com";
$request[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$request[] = "User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0";
$request[] = "Accept-Language: en-US,en;q=0.5";
$request[] = "Connection: keep-alive";
$request[] = "Cache-Control: no-cache";
$request[] = "Pragma: no-cache";

Add to curl:

curl_setopt($ch, CURLOPT_HTTPHEADER, $request);

Many times it is much easier to use a mobile version. Often the desktop version of a page requires JavaScript and the mobile version does not. I use Firefox with a user agent switcher set to an old Motorola user agent to retrieve the headers and HTML. Then I use the same user agent in curl's HTTPHEADER:

$request[] = 'User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0';

end of update


I find curl's cookie jar problematic so I wrote my own routine.
For this CURLOPT_HEADER must be true.

 curl_setopt($ch, CURLOPT_HEADER, true);

  $data = curl_exec($ch);
  $skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
  $responseHeader = substr($data,0,$skip);  // response headers
  $data =  substr($data,$skip);             // body only
  $e = 0;
  while(true){
    // header names are case-insensitive, so use stripos
    $s = stripos($responseHeader,'Set-Cookie: ',$e);
    if ($s === false){break;}
    $s += 12;
    // name=value ends at the first ';' or at the end of the line
    $e = $s + strcspn($responseHeader,";\r\n",$s);
    $cookie = substr($responseHeader,$s,$e-$s);
    $s = strpos($cookie,'=');
    if ($s === false){continue;}
    $key = substr($cookie,0,$s);
    $value = substr($cookie,$s);  // keeps the leading '='
    $cookies[$key] = $value;
  }

Then to use the $cookies[]:

 $cookie = '';
 $show = '';
 $delim = '';
 foreach ($cookies as $k => $v){
   $cookie .= "$delim$k$v";
   $delim = '; ';
 }

Then use $cookie:

curl_setopt($ch, CURLOPT_COOKIE, $cookie );
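
To show how this carries cookies from request to request, here is a small self-contained sketch that runs without a network connection. The function names collectCookies() and buildCookieHeader() and the two sample header blocks are invented for the example. It also makes a simplification you should be aware of: Domain, Path and expiry attributes are ignored, so it is only reasonable when one array is used per site.

```php
<?php
// Hypothetical wrappers for illustration; the names are not from any library.

// Merge Set-Cookie name/value pairs from a raw response header block into $cookies.
// Later values overwrite earlier ones with the same name.
function collectCookies(string $responseHeader, array $cookies = []): array
{
    foreach (preg_split('/\r\n|\n/', $responseHeader) as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only name=value; Domain, Path, Expires etc. are dropped.
        $pair = trim(explode(';', substr($line, 11), 2)[0]);
        $eq = strpos($pair, '=');
        if ($eq !== false && $eq > 0) {
            $cookies[substr($pair, 0, $eq)] = substr($pair, $eq + 1);
        }
    }
    return $cookies;
}

// Build the string to pass to CURLOPT_COOKIE.
function buildCookieHeader(array $cookies): string
{
    $parts = [];
    foreach ($cookies as $name => $value) {
        $parts[] = "$name=$value";
    }
    return implode('; ', $parts);
}

// Made-up header blocks standing in for two consecutive responses.
$first  = "HTTP/1.1 302 Found\r\n"
        . "Set-Cookie: session-id=111; Domain=.example.com; Path=/\r\n"
        . "Set-Cookie: ubid=abc; Path=/\r\n\r\n";
$second = "HTTP/1.1 200 OK\r\n"
        . "set-cookie: session-id=222; Path=/\r\n\r\n";

$cookies = collectCookies($first);
$cookies = collectCookies($second, $cookies);
$cookieHeader = buildCookieHeader($cookies); // "session-id=222; ubid=abc"
```

Starting a clean session is then just a matter of passing an empty array, which avoids the stale amazon.com entries described in the question.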

Upvotes: 1
