Reputation: 1456
So this is pretty strange. I'm making a cURL login script and I need the cookies cURL sends to be exactly the ones I give it. But it seems cURL is holding on to old cookies in memory, even AFTER the cookie file is completely truncated.
In the output below, notice that I'm trying to hit amazon.co.uk but my previous cookies from amazon.com are still at the top of the cookie file. That is enough to make Amazon not keep me logged in. So I need fresh cookies.
Now, the first cookie output you see is the starting point, which is what I get every time I refresh the page, even though the final cookie output is different: it contains just the amazon.co.uk cookies.
The reason the other cookies are "working" is that on the login page (where you see "need clean session") I'm calling curl_setopt($ch, CURLOPT_COOKIESESSION, true); which properly sends ONLY the cookies from the file.
But AFTER that point, when I'm not calling that anymore, it goes back to these stored cookies. I can't call COOKIESESSION every time, because I need it to remember the cookies from request to request.
So basically, I just need a way to clear out cURL's memory of the cookies, or something along those lines. And before you say "delete the file", you can see from the output that I am clearing out the cookie file.
string(125706) "
string(78) "https://sellercentral.amazon.co.uk/gp/fba/core/data/collections/shipments.html"
string(12) "need to init"
string(1704) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
.amazon.com TRUE / FALSE 1429875358 session-id-time 1429858800l
.amazon.com TRUE / FALSE 1429875358 session-id 181-0028548-4275101
.amazon.com TRUE / FALSE 1429875532 ubid-main 191-9297218-7050950
.amazon.com TRUE / FALSE 1429875358 session-token cL5vcznqgzk2RwhZIFZjSepKiznVnNcdv1Uh/FiLV8i0QuxpPEEx5D94imjktXu69QOdfQuQX8chNhvB8sR9KI4ZgJBWWlMnFOepyO6/+wtH9GOtH+1WMZQKHp8fqGJlpMtT8XMwKUx+hnuYRPnheq54s5Q1fQX5HJ4wS3KE4UVHAady2H4ugSsIi+O33zL1d3eWN4TnbX4nxiHqIqFs4Q8GGCYVEwOrbcB1KH3FCohbrwQPXNN7igf6jQXI++h0N0dJTv781sU=
.amazon.com TRUE / FALSE 2059990558 x-main "i6iLU3A?45qEpvgw@NNzGTsxqqOvwryX"
.amazon.com TRUE / TRUE 2059990557 at-main 5|7HZSLL/JbN/aGiGYXo/uxjxNFyLucyEmxBCKkR4QoU06R5NF4I1eNekoJpsyE2hkx5FrSI3dP5DuaolT5D48jdz6NLwDmYdKzovka+5DJTHuRuVmzBVVkW2g40uhZlRlaHJmewKWCjmoyi+azkQswRDRmfyAICX+hBrRfUwJRwQqeOhQGc6dujYHDBiv8nxcQFciY9G+7au3zYAGof+CepYeiWk4xuQmBLobVAci10frgDxdgV7OdJOSVaHz2UtykTQ+F4V4hNzFwclsv9ranLMSM5KH9tys
.amazon.com TRUE / TRUE 0 sess-at-main "GR5GAfuX5U+vC3ayUz3LIUs7+o414SBlsEA1rVMyvvA="
.amazon.com TRUE / FALSE 2059990557 lc-main en_US
.amazon.co.uk TRUE / FALSE 1429875575 session-id-time 1429858800l
.amazon.co.uk TRUE / FALSE 1429875575 session-id 276-1602919-0207204
.amazon.co.uk TRUE / FALSE 1429876234 ubid-acbuk 277-6716334-7531852
.amazon.co.uk TRUE / FALSE 1429875881 session-token gVQymTdZsxCD0I/aObEZCLmujDKZGjQ48lGc34xaW6i45XVIonC1YK014YrFqVvNG2qurp1xmGrtCHcuVQx2tSQ7LlYpr+srdgyKvj/pCcW6CxR0azqQsU9wYW3BxXqZnQDQnqVmYaGpY0eB19BOTShppMKGnPhzMkgy/UFVuoeGsngx0tz8iWFMy6qTZFqibPoMvFmpsdsL8GhbVn6sy++vUUBeQhVgyzktWEfjRXdzZw32t/SOCA==
"
string(21) "unexpected login page"
string(78) "https://sellercentral.amazon.co.uk/gp/fba/core/data/collections/shipments.html"
string(12) "need to init"
string(0) ""
string(25) "no cookies, need to login"
string(51) "https://sellercentral.amazon.co.uk/gp/homepage.html"
string(0) ""
string(28) "on login, need clean session"
string(44) "https://sellercentral.amazon.co.uk/ap/widget"
string(270) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
.amazon.co.uk TRUE / FALSE 1429876317 session-id-time 1429858800l
.amazon.co.uk TRUE / FALSE 1429876317 session-id 278-1385775-5645645
"
string(112) "https://sellercentral.amazon.co.uk/gp/utilities/set-rainier-prefs.html?ie=UTF8&url=&marketplaceID=A1F83G8C2ARO7P"
string(1212) "# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
.amazon.co.uk TRUE / FALSE 1429876318 session-id-time 1429858800l
.amazon.co.uk TRUE / FALSE 1429876318 session-id 278-1385775-5645645
.amazon.co.uk TRUE / FALSE 1429876318 ubid-acbuk 279-4986453-7111520
.amazon.co.uk TRUE / FALSE 1429876318 session-token YpneIOOGKiqQ8x/E/soTTmUAym3tXUWGtjXKYWnAONOkcHENmQxMDD3zTWjgtLN9b/em0xBTPoYMpECUcR38rZlf2Vu1a2TOBNsi2hpTjageCvIM9noPlEq0TBrgdOEfGl354j0+dIfTHM4ObUF2nzY2UBubZoi3X77MBcpLel+rjjCFeTCwhmNFbru5dyalIRn1UyVAdsB3PIEk+saDDbf2HRMUFP7hdaCaBhKwb5tpyvpA1xrk2XJXm2dre2FE1MKsgWFwt1c=
.amazon.co.uk TRUE / FALSE 2059991518 x-acbuk 3IkDIKmc71d9lKFefDy7ATw1QKYl8545
.amazon.co.uk TRUE / TRUE 2059991518 at-acbuk "5|/QlP2Fp+YlPLm1O0znctkujc6sMDGnEGxbqVjtrNehg2P98QG1vCFOkKxChCaUJzPmQSS4C/87WM0XC30721BVwFLpKRa9FIS9sUtlZJh8m07RHhC2vBspsYjZ710LfM/cHCHKXdBmXlHZ8CLNO55ff4oYRI5NnaFKu8dx2xSBdwAzYydTqlQhrOKE0RAolHBJgIVngWDlw42kDY79FOciZP7ray/qSR/eceAPfJfzIV0t/vKC/vWpNlOQBs/FTmvWmEMZtSoAUWlgPeIiUw+g=="
.amazon.co.uk TRUE / TRUE 0 sess-at-acbuk "9EziH1irfB0flBfODA2zw+lVgvo4OmENH4XM3kxEnpg="
.amazon.co.uk TRUE / FALSE 2059991518 lc-acbuk en_US
Upvotes: 0
Views: 1770
Reputation: 5665
When there is trouble, I often set FOLLOWLOCATION to false:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
If there is a redirect and you want to see what is happening, or you need the cookies that are set in the redirect's response header, then FOLLOWLOCATION must be set to false.
When the curl URL takes you to a redirect, curl_getinfo will return the redirect location URL.
$status = intval(curl_getinfo($ch,CURLINFO_HTTP_CODE));
if ($status > 299 && $status < 400){
$url= curl_getinfo($ch,CURLINFO_REDIRECT_URL );
}
// update cookies, do not clear `cookies()`;
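The redirect check above can be wrapped in a small helper so it is easy to call in a loop. This is only a sketch: the function name nextRedirectUrl is made up for illustration, and it assumes you pass in the values that curl_getinfo() returned for CURLINFO_HTTP_CODE and CURLINFO_REDIRECT_URL.

```php
<?php
// Hypothetical helper (not part of cURL): decide which URL to request next
// when CURLOPT_FOLLOWLOCATION is false.
// $status      - value of curl_getinfo($ch, CURLINFO_HTTP_CODE)
// $redirectUrl - value of curl_getinfo($ch, CURLINFO_REDIRECT_URL)
function nextRedirectUrl($status, $redirectUrl)
{
    $status = intval($status);
    // Only 3xx responses with a non-empty target count as a redirect
    if ($status > 299 && $status < 400 && is_string($redirectUrl) && $redirectUrl !== '') {
        return $redirectUrl;
    }
    return null; // not a redirect: stop following
}
```

After each curl_exec() you would update your cookies, call this helper, and request the returned URL until it gives back null. Capping the number of hops is probably wise to avoid redirect loops.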
When it gets tough I use these options to get both the response and the headers. The response header will be returned at the start of the curl_exec() data. The request header will be returned by curl_getinfo():
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
if (curl_errno($ch)){
  $data .= 'Retrieve Base Page Error: ' . curl_error($ch);
}
else {
  // curl_getinfo() includes the outgoing request header because CURLINFO_HEADER_OUT is set
  $info = rawurldecode(var_export(curl_getinfo($ch),true));
  // Split the response header from the body (curl_exec() is called only once)
  $skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
  $responseHeader = substr($data,0,$skip);
  $data = substr($data,$skip);
  // Build a file name from host and path; slashes would otherwise be read as directories
  $filename = parse_url($url, PHP_URL_HOST) . parse_url($url, PHP_URL_PATH);
  $filename = str_replace('/', '_', $filename) . '.txt';
  $fp = fopen($filename,'w');
  fwrite($fp,"$info\n$responseHeader\n$data");
  fclose($fp);
}
Both the headers and the HTML are stored in the file. You can then view the HTTP headers, the HTML and the JavaScript. Sometimes cookies are set by JavaScript through document.cookie, or the page is redirected with window.location, or an HTML form's submit button is clicked by JS. In these cases it may be necessary to scrape the cookies and/or the redirect location from the curl data.
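As a rough illustration of that kind of scraping, here is a minimal sketch. The function names scrapeJsCookies and scrapeJsRedirect are hypothetical, and the regular expressions assume the page uses plain quoted string literals; cookies or URLs that are built up dynamically in JavaScript will not be found this way.

```php
<?php
// Hypothetical helper: pull cookies assigned via document.cookie = "name=value; ..."
// out of a page body. Only simple quoted string literals are handled.
function scrapeJsCookies($html)
{
    $found = array();
    // Match: document.cookie = "..." or document.cookie = '...'
    if (preg_match_all('/document\.cookie\s*=\s*(["\'])(.*?)\1/s', $html, $m)) {
        foreach ($m[2] as $assignment) {
            // Keep only the name=value part before the first ';'
            $pair = trim(explode(';', $assignment)[0]);
            $pos = strpos($pair, '=');
            if ($pos === false || $pos === 0) {
                continue; // no usable name
            }
            $found[substr($pair, 0, $pos)] = substr($pair, $pos + 1);
        }
    }
    return $found;
}

// Hypothetical helper: find a redirect written as window.location = "url"
// or window.location.href = "url". Returns null when none is found.
function scrapeJsRedirect($html)
{
    if (preg_match('/window\.location(?:\.href)?\s*=\s*(["\'])(.*?)\1/s', $html, $m)) {
        return $m[2];
    }
    return null;
}
```

Note that scrapeJsCookies() returns values without a leading '=', so adjust accordingly if you merge them into an array that stores values differently.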
Then I use the Firefox Inspector or Chrome Developer Tools.
I go to the Network tab.
In Firefox I go to Settings and turn on "Enable Persistent logs".
In Chrome I click "Preserve log" on the Network tab.
Then I use the browser to go wherever I want curl to go.
Now I can see every request and response, including redirects, and compare them with the saved headers.
When you need the request header to look exactly like the one the browser sent:
Create an array to hold the request header key/value lines.
Fill in the array with exactly what is in the request header shown by the browser.
EXAMPLE:
$request = array();
$request[] = "Host: www.example.com";
$request[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$request[] = "User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0";
$request[] = "Accept-Language: en-US,en;q=0.5";
$request[] = "Connection: keep-alive";
$request[] = "Cache-Control: no-cache";
$request[] = "Pragma: no-cache";
Add to curl:
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
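If you copy the raw request headers out of the browser's Network tab, a small helper can turn them into that array instead of typing each line by hand. This is a sketch only; headerLinesToArray is a made-up name, and it assumes the copied text has one "Name: value" header per line.

```php
<?php
// Hypothetical helper: turn a block of "Name: value" lines, as copied from
// the browser's Network tab, into the array CURLOPT_HTTPHEADER expects.
function headerLinesToArray($raw)
{
    $request = array();
    foreach (preg_split('/\r\n|\r|\n/', $raw) as $line) {
        $line = trim($line);
        // Accept only lines that start with a plain header name and a colon.
        // This skips blank lines, the request line ("GET / HTTP/1.1") and
        // HTTP/2 pseudo-headers such as ":authority".
        if (preg_match('/^([A-Za-z0-9-]+):\s*(.*)$/', $line, $m)) {
            $request[] = $m[1] . ': ' . $m[2];
        }
    }
    return $request;
}
```

The result can be passed straight to curl_setopt($ch, CURLOPT_HTTPHEADER, ...). You may want to drop a copied Cookie line first if you manage cookies separately.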
Many times it is much easier to use a mobile version. Often the desktop version of a page requires JavaScript and the mobile version does not. I use Firefox with a user agent switcher, set to an old Motorola user agent, to retrieve the headers and HTML. Then I use the same user agent in curl's HTTPHEADER:
$request[] = 'User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0';
I find curl's cookie jar problematic so I wrote my own routine.
For this, CURLOPT_HEADER must be true.
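curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
$skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
$responseHeader = substr($data,0,$skip);
$data = substr($data,$skip);
// Keep cookies from earlier requests; only create the array the first time
if (!isset($cookies)){$cookies = array();}
$e = 0;
while(true){
  // stripos: header names are case-insensitive
  $s = stripos($responseHeader,'Set-Cookie: ',$e);
  if ($s === false){break;}
  $s += 12;
  // The name=value pair ends at the first ';' or at the end of the line
  $e = $s + strcspn($responseHeader,";\r\n",$s);
  $cookie = substr($responseHeader,$s,$e-$s);
  $s = strpos($cookie,'=');
  if ($s === false){continue;}
  $key = substr($cookie,0,$s);
  $value = substr($cookie,$s); // keeps the leading '='
  $cookies[$key] = $value;
}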
Then to use the $cookies[]:
$cookie = '';
$show = '';
$delim = '';
foreach ($cookies as $k => $v){
$cookie .= "$delim$k$v";
$delim = '; ';
}
Then use $cookie:
curl_setopt($ch, CURLOPT_COOKIE, $cookie );
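The same idea can be packaged as two functions so it can be reused after every request. This is a sketch under assumptions, not the exact routine above: collectCookies and buildCookieHeader are made-up names, values are stored without the leading '=', and cookie attributes such as Domain, Path and Expires are ignored entirely.

```php
<?php
// Hypothetical helper: merge every Set-Cookie name/value pair found in a raw
// response header block into $cookies (later values overwrite earlier ones).
function collectCookies($responseHeader, array $cookies = array())
{
    foreach (preg_split('/\r\n|\r|\n/', $responseHeader) as $line) {
        // Header names are case-insensitive
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only "name=value", dropping attributes such as Path or Expires
        $pair = trim(explode(';', substr($line, 11))[0]);
        $pos = strpos($pair, '=');
        if ($pos === false || $pos === 0) {
            continue; // malformed cookie without a name
        }
        $cookies[substr($pair, 0, $pos)] = substr($pair, $pos + 1);
    }
    return $cookies;
}

// Build the string for CURLOPT_COOKIE from the collected pairs.
function buildCookieHeader(array $cookies)
{
    $parts = array();
    foreach ($cookies as $name => $value) {
        $parts[] = $name . '=' . $value;
    }
    return implode('; ', $parts);
}
```

Typical use would be $cookies = collectCookies($responseHeader, $cookies); after each curl_exec(), then curl_setopt($ch, CURLOPT_COOKIE, buildCookieHeader($cookies)); before the next one. Because domains are ignored, keeping one array per site (for example one for amazon.com and one for amazon.co.uk) is probably the simplest way to stop cookies from leaking between them.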
Upvotes: 1