Reputation: 45
I've scraped lots of sites, but one in particular isn't saving its cookies to my cookie file. Any ideas?
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 8200);
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 8200);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 8200);
$cookie_file = "cookies/zapper.txt";
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
if ($fields) { curl_setopt($ch, CURLOPT_POST, count($fields)); }
if ($fields) { curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string); }
This is the first site I've worked on that doesn't respond to my cookie saves. All the others use the same code and work perfectly. I've even emulated the POST of their forms and faked the headers in case it was checking those.
The site I'm trying to mimic an add to cart process for is http://zapper.co.uk/
Upvotes: 0
Views: 2306
Reputation: 2017
One possible solution comes directly from the php.net page for curl_setopt: a workaround that reads the cookie content from the header output instead of relying on the cookie file. It seems to be a good alternative.
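A minimal sketch of that kind of workaround, not the exact code from php.net: it pulls the name/value pairs out of the Set-Cookie lines in a raw header block. The helper name extractCookies and the sample header text are made up for illustration.

```php
<?php
// Hypothetical helper: collect name => value pairs from the Set-Cookie
// lines of a raw HTTP header block (what cURL returns in front of the
// body when CURLOPT_HEADER is enabled).
function extractCookies(string $rawHeaders): array
{
    $cookies = [];
    // One match per "Set-Cookie: name=value" line, case-insensitive.
    // Attributes after the first ";" (path, expires, ...) are ignored.
    preg_match_all(
        '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi',
        $rawHeaders,
        $matches,
        PREG_SET_ORDER
    );
    foreach ($matches as $m) {
        $cookies[$m[1]] = $m[2];
    }
    return $cookies;
}

// Sample header text standing in for a real response (assumed values).
$sample = "HTTP/1.1 200 OK\r\n"
        . "Set-Cookie: PHPSESSID=abc123; path=/\r\n"
        . "Set-Cookie: cart=42; HttpOnly\r\n"
        . "\r\n";
$cookies = extractCookies($sample);

// With a real handle, the header block could be split off like this:
// curl_setopt($ch, CURLOPT_HEADER, true);
// $response   = curl_exec($ch);
// $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
// $cookies    = extractCookies(substr($response, 0, $headerSize));
```

The collected pairs could then be sent back on the next request through CURLOPT_COOKIE as a "name=value; name2=value2" string. Note that this sketch keeps only the last value seen for each cookie name and does no domain or path matching.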
Also, you can get surprising results by modifying some of the options you pass to curl_setopt. Sometimes we set more options than needed.
I also recommend that you echo the whole response for $ch, that is, the value returned by curl_exec($ch) (it will print the page as the browser does). Sometimes the live result content shows a detailed error that is not present in the headers.
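As a rough sketch of that debugging step, assuming a plain GET is enough to reproduce the problem: the helper below (debugFetch is a made-up name, and the timeout values are only illustrative) runs the request and returns the body together with cURL's own error information, so either one can be echoed.

```php
<?php
// Hypothetical helper: run a request and gather what is useful for debugging.
function debugFetch(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, as in the question
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // seconds, illustrative value
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // seconds, illustrative value

    $body = curl_exec($ch); // string on success, false on failure

    return [
        'body'   => $body,
        'errno'  => curl_errno($ch),                        // 0 when no transport error
        'error'  => curl_error($ch),                        // human-readable message
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),  // 0 if no response arrived
    ];
}

// Usage: print the transport error if there is one, otherwise the page itself.
// $report = debugFetch('http://zapper.co.uk/');
// echo $report['body'] === false ? $report['error'] : $report['body'];
```

Comparing the echoed page with what the browser shows can reveal things like an error page or a JavaScript-driven form that the headers alone do not show.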
Upvotes: 1