Reputation: 18337
I need to retrieve the HTML source of web pages, for example the www.google.com page. For that I can use file_get_contents or curl_init with PHP.
This is exactly what someone asked here before:
How do I get the HTML code of a web page in PHP?
But I need more than that: some of the pages require access.
I have been granted access and know the password.
(Let's say the page asks for the password with a form, and the password is "abcd".)
So how do I read those pages programmatically with PHP?
Update (the answer that worked for me):
I found the solution with curl_setopt, as suggested by Bekzat Abdiraimov below. Here is the detailed code, which I found elsewhere and modified:
<?php
/*
 * Logs in by POSTing the form data to $url, then fetches $ref_url
 * using the session cookie stored in cookie.txt.
 */
function curl_grab_page($url, $ref_url, $data, $login, $proxy, $proxystatus){
    if ($login == 'true') {
        // Truncate the cookie file so a stale session is not reused.
        $fp = fopen("cookie.txt", "w");
        fclose($fp);
    }
    // Use the visitor's user agent when there is one (it is not set on the command line).
    $agent = isset($_SERVER['HTTP_USER_AGENT'])
        ? $_SERVER['HTTP_USER_AGENT']
        : "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
    $ch = curl_init();
    // Save cookies received at login and send them back on later requests.
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 40);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    if ($proxystatus == 'true') {
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    // These two lines turn off SSL certificate checks; remove them if the site has a valid certificate.
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_REFERER, $ref_url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    // Step 1: submit the login form.
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    curl_exec($ch);
    // Step 2: fetch the protected page with a plain GET; the login cookie is sent automatically.
    curl_setopt($ch, CURLOPT_URL, $ref_url);
    curl_setopt($ch, CURLOPT_HTTPGET, TRUE);
    $page = curl_exec($ch);
    curl_close($ch);
    return $page;
}
/*
 * $auth_processing_url .. is the 'action' URL of the login form, e.g. for <form method=post action='http://www.abc.com/login.asp'> it should be: "http://www.abc.com/login.asp"
 * $url_to_go_after_login .. is the URL you want to go to (be redirected to) after login
 * $login_post_values .. are the input names the login form asks for. E.g. for a form with <input name="username" /><input name="password" /> it should be: "username=4lvin&password=mypasswd"
 */
echo curl_grab_page($auth_processing_url, $url_to_go_after_login, $login_post_values, "true", "null", "false");
?>
Upvotes: 1
Views: 669
Reputation: 110
Use cURL's curl_setopt(resource $ch, int $option, mixed $value) with:
option = CURLOPT_HTTPAUTH
value = the auth type to use (CURLAUTH_BASIC, ...)
http://www.php.net/manual/en/function.curl-setopt.php
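As a rough sketch of how this might look for HTTP Basic auth: the credentials themselves go in CURLOPT_USERPWD, alongside CURLOPT_HTTPAUTH. The helper names basic_auth_curl_options and fetch_with_basic_auth, the user name, and the URL in the usage comment are made up for illustration. This covers HTTP-level authentication only, not a password asked through an HTML form.

```php
<?php
// Hypothetical helper: builds the cURL options for an HTTP Basic auth request.
function basic_auth_curl_options($url, $user, $password) {
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,                     // return the body as a string
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,           // the auth type
        CURLOPT_USERPWD        => $user . ':' . $password,  // "user:password"
    );
}

// Hypothetical helper: fetches $url and returns the body, or false on failure.
function fetch_with_basic_auth($url, $user, $password) {
    $ch = curl_init();
    curl_setopt_array($ch, basic_auth_curl_options($url, $user, $password));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// Usage (placeholder URL and user name):
// echo fetch_with_basic_auth('http://www.abc.com/protected/', 'myuser', 'abcd');
```

Keeping the options in an array makes it easy to switch the auth type (for example to CURLAUTH_DIGEST) without touching the request code.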
Upvotes: 2
Reputation: 522461
It depends on the type of authentication required. If it's the widely used Basic Auth type, it's a trivial header added to the request. You can see the technical details well explained at Wikipedia. To add a header to the request using file_get_contents, use the $context parameter, the use of which is explained with an example here.
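A minimal sketch of that approach, assuming Basic auth: the header is the word "Basic" followed by the base64 encoding of "user:password". The helper names basic_auth_header and basic_auth_context, the user name, and the URL in the usage comment are illustrative, not from any library.

```php
<?php
// Hypothetical helper: builds the Authorization header line for Basic auth.
function basic_auth_header($user, $password) {
    return 'Authorization: Basic ' . base64_encode($user . ':' . $password);
}

// Hypothetical helper: wraps the header in a stream context for file_get_contents.
function basic_auth_context($user, $password) {
    return stream_context_create(array(
        'http' => array(
            'method' => 'GET',
            'header' => basic_auth_header($user, $password),
        ),
    ));
}

// Usage (placeholder URL and user name):
// $html = file_get_contents('http://www.abc.com/protected/', false, basic_auth_context('myuser', 'abcd'));
```

The context is passed as the third argument of file_get_contents; the second argument (use_include_path) stays false.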
Upvotes: 0
Reputation: 6394
Look at using a cookie jar.
When you first authenticate, the cookie that stores your authentication is lost (assuming you're not already using a cookie jar), so the next request you make won't know you have logged in.
As a result, you need to use a cookie jar to store the authentication cookie.
http://www.electrictoolbox.com/php-curl-cookies/
Upvotes: 0