夏期劇場

Reputation: 18337

How can I use PHP to retrieve the content of password-protected pages once I have the password?

I need to retrieve the HTML contents (source) of pages, for example the www.google.com page. For that I can use file_get_contents or curl_init with PHP.

This is exactly what someone asked here before:
How do I get the HTML code of a web page in PHP?

In my case there is more to it: some of the pages require access.
I have been granted access and know the password.

(Let's say the site asks for the password with a form, and the password is "abcd".)

So how do I read those pages programmatically with PHP?

Update (the answer that worked for me):
I found the solution using curl_setopt, as suggested by Bekzat Abdiraimov below. Here is the detailed code, which I found somewhere and modified:

<?php
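/**
 * POST $data to the login URL $url, then fetch $ref_url with the session cookies.
 * $login and $proxystatus are the strings 'true' or 'false'.
 */
function curl_grab_page($url, $ref_url, $data, $login, $proxy, $proxystatus){
    // Start from an empty cookie file when performing a fresh login.
    if ($login == 'true') {
        $fp = fopen("cookie.txt", "w");
        fclose($fp);
    }

    $ch = curl_init();

    // Save the cookies received at login and send them on later requests.
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_TIMEOUT, 40);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

    if ($proxystatus == 'true') {
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }

    // WARNING: these two lines disable certificate checks; remove them in production.
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $ref_url);

    // Return only the body, so the result is the page's HTML without response headers.
    curl_setopt($ch, CURLOPT_HEADER, FALSE);

    // Forward the visitor's user agent if there is one (it is not set on the command line).
    $agent = isset($_SERVER['HTTP_USER_AGENT'])
        ? $_SERVER['HTTP_USER_AGENT']
        : "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);

    // Step 1: submit the login form; the session cookie stays on this handle.
    curl_exec($ch);

    // Step 2: request the protected page with a plain GET instead of re-posting the login data.
    curl_setopt($ch, CURLOPT_URL, $ref_url);
    curl_setopt($ch, CURLOPT_HTTPGET, TRUE);

    $data = curl_exec($ch);

    curl_close($ch);
    return $data;
}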

/*
 * $auth_processing_url .. the 'action' URL that the login form posts to, e.g. for <form method=post action='http://www.abc.com/login.asp'> it is "http://www.abc.com/login.asp"
 * $url_to_go_after_login .. the URL of the page you want to fetch after logging in
 * $login_post_values .. the input names the login form asks for, with their values. E.g. for <input name="username" /><input name="password" /> it is "username=4lvin&password=mypasswd"
 */
echo curl_grab_page($auth_processing_url, $url_to_go_after_login, $login_post_values, "true",  "null", "false");
?>

Upvotes: 1

Views: 669

Answers (3)

Bekzat Abdiraimov

Reputation: 110

Use cURL's curl_setopt(resource $ch, int $option, mixed $value) with:

option = CURLOPT_HTTPAUTH
value = choose auth type (CURLAUTH_BASIC, ...)

http://www.php.net/manual/en/function.curl-setopt.php
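As a rough sketch of what this could look like for a page behind HTTP authentication (not a login form): the credentials go in CURLOPT_USERPWD, a cURL option this answer does not mention. The helper names, the example.com URL, and the user name are made up for illustration.

```php
<?php
// Hypothetical helper: builds the cURL options for a GET request with HTTP Basic auth.
function http_auth_options(string $url, string $user, string $password): array {
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,                     // return the body instead of printing it
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,           // auth type, as suggested above
        CURLOPT_USERPWD        => $user . ':' . $password,  // credentials as "user:password"
    ];
}

// Hypothetical helper: fetches the page; returns the HTML, or false on failure.
function fetch_with_http_auth(string $url, string $user, string $password) {
    $ch = curl_init();
    curl_setopt_array($ch, http_auth_options($url, $user, $password));
    // The handle is released automatically when this function returns.
    return curl_exec($ch);
}

// Usage (placeholder URL and credentials):
// $html = fetch_with_http_auth('http://www.example.com/protected/', 'me', 'abcd');
```

If the site asks for the password in an HTML form, as in the question, this option will most likely not help; the form has to be posted and the session cookie kept.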

Upvotes: 2

deceze

Reputation: 522461

It depends on the type of authentication required. If it's the widely used Basic Auth type, it's a trivial header added to the request. You can see the technical details well explained at Wikipedia. To add a header to the request using file_get_contents, use the $context parameter, the use of which is explained with an example here.
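A minimal sketch of the Basic Auth case with file_get_contents, assuming the server accepts a standard "Authorization: Basic" header. The helper name, the user name "me", and the example.com URL are placeholders, not anything from the question.

```php
<?php
// Hypothetical helper: stream context options carrying a Basic Auth header.
function basic_auth_context_options(string $user, string $password): array {
    return [
        'http' => [
            'method' => 'GET',
            // Basic Auth is base64("user:password") in an Authorization header.
            'header' => 'Authorization: Basic ' . base64_encode($user . ':' . $password) . "\r\n",
        ],
    ];
}

$context = stream_context_create(basic_auth_context_options('me', 'abcd'));

// Usage (placeholder URL); the third argument is the $context parameter:
// $html = file_get_contents('http://www.example.com/protected/', false, $context);
```

Base64 is an encoding, not encryption, so this should only be sent over HTTPS.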

Upvotes: 0

Gavin

Reputation: 6394

Look at using a Cookie Jar.

When you first authenticate, the cookie that stores your authentication is lost (assuming you're not already using a Cookie Jar), so the next request you make won't know that you have logged in.

As a result, you need to use a Cookie Jar to store the authentication cookie.

http://www.electrictoolbox.com/php-curl-cookies/
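To make that concrete, here is a small sketch with two separate requests sharing one cookie file. The helper names, URLs, and form field names are invented for the example; only the CURLOPT_* options are real cURL settings.

```php
<?php
// Hypothetical helper: options for one request that loads and saves cookies in $jar.
function jar_options(string $url, string $jar, ?string $postFields = null): array {
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $jar,  // cookies to send with this request
        CURLOPT_COOKIEJAR      => $jar,  // where received cookies are saved
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = $postFields;
    }
    return $options;
}

// Hypothetical helper: performs one request; returns the body, or false on failure.
function request_with_jar(string $url, string $jar, ?string $postFields = null) {
    $ch = curl_init();
    curl_setopt_array($ch, jar_options($url, $jar, $postFields));
    // cURL writes the jar when the handle is released, which happens on return.
    return curl_exec($ch);
}

// Usage (placeholder URLs and field names):
// $jar = tempnam(sys_get_temp_dir(), 'cookies');
// request_with_jar('http://www.example.com/login', $jar, 'username=me&password=abcd');
// $html = request_with_jar('http://www.example.com/members', $jar);
```

Without the two cookie options, the second request would arrive with no session cookie and the site would treat it as not logged in.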

Upvotes: 0
