developer

Reputation: 2050

How do I parse a page that responds with a 302 header?

I have to parse a page in PHP. The page's URL responds with a 302 Moved Temporarily header and redirects to a "not found" page. Its data can be retrieved manually through the console of the Firebug add-on for Mozilla Firefox, but if I try to parse it with PHP, I get the "not found" page in return. How can I parse that page?

Edit: I am doing something like this to get the page's content:

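$file_results = @fopen("http://www.the url to be parsed", "rb");
$parsed_results = '';
if ($file_results) {
    // read in chunks until the end of the stream; fread() expects an integer length
    while (!feof($file_results)) {
        $parsed_results .= fread($file_results, 125000);
    }
    fclose($file_results);
}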

Upvotes: 0

Views: 1359

Answers (2)

Owen

Reputation: 84493

You can use get_headers() to find all the headers while you're being redirected.

$url = 'http://google.com';
$headers = get_headers($url, 1);

print 'First step gave: ' . $headers[0] . '<br />';

// uncomment below to see the different redirection URLs
// print_r($headers['Location']);

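// $headers['Location'] will contain either the redirect URL, or an array
// of redirection URLs (check with is_array(): isset($string[0]) is also
// true for a plain string and would return only its first character)
$first_redirect_url = is_array($headers['Location']) ?
    $headers['Location'][0] : $headers['Location'];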

print "First redirection is to: {$first_redirect_url}<br />";

// assuming you have fopen wrappers enabled...
print file_get_contents($first_redirect_url);

Then just keep following the redirects until you reach the resource you want.
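As a rough sketch of picking the last hop out of the get_headers() result: when called with 1 as the second argument, get_headers() follows redirects by default and collects repeated Location headers into an array. The helper name last_redirect_target() is hypothetical (it is not part of PHP), and the sketch assumes the server sends absolute URLs in Location.

```php
<?php
// Hypothetical helper (not a PHP built-in): return the last redirect target
// found in the array produced by get_headers($url, 1), or null if the
// response chain contained no Location header at all.
function last_redirect_target(array $headers): ?string
{
    // Header names are case-insensitive, so normalise the string keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['location'])) {
        return null;
    }
    $location = $headers['location'];
    // Several redirects give an array of URLs; a single redirect gives a string.
    return is_array($location) ? end($location) : $location;
}
```

It could be used as `$headers = get_headers($url, 1);` followed by `$target = $headers ? last_redirect_target($headers) : null;`, since get_headers() returns false on failure. If the target turns out to be the "not found" page, following redirects will not help and the original 302 response itself has to be inspected.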

Upvotes: 1

captncraig

Reputation: 23068

You need to read the header, see where it is redirecting you, and make another request to get the actual resource. It's kind of a pain, but that's how the protocol works. Most browsers do this transparently.
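A minimal sketch of doing that by hand, assuming the http:// stream wrapper is enabled (allow_url_fopen). The 'follow_location' and 'ignore_errors' context options are standard PHP HTTP context options; the function names find_header() and fetch_without_redirect() are made up for this example.

```php
<?php
// Hypothetical helper: find one header value (case-insensitive name) in a
// list of raw header lines such as "Location: http://example.com/".
function find_header(array $lines, string $name): ?string
{
    foreach ($lines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Hypothetical helper: request one URL WITHOUT following redirects and
// return [raw header lines, body] of that single response.
function fetch_without_redirect(string $url): array
{
    $context = stream_context_create(['http' => [
        'follow_location' => 0,    // do not chase the Location header
        'ignore_errors'   => true, // still hand back the body on 4xx/5xx
    ]]);
    $handle = fopen($url, 'rb', false, $context);
    if ($handle === false) {
        return [[], ''];
    }
    // For http:// streams, wrapper_data holds the raw response header lines.
    $meta = stream_get_meta_data($handle);
    $body = stream_get_contents($handle);
    fclose($handle);
    return [$meta['wrapper_data'] ?? [], (string) $body];
}
```

With these, `[$lines, $body] = fetch_without_redirect($url);` gives the body of the 302 response itself, and `find_header($lines, 'Location')` tells you where the next request would go. Repeating the two calls in a loop with a small hop limit would follow the chain step by step.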

Upvotes: 0
