aullah

Reputation: 1384

How to use cURL to fetch text

I'm trying to fetch text from another URL using cURL. The text lives in an otherwise blank HTML document that is generated dynamically, so there are no HTML tags to filter out. This is what I've got so far:

$c = curl_init('http://url.com/dataid='.$_POST['username']);
// curl_setopt() takes the cURL handle as its first argument
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_FRESH_CONNECT, true);

$html = curl_exec($c);

if (curl_error($c))
die(curl_error($c));

// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);

This works perfectly. However, at the end of the dynamic HTML document there is some unwanted text, "#endofscript" (without the quotation marks), and it gets fetched along with everything else. How can I avoid grabbing it? I've looked at "strpos" and similar functions, but I'm unsure how to integrate them with cURL.

Any help would be appreciated. :)

EDIT: The code I'm currently using:

<?php

$homepage = file_get_contents('http://stackoverflow.com/');

// Drop the last 12 characters, the length of "#endofscript"
$result = substr($homepage, 0, -12);

echo $result;

?>

Upvotes: 1

Views: 5685

Answers (4)

Kuchen

Reputation: 476

You could use preg_replace() to remove all lines starting with a "#" for example:

$res = preg_replace('/^#.*$[\\r\\n]*/m','',$dat);

or just

'/#endofscript$/'

to match the thingie at the end.

substr/str_replace/some other string-functions will work as well.


Some example code showing how to implement the substr/preg_replace methods:

<pre><?php

$dat = 'Lorem ipsum dolor sit amet,
        consectetur adipisicing 
        elit #endofscript';

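// either: strip the last 12 characters only if they are the marker
$res = $dat; // fall back to the unchanged text so $res is always defined
if (substr($dat, -12) === '#endofscript')
    $res = substr($dat, 0, -12);

var_dump($res);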

// or
$res = preg_replace('/#endofscript$/','',$dat);
var_dump($res);

?></pre>
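To tie this back to the cURL code in the question: with CURLOPT_RETURNTRANSFER set, curl_exec() returns the response body as an ordinary string, so the stripping simply happens after the fetch. Here is a rough sketch of that. The helper names fetch_body() and strip_end_marker() are made up for illustration, and I'm assuming the marker may be followed by trailing whitespace or a newline:

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the body as a string.
function fetch_body(string $url): string
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($c);
    if ($body === false) {
        throw new RuntimeException(curl_error($c));
    }

    // No explicit curl_close(): on PHP 8+ the handle is freed when $c goes out of scope.
    return $body;
}

// Hypothetical helper: remove "#endofscript" only when it is the last thing
// in the text (optionally followed by whitespace). \z anchors at the very end.
function strip_end_marker(string $text): string
{
    return preg_replace('/#endofscript\s*\z/', '', $text);
}
```

Usage would then be something like `echo strip_end_marker(fetch_body($url));`, where `$url` is whatever address you are fetching from.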

Upvotes: 1

Poni

Reputation: 11317

Since you're saying that this bad text might be appended to the output, you could use something like the code below (wrap it in a function to make it easier to reuse):

<?php
define("bad_text", "#endofscript");

$feed_text = "here is some text#endofscript";
$bExist = false;
if(strlen($feed_text) >= strlen(constant("bad_text")))
{
    $end_of_text = substr($feed_text, strlen($feed_text) - strlen(constant("bad_text")));
    $bExist = strcmp($end_of_text, constant("bad_text")) == 0;
}

if($bExist)
    $final_text = substr($feed_text, 0, strlen($feed_text) - strlen(constant("bad_text")));
else
    $final_text = $feed_text;

echo $final_text;
?>
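Wrapped in a function, the same check could look roughly like this. The name remove_suffix() is just something I picked for the example; it returns the text unchanged when the suffix is not at the end:

```php
<?php
// Hypothetical helper: return $text without $suffix if $text ends with it,
// otherwise return $text unchanged.
function remove_suffix(string $text, string $suffix): string
{
    $len = strlen($suffix);

    // substr() with a negative start reads the last $len bytes of $text
    if ($len > 0 && substr($text, -$len) === $suffix) {
        return substr($text, 0, -$len);
    }

    return $text;
}

echo remove_suffix("here is some text#endofscript", "#endofscript"), "\n";
echo remove_suffix("no marker here", "#endofscript"), "\n";
```

The first call prints the text without the marker and the second prints its input as-is, so it is safe to call on output that may or may not carry the marker.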

Upvotes: 1

aullah

Reputation: 1384

Thank you all for your help, I can't say how much I appreciate it! Using the script given by GOsha, I modified it so that it removes the text at the end. The code I used is below:

<?php

$homepage = file_get_contents('http://url.com/dataid='.$_POST['username']);

// Drop the last 12 characters, the length of "#endofscript"
$rest = substr($homepage, 0, -12);
echo $rest;

?>

This has now been answered. Thank you all, I am very thankful for all your responses. :)

Upvotes: 0

GOsha

Reputation: 689

Why not simply use file_get_contents()?

<?php
$homepage = file_get_contents('http://www.example.com/');
echo $homepage;
?>

http://php.net/manual/en/function.file-get-contents.php

Upvotes: 2
