William

Reputation: 19

PHP - manage curl output

Following up on my last question: I send a request to a website and it returns output, but the output is the full website. I only want to get some of the data from the cURL output, such as the links.

$url = 'http://site1.com/index.php';
$data = ["send" => "Test"];
$ch = curl_init($url);

curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);
var_dump($response);

This code gives me what I want, but the output contains the full website. I just want to extract some of the data and show only that in the output.

Upvotes: 0

Views: 540

Answers (1)

S. Imp

Reputation: 2877

You can use preg_match_all with a carefully constructed pattern. This modified version of your code should give you a list of all the image URLs in the HTML that you retrieve:

        $url = 'http://site1.com/index.php';
        $data = ["send" => "Test"];
        $ch = curl_init($url);

        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        $response = curl_exec($ch);
        curl_close($ch);

        // After the call, $matches[1] holds the captured src value of every <img> tag
        $matches = NULL;
        $pattern = '/<img[^>]+src=\"([^"]+)"[^>]*>/';
        $img_count = preg_match_all($pattern, $response, $matches);

        var_dump($matches[1]);

If you'd like to fetch all the links instead, you can change $pattern to this:

        $pattern = '/<a[^>]+href=\"([^"]+)"[^>]*>/';

I have tested this code on an HTML file that looks like this:

<html>
<body>
<div><img src="WANT-THIS"></div>
</body>
</html>

And the output is this:

array(1) {
  [0]=>
  string(9) "WANT-THIS"
}
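
The link pattern can be checked the same way without making a request. The following is a minimal sketch, assuming a hard-coded sample string in place of the cURL response; the sample markup and the `$links` and `$link_count` names are made up for illustration. Note that the pattern only matches href values wrapped in double quotes.

```php
<?php
// Hypothetical sample markup standing in for the cURL response.
$response = '<html><body>'
    . '<a href="http://site1.com/page1.php">One</a>'
    . '<a class="nav" href="/page2.php">Two</a>'
    . '</body></html>';

// Same link pattern as above; the first capture group is the href value.
$pattern = '/<a[^>]+href=\"([^"]+)"[^>]*>/';
$matches = [];
$link_count = preg_match_all($pattern, $response, $matches);

// $matches[0] holds the full tags, $matches[1] holds only the captured URLs.
$links = $matches[1];
print_r($links);
```

Here `$link_count` is the number of full pattern matches, so it can be used to detect the case where no link was found.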

EDIT 2: In response to additional questions from the OP, I have also tried the script on this HTML file:

<html>
<body>
<div1>CODE</div><div2>CODE</div><div3>CODE</div><div4>CODE</div><div5>CODE</div><div6>CODE</div><img src="IMAGE">
</body>
</html>

And it produces this result:

array(1) {
  [0]=>
  string(5) "IMAGE"
}

If this doesn't solve your problem, you'll need to provide more detail: an example URL that you are fetching, some of the HTML that you want to search, or how we can tell which image in the HTML you want to grab. Does it have a special id? Is it always the first image? The second image? Is there any characteristic that identifies it?
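
Once you know which image you want, a DOM parser may be easier to target than a regular expression. This is a different technique from the preg_match_all approach above: a rough sketch using PHP's DOMDocument and DOMXPath classes (it requires the dom extension), assuming a hard-coded sample string and a hypothetical id of "logo" that is not taken from your page.

```php
<?php
// Hypothetical markup; the id "logo" is an assumption for illustration only.
$response = '<html><body>'
    . '<div><img src="FIRST"></div>'
    . '<div><img id="logo" src="WANT-THIS"></div>'
    . '</body></html>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($response);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The first image in document order.
$first = $xpath->query('//img')->item(0);
$first_src = $first ? $first->getAttribute('src') : null;

// The image that carries a specific id.
$by_id = $xpath->query('//img[@id="logo"]')->item(0);
$by_id_src = $by_id ? $by_id->getAttribute('src') : null;

var_dump($first_src, $by_id_src);
```

Both lookups fall back to null when nothing matches, so the caller can tell a missing image apart from an empty src.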

Upvotes: 1
