Reputation: 1583
I have the following PHP cURL and regex code. I'd like to get the post headings from a website. There are actually 10 articles, but the code returns zero results.
PHP:
<?php
$ch = curl_init();
$url = "www.mahsumakbas.net";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$content = curl_exec($ch);
curl_close($ch);
@preg_match_all('/<h2 class="entry-title">(.*)<\/h2>/' ,$content, $matches);
for ($i=0; $i< sizeof($matches[1]); $i++)
echo $matches[1][$i]."<br/>";
?>
On the www.mahsumakbas.net page there are 10 <h2 class="entry-title"> elements,
each closed with </h2>.
What am I missing?
Upvotes: 0
Views: 55
Reputation: 11859
Try this:
$url = "www.mahsumakbas.net";
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)
$html = curl_exec($c);
curl_close($c);
preg_match_all("'<h2 class=\"entry-title\">(.*?)</h2>'si" ,$html, $matches);
foreach($matches[1] as $key=>$val)
echo $val."<br/>";
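The reason this pattern works where the original did not can be seen in a small sketch. It assumes each headline is spread over several lines; the sample markup below is made up for illustration and is not copied from the real page. Without the "s" modifier, "." stops at a newline, so nothing matches. With "s" and the lazy (.*?) each headline is captured on its own.

```php
<?php
// Hypothetical sample markup: each headline spans three lines.
$html = <<<HTML
<h2 class="entry-title">
<a href="/post-1">First post</a>
</h2>
<h2 class="entry-title">
<a href="/post-2">Second post</a>
</h2>
HTML;

// Without the "s" modifier "." does not match newlines, so nothing is found.
$withoutS = preg_match_all('/<h2 class="entry-title">(.*)<\/h2>/', $html, $m1);

// With "s" (dot matches newline) and a lazy quantifier, each headline matches separately.
$withS = preg_match_all('/<h2 class="entry-title">(.*?)<\/h2>/si', $html, $m2);

// Strip the inner tags and surrounding whitespace to get plain titles.
$titles = array_map(function ($s) {
    return trim(strip_tags($s));
}, $m2[1]);

echo $withoutS, " match(es) without the s modifier\n";
echo $withS, " match(es) with the s modifier\n";
print_r($titles);
```

The lazy quantifier matters as much as the modifier: with "s" and a greedy (.*), the capture would run from the first opening tag to the last </h2> on the page.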
Upvotes: 1
Reputation: 1796
Your headlines are built across 3 lines, so "." has to match newlines. You must set the "s" modifier for that (the "m" modifier only changes how ^ and $ behave). Maybe it helps.
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
But for parsing an HTML string you should use DOMDocument with getElementsByTagName.
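A minimal sketch of the DOMDocument approach, assuming the page looks roughly like the sample string below (the markup is hypothetical; in practice $html would be the string returned by curl_exec). It collects the text of every <h2> whose class attribute contains "entry-title".

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = <<<HTML
<html><body>
<h2 class="entry-title">
<a href="/post-1">First post</a>
</h2>
<h2 class="other">Not a headline</h2>
<h2 class="entry-title">
<a href="/post-2">Second post</a>
</h2>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$titles = [];
foreach ($doc->getElementsByTagName('h2') as $h2) {
    // Keep only <h2> elements whose class list contains "entry-title".
    $classes = preg_split('/\s+/', trim($h2->getAttribute('class')));
    if (in_array('entry-title', $classes, true)) {
        $titles[] = trim($h2->textContent);
    }
}

foreach ($titles as $title) {
    echo $title, "\n";
}
```

Unlike the regex, this does not depend on line breaks, attribute order, or extra classes on the <h2> tag.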
Upvotes: 0