Tooti Tooti

Reputation: 109

PHP: my regex works on regex101 but not in my local code

I use regex101 to test my regex.

This is my regex:

<a href="/name/nm0000130/\?ref_=ttfc_fc_cr8">(.*)</a>

And this is the HTML I am matching against:

<tr>
  <td class="name">
    <a href="/name/nm0000130/?ref_=ttfc_fc_cr8"> Jamie Lee Curtis
    </a>
  </td>
  <td>...</td>
  <td class="credit">
    executive producer
  </td>
</tr>

This works fine on regex101, but when I fetch the data with file_get_contents and apply the same regex in PHP, it does not match.

I am sure the data loads completely.

My PHP code:

$data = file_get_contents('https://www.imdb.com/title/tt'.$tt.'/fullcredits', false, stream_context_create($contextOption));
preg_match_all('~<a href="/name/nm0000130/\?ref_=ttfc_fc_cr8">(.*)</a>~isU', $data, $return);

My other regexes for this page work fine, but this one does not.

My full code:

$contextOption = array("ssl" => array("verify_peer" => false,
                                      "verify_peer_name" => false,
                                      "allow_self_signed" => true));

$data = file_get_contents('https://www.imdb.com/title/tt1502407/fullcredits', false, stream_context_create($contextOption));
preg_match_all('~<a href="/name/nm0000130/.ref_=ttfc_fc_cr8"(.*)</a>~isU', $data, $return);

Upvotes: 0

Views: 113

Answers (1)

You Old Fool

Reputation: 22960

If you want to parse HTML, don't use a regex. Instead, use DOMDocument or another tool made for the job.

Here's a basic example of how you could approach the same thing using the DOMXPath class:

// get the html
$contextOption = ["ssl" => ["verify_peer" => false, "verify_peer_name" => false, "allow_self_signed" => true]];
$data = file_get_contents('https://www.imdb.com/title/tt1502407/fullcredits', false, stream_context_create($contextOption));

// load the html into DOMDocument (@ suppresses warnings from malformed markup)
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);

// get the anchor tag whose href matches exactly
$anchor = $xpath->query('//a[@href="/name/nm0000130/?ref_=ttfc_fc_cl_t1"]');

echo $anchor->item(0)->textContent;

OUTPUT:

Jamie Lee Curtis
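
Note that the href in the query above ends in a different ref_ value than the one in the question's regex, which suggests the fetched markup is not identical to what was pasted into regex101. A minimal sketch, assuming only the /name/nm0000130/ part of the href is stable, is to match on that prefix with XPath's starts-with(). The inline $html string below is just the sample from the question wrapped in a table, so the snippet runs without a network call:

```php
<?php
// Sample markup taken from the question; in real use this would be the
// string returned by file_get_contents().
$html = <<<'HTML'
<table>
<tr>
  <td class="name">
    <a href="/name/nm0000130/?ref_=ttfc_fc_cr8"> Jamie Lee Curtis
    </a>
  </td>
  <td>...</td>
  <td class="credit">
    executive producer
  </td>
</tr>
</table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match on the stable prefix of the href and ignore the ref_ query string.
$anchors = $xpath->query('//a[starts-with(@href, "/name/nm0000130/")]');

// textContent includes the surrounding whitespace, so trim it.
$name = $anchors->length > 0 ? trim($anchors->item(0)->textContent) : null;

echo $name, PHP_EOL;
```

Checking $anchors->length before calling item(0) avoids an error when nothing matches, which is what happens when the href on the fetched page is not the one you expected.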

Upvotes: 2
