Portaltv Romania

Reputation: 215

preg_match get text

I have two files, test.php and test1.php. test1.php runs this PHP code:

<?php 
$Text=file_get_contents("http://inviatapenet.gethost.ro/sop/test.php");
 preg_match_all('~fid="(.*?)"~si',$Text,$Match);
 $fid=$Match[1][1];
 echo $fid;
?>

What I want to do is extract some text from the output of test.php.

It contains the JavaScript assignment fid='gty5etrf', and I need just the value of fid:

<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script><script type='text/javascript' src='http://www.reyhq.com/player.js'></script>

In test1.php I need to show only that value:

gty5etrf

What do I have to do?

Upvotes: 3

Views: 4041

Answers (5)

Casimir et Hippolyte

Reputation: 89547

a short pattern:

$pattern = '~\bfid\s*=\s*["\']\K\w+~';

or a long pattern:

$pattern = '~<script[^>]*>(?:[^f<]+|\Bf+|f(?!id\b)|<+(?!/script>))*+\bfid\s*=\s*(["\'])\K[^"\']+(?=\1)~';

Get the result with:

preg_match($pattern, $Text, $match);
$fid = $match[0];

The short pattern finds sequences like:

fid='somechars
fid  = "somechars

The long pattern does the same but also checks you are between script tags.
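
To see the short pattern in action, here is a minimal sketch run against the markup from the question. The inline $Text string stands in for the file_get_contents() result, and the null fallback is my own assumption about how to handle a missing fid.

```php
<?php
// Sample markup copied from the question (stands in for the fetched page).
$Text = <<<'EOD'
<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script>
EOD;

// \K drops everything matched so far, so $match[0] is only the value.
$pattern = '~\bfid\s*=\s*["\']\K\w+~';

// preg_match() returns 1 on a match, 0 on no match, false on error.
if (preg_match($pattern, $Text, $match) === 1) {
    $fid = $match[0];
} else {
    $fid = null; // assumption: treat "not found" as null
}

echo $fid, "\n";
```

Because of \K there is no capture group to index, which is why the value is read from $match[0] rather than $match[1].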


Using XPath:

$html = <<<'EOD'
<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script><script type='text/javascript' src='http://www.reyhq.com/player.js'></script>
EOD;

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$query = <<<'EOD'
    substring-before(
        substring-after(
            //script[contains(., "fid='")],
            "fid='"
        ),
        "'"
    )
EOD;

echo $xp->evaluate($query);

Upvotes: 0

ibi0tux

Reputation: 2619

 preg_match_all('/fid=\'([^\']+)\'/',$Text,$Match);

Your regex was wrong. First, you were looking for fid="..." instead of fid='...'. Second, a greedy .* would keep matching past the end of the fid value, whereas the negated class [^']+ stops at the closing quote.

Here is the full code:

preg_match_all('/fid=\'([^\']+)\'/',$Text,$Match);
$fid=$Match[1][0];
echo $fid;

Upvotes: 2

bizzehdee

Reputation: 21003

You could try the expression fid\=\'([^\']+)\'. The negated class [^\']+ cannot run past the closing quote, so the match stops where it should. Also, the original expression was wrong because it was looking for double quotes instead of single quotes.
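
A small sketch of the difference, using a hypothetical input with a second quoted value on the same line (the "other" variable is made up for illustration):

```php
<?php
// Hypothetical input: two quoted values on one line.
$js = "fid='gty5etrf'; other='x';";

// Greedy .* runs to the LAST quote on the line.
preg_match("~fid='(.*)'~", $js, $greedy);

// A negated class cannot cross a quote, so it stops at the first one.
preg_match("~fid='([^']+)'~", $js, $negated);

// A lazy .*? also stops at the first closing quote.
preg_match("~fid='(.*?)'~", $js, $lazy);

echo $greedy[1], "\n";   // gty5etrf'; other='x
echo $negated[1], "\n";  // gty5etrf
echo $lazy[1], "\n";     // gty5etrf
```

On this input the lazy and negated-class versions give the same result; the negated class just states the intent more directly.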

Upvotes: 2

Vyktor

Reputation: 20997

Matching string inside '': '(?:[^\\']*|\\.)*'

Matching string inside "": "(?:[^\\"]*|\\.)*"

Both of them (ignoring spaces): fid\s*=\s*('(?:[^\\']*|\\.)*'|"(?:[^\\"]*|\\.)*")

And escaped for php:

$regexp = '~fid\\s*=\\s*(\'(?:[^\\\\\']*|\\\\.)*\'|"(?:[^\\\\"]*|\\\\.)*")~';

This correctly handles even an escaped quote like this:

fid  = 'foo\'s bar';
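
As a quick check, here is a minimal sketch that runs the PHP-escaped pattern against that input. Note that the capture group includes the surrounding quotes. Removing them with substr() and unescaping with stripslashes() is my own assumption, not part of the pattern, and it will not decode JavaScript escapes such as \n.

```php
<?php
$regexp = '~fid\\s*=\\s*(\'(?:[^\\\\\']*|\\\\.)*\'|"(?:[^\\\\"]*|\\\\.)*")~';

// Nowdoc keeps the backslash in the input exactly as written.
$js = <<<'EOD'
fid  = 'foo\'s bar';
EOD;

$value = null;
if (preg_match($regexp, $js, $m) === 1) {
    $quoted = $m[1];                                // 'foo\'s bar' (quotes included)
    $value  = stripslashes(substr($quoted, 1, -1)); // foo's bar
}

echo $value, "\n";
```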

Upvotes: 0

Tounu

Reputation: 563

Also, the index is wrong. The first match is at index 0, so this should be

$fid=$Match[1][0];

instead of:

$fid=$Match[1][1];
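
To make the indexing concrete, here is a rough sketch of what preg_match_all() fills in for the sample markup, using its default PREG_PATTERN_ORDER layout. The inline $Text string and the ?? fallback (PHP 7+) are my additions for illustration.

```php
<?php
// Sample markup from the question, inlined instead of fetched.
$Text = "<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script>";

$count = preg_match_all('/fid=\'([^\']+)\'/', $Text, $Match);

// $Match[0] holds the full matches, $Match[1] the first capture group.
// fid occurs once, so only index 0 exists in each of them.
$first  = $Match[1][0];
$second = $Match[1][1] ?? null; // null: there is no second match

echo $count, "\n";
echo $first, "\n";
var_dump($second);
```

So $Match[1][1] would only be set if the page contained a second fid assignment.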

Upvotes: 0
