Reputation: 215
I have test.php, and in test1.php I am running this PHP code:
<?php
$Text=file_get_contents("http://inviatapenet.gethost.ro/sop/test.php");
preg_match_all('~fid="(.*?)"~si',$Text,$Match);
$fid=$Match[1][1];
echo $fid;
?>
What I want is to get text from test.php.
It contains the JavaScript assignment fid='gty5etrf', and I need just the value of fid:
<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script><script type='text/javascript' src='http://www.reyhq.com/player.js'></script>
In test1.php I need to output only the value:
gty5etrf
What do I have to do?
Upvotes: 3
Views: 4041
Reputation: 89547
A short pattern:
$pattern = '~\bfid\s*=\s*["\']\K\w+~';
Or a long pattern:
$pattern = '~<script[^>]*>(?:[^f<]+|\Bf+|f(?!id\b)|<+(?!/script>))*+\bfid\s*=\s*(["\'])\K[^"\']+(?=\1)~';
Either way, get the result with:
preg_match($pattern, $Text, $match);
$fid = $match[0];
The short pattern finds sequences like:
fid='somechars
fid = "somechars
The long pattern does the same but also checks you are between script tags.
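To see what the short pattern returns, here is a minimal sketch. The two sample strings are assumptions made up for illustration (one per quote style), not the real content of test.php:

```php
<?php
// Short pattern from above: \K drops everything matched so far
// (fid, the equals sign, the opening quote) from the reported match,
// so $match[0] holds only the value.
$pattern = '~\bfid\s*=\s*["\']\K\w+~';

// Hypothetical sample inputs, one single-quoted and one double-quoted.
$samples = [
    "<script type='text/javascript'>fid='gty5etrf'; v_width=620;</script>",
    '<script>fid = "gty5etrf";</script>',
];

$found = [];
foreach ($samples as $text) {
    if (preg_match($pattern, $text, $match)) {
        $found[] = $match[0];
    }
}

echo implode("\n", $found), "\n";
```

Note that \w+ only covers letters, digits and underscores, so a value containing other characters would be cut short; the long pattern uses [^"\']+ instead.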
Using XPath:
$html = <<<'EOD'
<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script><script type='text/javascript' src='http://www.reyhq.com/player.js'></script>
EOD;
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$query = <<<'EOD'
substring-before(
substring-after(
//script[contains(., "fid='")],
"fid='"
),
"'"
)
EOD;
echo $xp->evaluate($query);
Upvotes: 0
Reputation: 2619
preg_match_all('/fid=\'([^\']+)\'/',$Text,$Match);
Your regex was wrong.
First, you were looking for fid="..." instead of fid='...'.
Second, with .* the regex can match characters beyond the end of the fid value, whereas [^']+ stops at the closing quote.
Here is the full code:
preg_match_all('/fid=\'([^\']+)\'/',$Text,$Match);
$fid=$Match[1][0];
echo $fid;
Upvotes: 2
Reputation: 21003
You could try the expression fid\=\'([^\']+)\'. The [^\']+ part cannot match past the closing quote, which gives the non-greedy behaviour you want in the correct way. Also, the original expression was wrong because it looked for double quotes instead of single quotes.
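A small sketch of the difference, using the snippet from the question as input. The greedy pattern below is only there for comparison; it is not the one from the question, which used a lazy .*? instead:

```php
<?php
// The snippet from the question, used as sample input.
$text = "<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script>"
      . "<script type='text/javascript' src='http://www.reyhq.com/player.js'></script>";

// Greedy .* keeps going and only gives back enough to end on the
// LAST single quote in the input.
preg_match("~fid='(.*)'~", $text, $greedyMatch);
$greedy = $greedyMatch[1];

// [^']+ cannot cross a single quote, so it ends at the closing quote.
preg_match("~fid='([^']+)'~", $text, $negatedMatch);
$negated = $negatedMatch[1];

echo "greedy:  $greedy\n";
echo "negated: $negated\n";
```

The second line of output is just gty5etrf, while the first one runs on into the following script tag.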
Upvotes: 2
Reputation: 20997
Matching a string inside '': '(?:[^\\']*|\\.)*'
Matching a string inside "": "(?:[^\\"]*|\\.)*"
Both of them (ignoring spaces): fid\s*=\s*('(?:[^\\']*|\\.)*'|"(?:[^\\"]*|\\.)*")
And escaped for PHP:
$regexp = '~fid\\s*=\\s*(\'(?:[^\\\\\']*|\\\\.)*\'|"(?:[^\\\\"]*|\\\\.)*")~';
This correctly handles even an escaped quote inside the value (note that the captured group includes the surrounding quotes):
fid = 'foo\'s bar';
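As a rough sketch of how the captured group could be turned into the bare value: the input string below is an assumption for illustration, and stripslashes() is a simplification that just drops the backslashes rather than interpreting JavaScript escapes such as \n.

```php
<?php
// Pattern from above, written as a single-quoted PHP string.
$regexp = '~fid\\s*=\\s*(\'(?:[^\\\\\']*|\\\\.)*\'|"(?:[^\\\\"]*|\\\\.)*")~';

// Hypothetical input: the value contains an escaped single quote (\').
$js = "fid = 'foo\\'s bar'; v_width=620;";

$value = null;
if (preg_match($regexp, $js, $m)) {
    // $m[1] still has the surrounding quotes: cut off the first and
    // last character, then remove the escaping backslashes.
    $value = stripslashes(substr($m[1], 1, -1));
}

echo $value, "\n";
```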
Upvotes: 0
Reputation: 563
Also, this should be:
$fid=$Match[1][0];
instead of:
$fid=$Match[1][1];
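The reason is the layout of the array that preg_match_all fills in. A minimal sketch, using a shortened copy of the snippet from the question and the single-quote pattern from the other answers:

```php
<?php
$text = "<script type='text/javascript'>fid='gty5etrf'; v_width=620; v_height=490;</script>";

preg_match_all('/fid=\'([^\']+)\'/', $text, $Match);

// With the default flag (PREG_PATTERN_ORDER):
//   $Match[0] is the list of full matches
//   $Match[1] is the list of captures for the first group
// Both lists start at index 0, so the first capture is $Match[1][0].
$first = $Match[1][0];

// There is only one fid in the input, so index 1 does not exist.
$hasSecond = isset($Match[1][1]);

echo $first, "\n";
```

$Match[1][1] would only be set if the page contained a second fid assignment.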
Upvotes: 0