Reputation: 21007
I have a simple regex which checks an entire string for a function declaration. So in this code:
public function Test($name)
{
echo 'In test';
}
It will find the first part:
function Test($name)
{
And it replaces that with a custom piece:
function Test($name)
{
echo 'New piece';
Which eventually makes my code look like this:
public function Test($name)
{
echo 'New piece';
echo 'In test';
}
This all works perfectly fine with this regex:
preg_match_all ( '/function(.*?)\{/s', $source, $matches )
The problem is that I want the regex to ignore everything inside a script tag. So for this source:
public function Test($name) //<--- Match found!
{
echo 'In test';
}
<script type="text/javascript"> //<--- Script tag found, dont do any matches!
$(function() {
function Test()
{
var bla = "In js";
}
});
</script> //<--- Closed tag, start searching for matches again.
public function Test($name) //<--- Match found!
{
echo 'In test';
}
How can I do this in my regex?
Upvotes: 2
Views: 944
Reputation: 10269
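No amount of regex is going to give you a decent fail-proof solution here.
The right way to do this is with the PHP tokenizer (token_get_all). It reports everything outside the PHP tags, including the script block, as inline HTML, so only real PHP functions are found.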
<?php
$code = <<<END
<?php
public function Test(\$name) //<--- Match found!
{
echo 'In test';
}
?>
<script type="text/javascript"> //<--- Script tag found, dont do any matches!
$(function() {
function Test()
{
var bla = "In js";
}
});
</script> //<--- Closed tag, start searching for matches again.
<?php
public function Bla(\$name) //<--- Match found!
{
echo 'In test';
}
END;
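function injectCodeAtFunctionsStart ($originalCode, $code)
{
    $tokens = token_get_all ($originalCode);
    $total = count ($tokens);
    // must be an array: appending with [] to a string is an error in PHP 7.1 and later
    $newTokenTree = array ();
    // iterate tokens
    for ($i = 0; $i < $total; $i++)
    {
        $node = $tokens[$i];
        $newTokenTree[] = $node;
        // function start; JavaScript arrives as T_INLINE_HTML and never gets past this check
        if (!is_array ($node) || $node[0] != T_FUNCTION)
        {
            continue;
        }
        // walk to first brace, copying it too; a ';' means there is no body
        // (abstract method, interface method, "use function")
        while (++$i < $total)
        {
            $newTokenTree[] = $tokens[$i];
            if ($tokens[$i] === '{' || $tokens[$i] === ';')
            {
                break;
            }
        }
        if ($i >= $total || $tokens[$i] !== '{')
        {
            continue;
        }
        // keep space, if the brace is followed by any
        $space = '';
        if ($i + 1 < $total && is_array ($tokens[$i + 1]) && $tokens[$i + 1][0] == T_WHITESPACE)
        {
            $space = $tokens[++$i];
            $newTokenTree[] = $space;
        }
        // add new piece
        $newTokenTree[] = $code;
        $newTokenTree[] = $space;
    }
    // rebuild code from tokens
    $content = '';
    foreach ($newTokenTree as $node)
    {
        $content .= is_scalar ($node) ? $node : $node[1];
    }
    return $content;
}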
echo injectCodeAtFunctionsStart ($code, 'echo "new piece";');
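To see why the tokenizer leaves the JavaScript alone, it can help to dump the token names. This is only a minimal sketch: tokenNames and $sample are made-up names for illustration, and the sample assumes the script block sits after a closing ?> tag, as in the code above.

```php
<?php
// Hypothetical helper: lists token names so the PHP/HTML split becomes visible.
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Array tokens carry an id; single characters such as '{' are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// Assumed sample: one PHP function, then a script block outside the PHP tags.
$sample = "<?php function Test() { } ?>\n<script>function Test() { }</script>\n";
$names = tokenNames($sample);

// Everything after "?>" comes back as T_INLINE_HTML,
// so only the PHP declaration produces a T_FUNCTION token.
echo count(array_keys($names, 'T_FUNCTION')), "\n";
echo in_array('T_INLINE_HTML', $names, true) ? "script is inline HTML\n" : "no inline HTML\n";
```

If the script block were echoed from inside a PHP string instead, it would show up as a string token rather than inline HTML, which the function above also ignores.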
Upvotes: 0
Reputation: 8550
As mentioned in the comments:
If your PHP functions always have a visibility modifier like public, you could do:
(?:public|protected|private)\s+function\s+\w+\(.*?\)\s*\{
Otherwise, you could strip the script part first. Something like:
$text = preg_replace('/<script(?:(?!<\/script>).)*<\/script>/s','',$text);
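Stripping the script block changes the source, which may not be what you want if the result is written back to a file. One alternative is to let the pattern consume each script block first and hand it back untouched. The sketch below assumes every script block is closed with </script>; injectSkippingScripts is a made-up name, and the function pattern is a slightly tightened variant of the one in the question.

```php
<?php
// Hypothetical helper: appends $piece after each "function ... {" found outside <script> blocks.
function injectSkippingScripts(string $source, string $piece): string
{
    // First alternative swallows a whole script block; the second matches a declaration up to its brace.
    $pattern = '/<script\b.*?<\/script>|function\b[^{;]*\{/s';

    return preg_replace_callback($pattern, function (array $m) use ($piece) {
        // Script blocks are returned unchanged.
        if (stripos($m[0], '<script') === 0) {
            return $m[0];
        }
        return $m[0] . "\n" . $piece;
    }, $source);
}

// Assumed sample input, modelled on the question.
$source = "public function Test(\$name)\n{\necho 'In test';\n}\n"
        . "<script type=\"text/javascript\">\nfunction Test()\n{\n}\n</script>\n";

echo injectSkippingScripts($source, "echo 'New piece';");
```

Because the regex engine tries alternatives at each position from left to right, a function inside a script block is never reached: the block has already been consumed as a whole.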
Upvotes: 1
Reputation: 8350
I don't know PHP, but I know regex:
Your original regex is fragile, since it also matches things like:
// This is a functional comment { isn't it? }
             ^^^^^^^^...........^
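Making it more robust may solve your problem:
^\s*(public|protected|private)\s+function\s+\w+\s*\(.*?\).*?\{
Note the \w+ for the function name, and use the m modifier so that ^ matches at the start of every line. Since the JavaScript functions carry no visibility modifier, they are skipped. This covers most function declarations, although some unusual cases can still fool it.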
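A minimal sketch of running such a pattern from PHP, assuming every PHP declaration starts its own line with a visibility modifier. The pattern used here expects a function name (\w+) between function and the opening parenthesis, and $source is just an assumed sample modelled on the question.

```php
<?php
// Assumed sample: one PHP method followed by a JavaScript block.
$source = "public function Test(\$name)\n{\necho 'In test';\n}\n"
        . "<script type=\"text/javascript\">\n\$(function() {\nfunction Test()\n{\n}\n});\n</script>\n";

// m: ^ matches at every line start; s: . also matches the newline before the brace.
$pattern = '/^\s*(?:public|protected|private)\s+function\s+\w+\s*\(.*?\).*?\{/ms';

preg_match_all($pattern, $source, $matches);

// Only the PHP declaration carries a visibility modifier,
// so the JavaScript functions are not matched.
print_r($matches[0]);
```

Declarations such as public static function, or functions declared outside a class, would need the pattern extended.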
Upvotes: 1