Reputation: 133
I'm working on some simple pattern matching. The code below more or less returns what I want, but not in the format I want. For example, it outputs:
one() two() three()
one
Rather than what I want: one();two();three()
I'm not sure why it ignores implode's ; delimiter when returning the result, nor why it includes the word 'one' on the next line. It should only match words in its list, with the added check that the word is followed by (). That is, it should leave bare words like 'one', 'two', 'three' alone, yet match 'one()' or 'one(somestuff)'.
I used # as the regex delimiter in preg_match_all, and I put a newline in the echo of $str so that it wouldn't create scrollbars on this website. The echoes that format the result won't be in the final code; I was only using them to see what the code returns.
Any help with fixing this is much appreciated. I'm probably just being a bit daft.
Regarding 'Edit 2', in case it helps: I am trying to figure out how to catch commands and/or functions that may be passed to eval. I want a regex that scans for 'function val1(val2)' or 'val1(val2)' and that also tolerates whitespace between 'val1' and '(val2)' when deciding whether it is an actual command/function, since you can write either 'die("died")' or 'die ("died")'. I'm still playing around and reading, trying to figure this out for myself.
'Edit 2' matches most of the words but does not catch all of them. I also don't want ) to be optional in the expression: it should match 'val1(val2)' and 'val1 (val2)', but not 'val1(val2' or 'val1 (val2'. Yet when I remove the ?, the result goes back to being formatted without the ; delimiter I want.
I figured that first checking whether any potential command exists at all would be faster than checking everything against an entire whitelist when there may be nothing to check, and then running only the potential commands that were found against the whitelist. That way a user-submitted '$you = "how are you";' would pass straight through to eval, while 'me($you)' would be captured and checked against the whitelist.
In short: scan for potential commands (e.g. 'val1(val2)'), check them against a whitelist, then eval. Only built-in commands would be whitelisted; user-defined ones would simply be rejected, e.g. 'preg_match_all' vs. a user-created 'myfunc'.
echo $str = " \$me = \"me\"; \$you = \"you\"; \$us = \$me . \$you; echo \$us;
echo \$you; \$t = \"one() two() three()\"; ";
if (preg_match_all('#(one|two|three)\(.*\)#i',$str,$m)){
echo "\n---\n";
foreach ($m as $k => $v){
echo $k . "=" . implode(";",$v)."\n";
}
echo "---\n";
}
Edit 1 :
$str = " \$me = \"me\"; \$you = \"you\"; \$us = \$me . \$you;
echo \$us; echo \$you; \$t = \"one(two) two(1) three(2one)\"; ";
preg_match_all('#((one|two|three)\([\s\S]*?\))#i',$str,$m);
$t = implode(';',$m[1]);
if ($t != null) echo $t;
else echo "empty";
Edit 2 :
$str = " \$me = \"me\"; \$you = \"you\"; \$us = \$me . \$you; echo \$us;
echo \$you; \$t = \"eval(\$me) one(\$two) two(\$1) three(\$2one)}meme1(you2)
this(that) die (\"meohmy\") function myfunc(\$me) empty() \"; ";
preg_match_all('#(([\w+]*)\([\s\S]*?\))#i',$str,$m);
$t = implode(';',$m[1]);
if ($t != null){
echo $t;
// check a whitelist for allowed commands vs ones captured here
}else eval($str);
Misses: 'die ("meohmy")'.
Matches: both 'die($this)' and 'die($this', rather than matching only 'die($this)' and ignoring 'die($this'.
Edit 3:
$str = " \$me = \"me\"; \$you = \"you\"; \$us = \$me . \$you; echo \$us;
echo \$you; \$t = \"eval(\$me) one(\$two) two(\$1) three(\$2one)}meme1(you2)
this(that) die (\"meohmy\") these function myfunc(\$me) empty()\"; ";
$cmd = preg_match_all('#(([\w+])*([\s\S])\([\s\S]*?\))#i',$str,$m);
$t = implode(';',$m[1]);
if ($t != null){
echo "Result : " . $t . " [" . $cmd . "]\n";
// check a whitelist for allowed commands vs ones captured here
}else eval($str);
Matches: potential commands with malformed structure. E.g. 'this($one that($one)' returns 'this($one that($one)' rather than only 'that($one)' with 'this($one' excluded. In the end this isn't highly critical, since it returns a match that is not on the whitelist, and execution of eval would simply halt.
Edit 4:
See above for the surrounding code; only the regex is shown here.
$cmd = preg_match_all('#(\w*)\s*\(.*?\)#i',$str,$m);
The return is a bit cleaner: a full-match array and a word array. I'm still looking at how to get it to reject malformed matches, i.e. match 'func(val1)' but exclude 'func(val1', since that isn't a correct pattern. Currently the regex will match 'func(val1 somefunc(val2)': it only notices that it hasn't hit ')' for the first match, so it scans on until it finds one and meshes everything together, rather than seeing that the first call is malformed and excluding it.
Thoughts: a lot of what I read noted that parsing a string with regex for functions before eval is ineffective, and that the desired functions should be called explicitly. I think those same people may have forgotten about function_exists, or that a better-designed regex than mine could effectively collate data to pass to function_exists. Hopefully I can improve my regex in time as well. So the end thought is: regex the string for functions, collate them, check them with function_exists, and then, after all that and maybe a sanitize step, eval it.
Example return: eval($me) would go into the two arrays as 'eval($me)' and 'eval'.
My edits should show that I'd love help but am also trying to figure it out for myself, so this isn't just a "can you do it for me" question =). I seek to understand so I can do it for myself.
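One possible way to reject the malformed case, sketched under the assumption that arguments never contain nested parentheses (or parentheses inside quoted strings): replace '.*?' with '[^()]*', so a match cannot run across a second '('. The sample input and variable names below are made up for illustration.

```php
<?php
// Hypothetical sample: "func(val1" is never closed, the other three calls are well formed.
$str = 'func(val1 somefunc(val2) die ("x") ok()';

// [^()]* forbids any parenthesis between the opening "(" and the closing ")",
// so "func(val1 somefunc(" cannot be stretched into one match.
$count = preg_match_all('#(\w+)\s*\([^()]*\)#', $str, $m);

echo implode(';', $m[0]), "\n"; // full matches
echo implode(';', $m[1]), "\n"; // captured names
```

With this input only 'somefunc(val2)', 'die ("x")' and 'ok()' are reported. Calls with nested parentheses such as 'a(b(c))' would need a recursive pattern or a real tokenizer instead.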
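A rough sketch of that collate-then-check idea follows. The helper name find_disallowed_calls and the whitelist contents are assumptions made up for this example, not anything from an existing library. Note that language constructs such as eval and empty are not functions, so function_exists returns false for them.

```php
<?php
// Hypothetical helper: returns the call-like names in $code that are either
// not on the whitelist or not known to PHP as functions.
function find_disallowed_calls(string $code, array $whitelist): array
{
    // Collect every word that is followed by "(" (optional whitespace allowed).
    preg_match_all('#(\w+)\s*\(#', $code, $m);
    $names   = array_unique(array_map('strtolower', $m[1]));
    $allowed = array_map('strtolower', $whitelist);

    $bad = [];
    foreach ($names as $name) {
        if (!in_array($name, $allowed, true) || !function_exists($name)) {
            $bad[] = $name;
        }
    }
    return $bad;
}

$whitelist = ['strlen', 'strtoupper']; // assumed whitelist for the example
$ok  = find_disallowed_calls('$n = strlen("abc"); echo strtoupper ("x");', $whitelist);
$bad = find_disallowed_calls('$n = strlen("abc"); myfunc($n);', $whitelist);

var_dump($ok);  // empty array: nothing to reject
var_dump($bad); // contains 'myfunc'
```

This only inspects names that look like calls. It does not see other things eval can run, such as backtick shell execution or include, so it should be treated as a first filter rather than a complete safeguard.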
input :
eval($me) one($two) two($1) three($2one)}meme1(you2) this(that) die (\"meohmy\") function myfunc($me holycrackers batman empty()
output:
array (
0 =>
array (
0 => 'eval($me)',
1 => 'one($two)',
2 => 'two($1)',
3 => 'three($2one)',
4 => 'meme1(you2)',
5 => 'this(that)',
6 => 'die (\\"meohmy\\")',
7 => 'myfunc($me holycrackers batman empty()',
),
1 =>
array (
0 => 'eval',
1 => 'one',
2 => 'two',
3 => 'three',
4 => 'meme1',
5 => 'this',
6 => 'die',
7 => 'myfunc',
),
)
Upvotes: 2
Views: 310
Reputation: 3484
I'm not sure I understood your question correctly, but is this what you expected?
<?php
$str = " \$me = \"me\"; \$you = \"you\"; \$us = \$me . \$you; echo \$us;
echo \$you; \$t = \"one() two() three()\"; ";
$matches = array();
preg_match_all('/((one|two|three)\([\s\S]*?\))/i',$str,$matches);
echo implode(';',$matches[1]);
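To illustrate why the original pattern behaved the way it did, here is a small comparison, using a shortened version of the test string as an assumption. The greedy '.*' runs to the last ')' on the line, so there is only one match and implode has nothing to join; the stray 'one' is simply capture group 1 of that single match.

```php
<?php
// Shortened test string (single-quoted, so $ needs no escaping).
$str = 'echo $you; $t = "one() two() three()";';

// Greedy: ".*" swallows everything up to the LAST ")" on the line.
preg_match_all('#(one|two|three)\(.*\)#i', $str, $greedy);

// Lazy: ".*?" stops at the FIRST ")" after each name.
preg_match_all('#(one|two|three)\(.*?\)#i', $str, $lazy);

echo implode(';', $greedy[0]), "\n"; // one() two() three()
echo implode(';', $greedy[1]), "\n"; // one
echo implode(';', $lazy[0]), "\n";   // one();two();three()
```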
Upvotes: 1