Reputation: 1167
I should first apologize for what is probably a rookie question, but as a complete regex newbie I have no clue how to achieve this relatively complex task. What I need is to specify a validation pattern for a string input and perform separate checks on the separate segments of that pattern. So let's begin with the task itself. I'm working with PHP 7.0 on Laravel 5.4 (which should not make any difference), and the string input has to match the following pattern:
header1: expression1; header2: expression2; header3: expression3 //etc...
What I need here is to check that each header is present and that it appears in a special validation list of available headers. So I need to extract each header.
Furthermore the expressions are built as follows
expression1 = (a1 + a2)*(a3-a1)
expression2 = b1*(b2 - b3)/b4
//etc...
The point is that each expression contains some numeric parameters which should form a valid arithmetic calculation. Those parameters should also be contained in a special list of available parameter placeholders, so I'd need to check them too. So, is there a simple efficient way (using regex and string analysis in pure php) to specify that strict structure or should I do everything step by step with exploding and try-catching?
An optimal solution would be a shorthand logic (or regex expression?) of a kind like:
$value->match("^n(header: expression)")
->delimitedBy(';')
->where(in_array($header, $allowed_headers))
->where(strtr($expression, array_fill_keys($available_param_placeholders, 0))->isValidArithmeticExpression())
I hope you can follow my logic. The code above would read as: match N repetitions of the pattern "header: expression", delimited by ';', where 'header' (given that $header is its value) is in an array and where 'expression' (given that $expression is its value) forms a valid arithmetic expression once all available parameter placeholders have been replaced by 0. That's all. Any deviation from that strict pattern should return false.
As an alternative, I'm currently thinking of first exploding the string by the main delimiter (the semicolon) and then analysing each part separately. I'd then have to check whether a colon is present, whether everything to the left of the colon is a valid header name, and whether everything to the right of the colon forms a valid arithmetic expression when all param names from the list are replaced by an arbitrary value (like 0, just to check that the code executes, which I also don't know how to do). Anyway, that way seems like overkill and I'm sure there should be a smoother way to specify the needed pattern.
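The placeholder replacement step described above could be sketched roughly as follows. This is only an illustration: `substitutePlaceholders()` is a hypothetical helper name, and it replaces whole identifiers rather than using a plain `strtr()`, on the assumption that a name such as `a12` must not be accepted just because `a1` is in the list.

```php
<?php
/**
 * Hypothetical helper: replaces every allowed placeholder with $value.
 * Returns the substituted string, or null if an unknown name is found.
 */
function substitutePlaceholders(string $expression, array $allowedParams, string $value = '0')
{
    $unknown = false;
    // Match every "word" that contains at least one letter or underscore,
    // so that "a12" or "2a1" is treated as one (unknown) name.
    $result = preg_replace_callback(
        '~\w*[A-Za-z_]\w*~',
        function (array $m) use ($allowedParams, $value, &$unknown) {
            if (!in_array($m[0], $allowedParams, true)) {
                $unknown = true;   // remember that something was not whitelisted
                return $m[0];
            }
            return $value;
        },
        $expression
    );

    return $unknown ? null : $result;
}
```

The result still has to be checked for arithmetic validity; this step only confirms that every name in the expression comes from the list.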
I hope I've explained everything well enough, and sorry if I'm being too messy in explaining my problem. Thanks in advance for any advice/help! Greatly appreciated!
Upvotes: 1
Views: 214
Reputation: 1167
Regex turned out to be less complicated than I thought when posting the question, so I managed to build the complete pattern myself, with the initial head start owed to @mickmackusa. What I finally came up with is here, explained by regex101 itself: https://regex101.com/r/UHMrqL/1
The logic it's based on is described in the initial question. The only thing missing is the verification of the header values and the param names, but those are easy to extract afterwards with preg_match_all and verify with pure PHP checks. Thanks again for the attention and the help! :)
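For illustration, the follow-up check mentioned here might look something like the sketch below. It does not reproduce the regex101 pattern; it assumes the overall structure has already been validated and only collects names that are missing from the whitelists. `findUnknownNames()` and both list parameters are hypothetical.

```php
<?php
/**
 * Hypothetical helper: lists headers and param names that are not whitelisted.
 * Assumes the "header: expression; ..." structure was validated beforehand.
 */
function findUnknownNames(string $input, array $allowedHeaders, array $allowedParams): array
{
    // Group 1: header before the colon; group 2: expression up to the next ';'.
    preg_match_all('~(?:^|;)\s*(\w+)\s*:\s*([^;]*)~', $input, $pairs);

    // Collect every identifier that appears in any expression.
    preg_match_all('~[A-Za-z_]\w*~', implode(' ', $pairs[2]), $params);

    return [
        'headers' => array_values(array_unique(array_diff($pairs[1], $allowedHeaders))),
        'params'  => array_values(array_unique(array_diff($params[0], $allowedParams))),
    ];
}
```

Two empty arrays in the result would mean that every header and every param name is on its list.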
Upvotes: 0
Reputation: 48071
Using eval() must always be Plan Z. With my understanding of your input string, this method may sufficiently validate the headers and expressions (if not, I think it should sufficiently sanitize the string for arithmetic parsing). I don't code in Laravel, so if this can be converted to Laravel syntax I'll leave that job for you.
Code: (Demo)
$test = "header1: (a1 + a2)*(a3-a1); header2: b1*(b2 - b3)/b4; header3: c1 * (((c2); header4: ((a1 * (a2 - b1))/(a3-a1))+b2";
$allowed_headers=['header1','header3','header4'];
$pairs=explode('; ',$test);
foreach($pairs as $pair){
    list($header,$expression)=explode(': ',$pair,2);
    if(!in_array($header,$allowed_headers)){
        echo "$header is not permitted.";
    }elseif(!preg_match('~^((?:[-+*/ ]+|[a-z]\d+|\((?1)\))*)$~',$expression)){ // based on https://stackoverflow.com/a/562729/2943403
        echo "Invalid expression @ $header: $expression";
    }else{
        echo "$header passed.";
    }
    echo "\n---\n";
}
Output:
header1 passed.
---
header2 is not permitted.
---
Invalid expression @ header3: c1 * (((c2)
---
header4 passed.
---
I will admit the above pattern will match (+ )( +), so it is not the best pattern. So perhaps your question may be a candidate for using eval(). Although you may want to first consider/research some of the GitHub creations / plugins / parsers that can parse/tokenize arithmetic expressions.
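As a rough idea of what a small hand-written parser could look like, here is a recursive-descent sketch that rejects inputs such as `(+ )( +)`. It is not taken from any existing library. The function names are hypothetical, and the grammar is an assumption: operands are either a letter followed by digits or a plain number, with one optional unary sign per operand.

```php
<?php
// Hypothetical validator: true only if $expression is well-formed arithmetic.
function isValidArithmetic(string $expression): bool
{
    // Tokens: variable, number, operator/parenthesis, or any other
    // non-space character (which the parser below will reject).
    preg_match_all('~[a-z]\d+|\d+(?:\.\d+)?|[-+*/()]|\S~', $expression, $m);
    $tokens = $m[0];
    $pos = 0;
    // Valid only if one expression consumes every token.
    return parseExpr($tokens, $pos) && $pos === count($tokens);
}

// expr := term (('+' | '-') term)*
function parseExpr(array $tokens, int &$pos): bool
{
    if (!parseTerm($tokens, $pos)) {
        return false;
    }
    while (in_array($tokens[$pos] ?? null, ['+', '-'], true)) {
        $pos++;
        if (!parseTerm($tokens, $pos)) {
            return false;
        }
    }
    return true;
}

// term := factor (('*' | '/') factor)*
function parseTerm(array $tokens, int &$pos): bool
{
    if (!parseFactor($tokens, $pos)) {
        return false;
    }
    while (in_array($tokens[$pos] ?? null, ['*', '/'], true)) {
        $pos++;
        if (!parseFactor($tokens, $pos)) {
            return false;
        }
    }
    return true;
}

// factor := ['+' | '-'] (operand | '(' expr ')')
function parseFactor(array $tokens, int &$pos): bool
{
    $token = $tokens[$pos] ?? null;
    if ($token === '+' || $token === '-') {   // one optional unary sign
        $pos++;
        $token = $tokens[$pos] ?? null;
    }
    if ($token === null) {
        return false;
    }
    if ($token === '(') {
        $pos++;
        if (!parseExpr($tokens, $pos) || ($tokens[$pos] ?? null) !== ')') {
            return false;
        }
        $pos++;
        return true;
    }
    if (preg_match('~^(?:[a-z]\d+|\d+(?:\.\d+)?)$~', $token) === 1) {
        $pos++;
        return true;
    }
    return false;
}
```

A check like this only confirms the syntax. It does not detect runtime problems such as division by zero.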
Perhaps: any $pair that gets past the if and the elseif can move on to the evaluation process in the else.
I'll give you a headstart/hint about some general handling, but I'll shy away from giving any direct instruction to avoid the wrath of a certain population of critics.
}else{
    // replace all variables with 0
    //$expression=preg_replace('/[a-z]\d+/','0',$expression);
    // or replace each unique variable with a whole number
    $expression=preg_match_all('/[a-z]\d+/',$expression,$out)?strtr($expression,array_flip($out[0])):$expression; // variables become incremented whole numbers
    // ... from here use $expression with eval() in a style/intent of your choosing.
    // ... set a battery of try and catch statements to handle unsavory outcomes.
    // https://www.sitepoint.com/a-crash-course-of-changes-to-exception-handling-in-php-7/
}
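One possible shape for that last step, offered only as a hedged sketch: `tryEvaluate()` is a hypothetical helper, and it assumes the variables have already been replaced by numbers. It refuses anything that is not plain arithmetic before the string ever reaches eval(), then treats any thrown error or non-finite result as a failed expression.

```php
<?php
/**
 * Hypothetical helper: evaluates an already-substituted expression.
 * Returns the result as a float, or null if it cannot be evaluated.
 */
function tryEvaluate(string $expression)
{
    // Only digits, dots, whitespace, parentheses and + - * / may reach eval().
    if (preg_match('~^[\d\s.()+\-*/]+$~', $expression) !== 1) {
        return null;
    }
    // "//" and "/*" would start a PHP comment; "**" is the power operator.
    if (preg_match('~//|/\*|\*\*~', $expression) === 1) {
        return null;
    }

    try {
        // Syntax errors surface as ParseError (PHP 7+); on PHP 8 a division
        // by zero throws DivisionByZeroError. Both are Throwable.
        $result = @eval('return ' . $expression . ';');
    } catch (\Throwable $e) {
        return null;
    }

    // PHP 7 returns INF or NAN for a division by zero instead of throwing.
    if ((is_int($result) || is_float($result)) && is_finite((float) $result)) {
        return (float) $result;
    }
    return null;
}
```

Note that replacing every variable with 0, or letting the first variable become 0 as in the snippet above, can produce a division by zero in an otherwise valid expression such as `b1/b4`, so non-zero stand-in values may be the safer choice.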
Upvotes: 1
Reputation: 334
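$test = "header1: (a1 + a2)*(a3-a1); header2: b1*(b2 - b3)/b4; header3: expression3";
// $allowed_headers and param_fails() are not defined in this snippet; both are
// assumed to come from the surrounding validation code, and the `return false`
// statements assume the snippet sits inside a validation function.
$pairs = explode(';', $test);
$headers = [];
$expressions = [];
foreach ($pairs as $p) {
    // Limit to 2 parts so that only the first colon separates header and expression.
    $he = explode(':', $p, 2);
    if (count($he) < 2) {
        return false; // no colon, so this is not a "header: expression" pair
    }
    $headers[] = trim($he[0]);
    $expressions[] = trim($he[1]);
}
foreach ($headers as $h) {
    if (!in_array($h, $allowed_headers)) {
        return false;
    }
}
foreach ($expressions as $e) {
    preg_match_all('/[a-z0-9]+/', $e, $matches);
    // $matches[0] holds the full matches; $matches itself is an array of arrays.
    foreach ($matches[0] as $m) {
        if (param_fails($m)) {
            echo "Expression $e contains forbidden param $m.";
        }
    }
}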
Upvotes: 0