Chibang Dayanan

Reputation: 189

Parsing Preformatted Formula to Array Recursively

I am trying to create a formula parser and I am currently stuck on how to recursively convert the code to an array of expressions. Take the following formula for example:

$formula = "
    @VAR[a, 3];
    @IF[ (a <= 3) & (a = 3) ]:
        @VAR[a, a + 4];
        @IF[ a > 5 ]:
            @USE[a];
        @ENDIF
    @ELSEIF[ a > 4 ]:
        @VAR[a, 2];
    @ELSE:
        @VAR[a, 5];
    @ENDIF
    @VAR[a,5];
    @USE[a];
";

It should output:

{
    "0": "VAR[a, 3];",
    "IF[ (a <= 3) & (a = 3) ]:": {
        "0": "VAR[a, a + 4];",
        "1": "ENDIF",
        "IF[ a > 5 ]:": [
            "USE[a];"
        ]
    },
    "ELSEIF[ a > 4 ]:": [
        "VAR[a, 2];"
    ],
    "ELSE:": [
        "VAR[a, 5];"
    ],
    "1": "ENDIF",
    "2": "VAR[a,5];",
    "3": "USE[a];"
}

That way I can iterate through each item and evaluate each expression.
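Roughly, I imagine the iteration looking like the sketch below. It is only an illustration of the idea: `walk` and the `$visit` callback are hypothetical names, and the actual evaluation of each expression is left out. It assumes statements sit under integer keys and nested blocks sit under their header string (for example `IF[...]:`).

```php
<?php

// Hypothetical walker over the expected tree. $visit is a placeholder
// callback that would evaluate (here: just receive) each expression.
function walk(array $tree, callable $visit, int $depth = 0): void
{
    foreach ($tree as $key => $item) {
        if (is_array($item)) {
            // Nested block: the key is the block header, the value its body.
            $visit((string) $key, $depth);
            walk($item, $visit, $depth + 1);
        } else {
            // Plain statement such as VAR[...]; USE[...]; or ENDIF.
            $visit($item, $depth);
        }
    }
}
```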

I currently have the following code, which doesn't output the expected format.

<?php

$formula = "
    @VAR[a, 3];
    @IF[ (a <= 3) & (a = 3) ]:
        @VAR[a, a + 4];
        @IF[ a > 5 ]:
            @USE[a];
        @ENDIF
    @ELSEIF[ a > 4 ]:
        @VAR[a, 2];
    @ELSE:
        @VAR[a, 5];
    @ENDIF
    @VAR[a,5];
    @USE[a];
";

$formulas = explode( "@", $formula );
$result = parse( $formulas );

echo json_encode( $result );
function parse( $lines  ){
    $exec_tree = array();
    foreach( $lines as $i => $block ){
        unset( $lines[$i] );
        $block = trim( str_replace( array(" ") , "" , preg_replace('/\s\s+/', ' ', $block) ) );
        if( trim( $block ) != "" ){

            // MATCH Variable assignments
            if( preg_match('/VAR\[(.*)\]\;?/', $block ) ){
                $exec_tree[] = $block;            
            }   
            // MATCH USE Statements
            if( preg_match('/USE\[(.*)\]\;?/', $block ) ){
                $exec_tree[] = $block;
            }      
            // MATCH IFs
            if( preg_match('/^IF\[(.*)\]\:/', $block ) ){
                $exec_tree[$block] = parse( $lines );
            }        
            // MATCH ELSEIFs
            if( preg_match('/^ELSEIF\[(.*)\]\:/', $block ) ){
                $exec_tree[$block] = parse( $lines );
            }    
            // MATCH ELSEs
            if( preg_match('/^ELSE:/', $block ) ){
                $exec_tree[$block] = parse( $lines );
            }      
            // MATCH ENDIFs
            if( preg_match('/^ENDIF/', $block ) ){
                break;
            }
        }
    }
    return $exec_tree;
}

The code is recursive, but I think I am missing something about how the recursion should terminate. It is supposed to end on the ENDIF keyword. If anyone can point me in the right direction, it would be much appreciated.

This is its current output (JSON formatted):

[
"VAR[a,3];",
[
    "IF[(a<=3)&(a=3)]:",
    [
        "VAR[a,a+4];",
        [
            "IF[a>5]:",
            [
                "USE[a];"
            ]
        ],
        "USE[a];"
    ]
],
"VAR[a,a+4];",
[
    "IF[a>5]:",
    [
        "USE[a];"
    ]
],
"USE[a];"

]

Thanks,

Jan

Upvotes: 1

Views: 48

Answers (1)

Armin Braun

Reputation: 3683

Admittedly this is not an entirely clean solution, but it works and is easy to refactor:

<?php

$formula = "
    @VAR[a, 3];
    @IF[ (a <= 3) & (a = 3) ]:
        @VAR[a, a + 4];
        @IF[ a > 5 ]:
            @USE[a];
        @ENDIF
    @ELSEIF[ a > 4 ]:
        @VAR[a, 2];
    @ELSE:
        @VAR[a, 5];
    @ENDIF
    @VAR[a,5];
    @USE[a];
";

$formulas = explode( "@", $formula );
$rec = false;
$result   = parse( $formulas, $rec );

echo json_encode( $result, JSON_PRETTY_PRINT );

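// Both arguments are passed by reference: $lines is a shared queue that every
// recursion level consumes, and $rec is a shared flag that is toggled on each
// ELSEIF, ELSE and ENDIF.
function parse( &$lines, &$rec ) {
    $exec_tree = array();
    while ( (bool) $lines === true ) {
        // Take the next block off the front of the shared queue.
        $block = array_shift( $lines );

        // Collapse whitespace runs, then drop all spaces.
        $block = trim( str_replace( array( " " ), "", preg_replace( '/\s\s+/', ' ', $block ) ) );
        if ( trim( $block ) != "" ) {

            // MATCH Variable assignments
            if ( preg_match( '/VAR\[(.*)\]\;?/', $block ) ) {
                $exec_tree[] = $block;
            } elseif ( preg_match( '/USE\[(.*)\]\;?/', $block ) ) {
                // MATCH USE statements
                $exec_tree[] = $block;
            } elseif ( preg_match( '/^IF\[(.*)\]\:/', $block ) ) {
                // MATCH IFs: parse the body one level down
                $rec = true;
                $exec_tree[ $block ] = parse( $lines, $rec );
            } elseif ( preg_match( '/^ELSEIF\[(.*)\]\:/', $block ) ) {
                // MATCH ELSEIFs: if the flag is false after toggling, this level
                // is a branch body that ends here, so hand the keyword back to
                // the caller. Otherwise this is the caller opening a new branch.
                $rec = !$rec;
                if ( $rec === false ) {
                    array_unshift( $lines, $block );
                    break;
                } else {
                    $exec_tree[ $block ] = parse( $lines, $rec );
                }
            } elseif ( preg_match( '/^ELSE:/', $block ) ) {
                // MATCH ELSEs: same toggle as for ELSEIF
                $rec = !$rec;
                if ( $rec === false ) {
                    array_unshift( $lines, $block );
                    break;
                } else {
                    $exec_tree[ $block ] = parse( $lines, $rec );
                }
            } elseif ( preg_match( '/^ENDIF/', $block ) ) {
                // MATCH ENDIFs: the branch body hands ENDIF back to the caller,
                // which then appends it as a plain entry.
                $rec = !$rec;
                if ( $rec === false ) {
                    array_unshift( $lines, $block );
                    break;
                } else {
                    $exec_tree[] = $block;
                }
            }
        }
    }

    return $exec_tree;
}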

This returns:

{
    "0": "VAR[a,3];",
    "IF[(a<=3)&(a=3)]:": {
        "0": "VAR[a,a+4];",
        "IF[a>5]:": [
            "USE[a];"
        ],
        "1": "ENDIF"
    },
    "ELSEIF[a>4]:": [
        "VAR[a,2];"
    ],
    "ELSE:": [
        "VAR[a,5];"
    ],
    "1": "ENDIF",
    "2": "VAR[a,5];",
    "3": "USE[a];"
}

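The trick is to keep track of whether you are already inside a block and to break out of the current recursion level on ELSEIF, ELSE and ENDIF. The keyword is pushed back onto the queue before breaking, so the caller still sees it and appends it to the end result. Note also that $lines is passed by reference and consumed with array_shift, so blocks handled by a nested call are not parsed a second time by the caller, which is what produced the duplicated entries in your output.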
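If you would rather not juggle a shared flag, the same idea can be sketched with an explicit position counter: a branch body stops at the next ELSEIF, ELSE or ENDIF without consuming it, and whichever level opened the IF then consumes the remaining branches and the closing ENDIF. This is only a sketch under some assumptions: `tokenize` and `parseBlock` are names made up for illustration, all whitespace is stripped from each block, and every IF is assumed to be closed by a matching ENDIF.

```php
<?php

// Hypothetical helper: split on "@" and strip all whitespace from each block.
function tokenize(string $formula): array
{
    $tokens = [];
    foreach (explode('@', $formula) as $chunk) {
        $chunk = preg_replace('/\s+/', '', $chunk);
        if ($chunk !== '') {
            $tokens[] = $chunk;
        }
    }
    return $tokens;
}

// Hypothetical parser: $pos is the shared cursor into $tokens.
function parseBlock(array $tokens, int &$pos): array
{
    $tree = [];
    $count = count($tokens);
    while ($pos < $count) {
        $token = $tokens[$pos];

        // A body ends at ELSEIF, ELSE or ENDIF. Leave the token in place
        // for the level that opened the IF. (A stray terminator at the top
        // level also stops parsing; the sketch does not report that.)
        if (preg_match('/^(ELSEIF\[.*\]:|ELSE:|ENDIF)$/', $token)) {
            return $tree;
        }
        $pos++;

        if (preg_match('/^IF\[.*\]:$/', $token)) {
            $tree[$token] = parseBlock($tokens, $pos);

            // Consume any ELSEIF / ELSE branches of the same chain.
            while ($pos < $count && preg_match('/^(ELSEIF\[.*\]:|ELSE:)$/', $tokens[$pos])) {
                $branch = $tokens[$pos++];
                $tree[$branch] = parseBlock($tokens, $pos);
            }

            // Append the closing ENDIF as a plain entry, as in the output above.
            if ($pos < $count && $tokens[$pos] === 'ENDIF') {
                $tree[] = $tokens[$pos++];
            }
        } else {
            $tree[] = $token; // VAR[...]; or USE[...];
        }
    }
    return $tree;
}

$pos = 0;
$tokens = tokenize('@VAR[a, 3]; @IF[ a > 5 ]: @USE[a]; @ENDIF');
echo json_encode(parseBlock($tokens, $pos), JSON_PRETTY_PRINT);
```

One caveat that applies to both versions: block headers are used as array keys, so two blocks with an identical condition at the same level would overwrite each other.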

Upvotes: 1
