David

Reputation: 733

Is it possible to fix this gsub pattern?

I'm messing around with Lua trying to create my own "scripting language".

It's actually just a string that is translated to Lua code and then executed with loadstring. I'm having a problem with my string patterns. When declarations nest (for example, defining a variable inside another variable declaration), it errors. For example, the following code would error:

local code = [[
    define x as private: function()
        define y as private: 5;
    end;
]]
--defining y inside of another variable declaration, causes error

This is happening because the pattern to declare a variable first looks for the keyword 'define', and then captures everything up to the first semicolon it finds. Therefore, x would be defined as:

function()
    define y as private: 5 --found a semicolon, set x to capture

I guess my question is, is it possible to ignore semicolons until the correct one is reached? Here is my code so far:

local lang = {
    ["define(.-)as(.-):(.-);"] = function(m1, m2, m3) 
        return (
            m2 == "private" and " local " .. m1 .. " = " .. m3 .. " " or 
            m2 == "global" and " " .. m1 .. " = " .. m3 .. " " or
            "ERROR IN DEFINING " .. m1
        )
    end,
}

function translate(code)
    for pattern, replace in pairs(lang) do
        code = code:gsub(pattern, replace)
    end
    return code
end

local code = [[

    define y as private: function()
        define x as private: 10;
    end;

]]

loadstring(translate(code:gsub("%s*", "")))()
--remove the spaces from code, translate it to Lua code through the 'translate' function, then execute it with loadstring

Upvotes: 0

Views: 109

Answers (1)

Moop

Reputation: 3611

The easiest solution is to change your last capture group from

(.-) -- 0 or more lazy repetitions

to

(.*) -- 0 or more repetitions

i.e.

pattern = 'define(.-)as(.-):(.*);'

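According to PiL (Programming in Lua), the - modifier matches the shortest possible sequence, whereas * matches the longest one, so (.*) followed by ; runs up to the last semicolon rather than the first.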
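
To see the difference concretely, here is a minimal sketch that runs both patterns over the example from the question. The whitespace is kept for readability, and the variable names (src, lazy_body, greedy_body) are only for illustration:

```lua
-- The nested declaration from the question, written on one line.
local src = "define y as private: function() define x as private: 10; end;"

-- string.match returns the three captures; only the third (the value) matters here.
local _, _, lazy_body   = src:match("define(.-)as(.-):(.-);")
local _, _, greedy_body = src:match("define(.-)as(.-):(.*);")

print(lazy_body)   --> " function() define x as private: 10"       (stops at the first ;)
print(greedy_body) --> " function() define x as private: 10; end"  (stops at the last ;)
```

The lazy capture cuts the function body off before "end", which is the error described in the question; the greedy one keeps the whole body.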

However, as noted in my comment, I wouldn't advise writing a parser for your language using pattern matching. It will require really complicated patterns (to handle edge cases) and will probably be unclear to others.
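
A few such edge cases, sketched under the assumption that the greedy pattern is used unchanged. The inputs and the callback below are made up for illustration and are not the asker's actual translate function:

```lua
local pattern = "define(.-)as(.-):(.*);"

-- Edge case 1: two sibling statements. The greedy capture runs to the LAST
-- semicolon, so the second statement is swallowed into the first one's value.
local _, _, value = ("define a as private: 1; define b as private: 2;"):match(pattern)
print(value) --> " 1; define b as private: 2"

-- Edge case 2: a name that contains "as". The lazy first capture stops at the
-- "as" inside "alias", so the name is cut short and the scope is wrong.
local name, scope = ("define alias as private: 1;"):match(pattern)
print(name)  --> " ali"
print(scope) --> " as private"

-- Edge case 3: gsub does not rescan the text produced by the replacement, so
-- the inner "define" is left untranslated unless the callback handles it itself.
local nested = "define y as private: function() define x as private: 10; end;"
local out = (nested:gsub(pattern, function(n, s, v)
    return "local" .. n .. "=" .. v
end))
print(out) --> "local y = function() define x as private: 10; end"
```

Each of these can be patched with yet another pattern or a recursive call, but the patches pile up quickly, which is why a small hand-written tokenizer and parser tends to be easier to follow.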

Upvotes: 1
