Jens Modvig

Reputation: 183

Split String on delimiter and spaces in Lua

The Problem

I am trying to syntax highlight some Lua source code, so I want to split a string of code into a table of operators, runs of space characters and variable names.

The trouble is that I have a table of several separators and I want to split the string on them, while also keeping the separators and each run of connected space characters as entries of their own:

Example:

"v1 *=3"

becomes

{'v1', ' ', '*=', '3'}

This question is very similar to "Split String and Include Delimiter in Lua" and "How do I split a string with multiple separators in lua?"

My question differs, however, in that I want separators that sit next to each other kept together in one entry, and I can't seem to create the right pattern.

What I have tried:

local delim = {",", ".", "(", ")", "=", "*"}
local s = "local variable1 *=get_something(5) if 5 == 4 then"
local p = "[^"..table.concat(delim).."%s]+"

for a in s:gsub(p, '\0%0\0'):gmatch'%Z+' do
    print(a)
end

Actual results:

{'local', ' ', 'variable1', ' *=', 'get_something', '(', '5', ') ', 'if', ' ', '5', ' == ', '4', ' ', 'then'}

Expected results:

{'local', ' ', 'variable1', ' ', '*=', 'get_something', '(', '5', ')', ' ', 'if', ' ', '5', ' ', '==', ' ', '4', ' ', 'then'}

It's a small difference: look at where the spaces are. Each run of connected spaces should be in its own entry.

Upvotes: 1

Views: 1571

Answers (2)

Jens Modvig

Reputation: 183

After some time I came to a solution; I am posting it here in case anyone is interested.

local delim = {",", ".", "(", ")", "=", "*"}
local s = "local variable1 *=get_something(5) if 5 == 4 then"
local p = "[^"..table.concat(delim).."]+"

-- Split the string on the delimiters, but keep them as their own entries
for a in s:gsub(p, '\0%0\0'):gmatch'%Z+' do

    -- Split each piece on the spaces, but keep them as their own entries
    for a2 in a:gsub("%s+", '\0%0\0'):gmatch'%Z+' do
        print(a2)
    end
end
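
For anyone who needs the pieces in a table rather than printed, here is a minimal sketch of a single-pass variant. It assumes the delimiters are single characters and that the list is not empty; tokenize is a hypothetical helper name, not part of the code above. It also avoids the \0 marker and the %Z class, since %z and %Z are deprecated as of Lua 5.2.

```lua
-- Hypothetical helper: split `s` into runs of whitespace, runs of
-- delimiter characters, and runs of everything else.
-- Assumes `delim` is a non-empty list of single-character strings.
local function tokenize(s, delim)
    -- Prefix each non-alphanumeric delimiter with % so it stays literal inside a set
    local set = table.concat(delim):gsub("%W", "%%%0")
    local patterns = {
        "^%s+",                 -- a run of whitespace
        "^[" .. set .. "]+",    -- a run of delimiter characters
        "^[^%s" .. set .. "]+", -- a run of anything else
    }
    local tokens, pos = {}, 1
    while pos <= #s do
        -- Every character belongs to exactly one class, so one pattern always matches at pos
        for _, p in ipairs(patterns) do
            local tok = s:match(p, pos)
            if tok then
                tokens[#tokens + 1] = tok
                pos = pos + #tok
                break
            end
        end
    end
    return tokens
end

local t = tokenize("v1 *=3", {",", ".", "(", ")", "=", "*"})
print(table.concat(t, "|"))  -- v1| |*=|3
```

Because delimiter runs are matched as a whole, adjacent separators such as "()" also end up in one entry, which is what the question asks for.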

Upvotes: 0

Josh

Reputation: 3265

EDIT: The following seems to work for everything except the *=. I am still working on that, but here is the code for most everything else:

local delim = {"*=",",", ".", "(", ")", "=", " "}
local str = "local variable1 *=get_something(5) if 5 == 4 then"

local results = {}
local toutput = ""

function makeTable(str)
    for _,v in ipairs(delim) do
        str = str:gsub("([%"..v.."]+)", "`%1`")
    end
    for item in str:gmatch("[^`]+") do table.insert(results, item) end

    for _,v in ipairs(results) do
      toutput = toutput .. "'" .. v .. "',"
    end

    print("[" .. toutput .. "]")
end

makeTable(str)

It returns:

['local',' ','variable1',' ','*','=','get_something','(','5',')',' ','if',' ','5',' ','==',' ','4',' ','then',]

Hopefully this gets you one step closer.
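
The *= entry seems to be split because the loop runs one gsub per delimiter: the later "=" pass wraps the "=" inside the already wrapped "*=" a second time. A possible fix, sketched below, is to wrap every run of delimiter characters in a single pass and the runs of whitespace in a second one. splitKeep is a hypothetical name, the delimiters are assumed to be single characters, and the input is assumed to contain no backtick.

```lua
-- Hypothetical variant of makeTable: wraps all delimiter runs in one pass.
-- Assumes single-character delimiters and no backtick in `str`.
local function splitKeep(str, delim)
    -- Prefix each non-alphanumeric delimiter with % so it stays literal inside the set
    local set = table.concat(delim):gsub("%W", "%%%0")
    -- Wrap each run of delimiter characters, then each run of whitespace, in backticks
    str = str:gsub("[" .. set .. "]+", "`%0`")
    str = str:gsub("%s+", "`%0`")
    local results = {}
    for item in str:gmatch("[^`]+") do
        results[#results + 1] = item
    end
    return results
end

local delim = {",", ".", "(", ")", "=", "*"}
local out = splitKeep("local variable1 *=get_something(5) if 5 == 4 then", delim)
print("'" .. table.concat(out, "','") .. "'")
```

With the sample string this keeps '*=' and '==' as single entries, and each run of spaces stays in its own entry.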

Upvotes: 1
