simon

Reputation: 579

How to match multiple expressions repeatedly by regexp in Lua

I'm learning Lua and have a question about regexps. I have a string like this:

text = "aaab1aaac-aac1d2b5hhpt456d5h9h8"

I want to get this result:

"b1", "c1d2b5", "t4", "d5h9h8"

I wrote the following code:

local st,ed
while true do
    st,ed = string.find(text,"([a-z][1-9])+",ed)  --the regexp
    if st==nil then
        break
    else
        print(string.sub(text,st,ed))
    end
    ed=ed+1
end

But it does not print the correct results.

Upvotes: 2

Views: 4055

Answers (4)

Oliver

Reputation: 29543

Here is an alternative to LPEG. A straightforward loop works in this case:

function findzigs(text)
    local items = {}
    local zigzag = nil
    local prevI1=-2
    local i1,i2 = text:find("%a%d") 
    while i1~=nil do
        local pair = text:sub(i1,i2)
        if i1-2 == prevI1 then
             zigzag = zigzag .. pair
        else
             if zigzag then table.insert(items, zigzag) end
             zigzag = pair
        end
        prevI1 = i1
        i1,i2 = text:find("%a%d", i2+1) 
    end
    if zigzag then table.insert(items, zigzag) end
    return items
end

This could probably be cleaned up to remove the duplicate "if zigzag" and "text:find", but you get the idea. It gives exactly the results you need.
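One possible cleanup, as a sketch only: find the first pair of a run, then keep extending the run while another pair starts immediately after it, using a pattern anchored with "^" at the search position. The name find_runs is made up for this example; it keeps the same "%a%d" pattern as the loop above, so it also accepts uppercase letters and the digit 0.

```lua
-- Collect maximal runs of adjacent letter-digit pairs.
-- find_runs is a hypothetical name used only for this sketch.
local function find_runs(text)
  local items = {}
  local pos = 1
  while true do
    -- locate the first pair of the next run
    local first, last = text:find("%a%d", pos)
    if not first then break end
    -- "^" anchors the match at the init position, so this only
    -- succeeds when another pair starts right after the current run
    while text:find("^%a%d", last + 1) do
      last = last + 2
    end
    items[#items + 1] = text:sub(first, last)
    pos = last + 1
  end
  return items
end

local text = "aaab1aaac-aac1d2b5hhpt456d5h9h8"
print(table.concat(find_runs(text), " "))  --> b1 c1d2b5 t4 d5h9h8
```

Each run is stored in exactly one place, so the final "flush" after the loop is no longer needed.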

Upvotes: 1

lhf

Reputation: 72312

Try this trick of the trade:

text = "aaab1aaac-aac1d2b5hhpt456d5h9h8"
aux = text:gsub("%l%d","\1\1")

for b,e in aux:gmatch("()\1+()") do
    print(text:sub(b,e-1))
end
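For readers wondering why this works, here is the same idea spelled out step by step, as a sketch. Each letter-digit pair is replaced by two placeholder bytes, so aux has the same length as text and positions found in aux are valid in text. It assumes the input never contains the byte "\1" itself; the runs table is only added here for illustration.

```lua
local text = "aaab1aaac-aac1d2b5hhpt456d5h9h8"

-- Replace every lowercase-letter/digit pair with two "\1" bytes.
-- Two bytes in, two bytes out, so all positions are preserved.
-- (gsub also returns a match count; only the string is kept here.)
local aux = text:gsub("%l%d", "\1\1")
assert(#aux == #text)

-- A run of "\1" in aux marks a run of adjacent pairs in text.
-- The empty captures "()" yield the start position and the
-- position just past the end of each run.
local runs = {}
for b, e in aux:gmatch("()\1+()") do
  runs[#runs + 1] = text:sub(b, e - 1)
end

print(table.concat(runs, " "))  --> b1 c1d2b5 t4 d5h9h8
```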

Upvotes: 2

Philipp Gesang

Reputation: 526

As @Yu Hao already mentioned in a comment, Lua patterns are different from and somewhat less powerful than the “regex” most of us are used to. But that’s not actually a problem, since Lua offers the excellent LPEG library, which was written by one of the language’s main developers.
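To make the difference concrete: in a Lua pattern, a quantifier such as "+" applies only to a single character class, never to a parenthesized group. After a closing parenthesis, "+" is simply a literal plus sign. The small check below illustrates this with the pattern from the question; the sample strings are made up for the demonstration.

```lua
local pat = "([a-z][1-9])+"

-- No literal "+" in the subject, so the pattern cannot match.
print(string.find("a1b2", pat))  --> nil

-- With a literal "+" after one pair, it matches that pair plus the sign.
print(string.find("a1+", pat))   --> 1  3  a1
```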

The pattern you are asking for could be written in LPEG as follows:

local lpeg      = require "lpeg"
local lpegmatch = lpeg.match
local R, C      = lpeg.R, lpeg.C

local match_alpha_n_digit
do
  local alpha       = R "az" -- + R "AZ" -- for uppercase
  local digit       = R "09"
  local sequence    = C ((alpha * digit)^1) -- capture longest sequence of (alpha, digit) pairs
  local pattern     = (sequence + 1)^1
  match_alpha_n_digit = function (str)
    if not str or type (str) ~= "string" then return end
    return lpegmatch (pattern, str)
  end
end

text   = "aaab1aaac-aac1d2b5hhpt456d5h9h8"

print (match_alpha_n_digit (text))
--- or capture the result in a table:
some_table = { match_alpha_n_digit (text) }

This way it comes wrapped in a function match_alpha_n_digit() that you can call inside a table constructor.

It is also possible to write a pattern that receives arbitrary extra arguments, which we can then retrieve at match time using the argument capture (lpeg.Carg()). This method allows, for instance, iterating over all matches with a function:

local lpeg      = require "lpeg"
local lpegmatch = lpeg.match
local R, C      = lpeg.R, lpeg.C
local Cmt, Carg = lpeg.Cmt, lpeg.Carg

local iter_alpha_n_digit
do
  local alpha       = R "az"
  local digit       = R "09"
  local sequence    = Cmt (C((alpha * digit)^1) * Carg (1),
                           function (_, _, match, fun)
                             fun (match)
                             return true
                           end)
  local pattern     = (sequence + 1)^1

  iter_alpha_n_digit = function (str, fun)
    if not str or type (str) ~= "string"   then return end
    if not fun or type (fun) ~= "function" then return end
    return lpegmatch (pattern, str, nil, fun)
  end
end

text   = "aaab1aaac-aac1d2b5hhpt456d5h9h8"

iter_alpha_n_digit (text, print) -- iterate matches with the print() function

Here we apply the builtin print() function to the matches, but it could be any other function instead.

Upvotes: 0

Toto

Reputation: 91438

I don't know Lua, but how about this regex:

"((?:[a-z][1-9])+)"

Upvotes: 0
