Reputation: 326
(I'm using Lua 5.2 and LPeg 0.12)
Suppose I have a pattern P that produces some indeterminate number of captures, if any, and I want to create a pattern Q that captures P as well as the position after P, but with that position returned before the captures of P. Essentially, if lpeg.match(P * lpeg.Cp(), str, i) results in v1, v2, ..., j, then I want lpeg.match(Q, str, i) to result in j, v1, v2, ....
Is this achievable without having to create a new table every time P is matched?
Mostly I want to do this to simplify some functions that produce iterators. Lua's stateless iterator functions only get one control variable, and it needs to be the first value returned by the iterator function.
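To make that constraint concrete, here is a minimal sketch of a stateless iterator in plain Lua. The names next_word and words are hypothetical, and string.find stands in for an LPeg match; the point is only that the generic for feeds the first returned value back in as the control variable, which is why the position has to come first.

```lua
-- Stateless iterator function: receives the invariant state (the string)
-- and the previous control variable (the position to resume from).
local function next_word(str, pos)
    local first, last = string.find(str, "%a+", pos)
    if first == nil then
        return nil  -- a nil control variable ends the loop
    end
    -- The first return value becomes the next control variable,
    -- so the resume position must precede the "captures".
    return last + 1, string.sub(str, first, last)
end

-- Returns the iterator triple: function, invariant state, initial control value.
local function words(str)
    return next_word, str, 1
end

for pos, word in words("foo bar zok") do
    print(pos, word)
end
```

If the word were returned first instead, the loop would pass a string back in as the position on the next call.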
In a world that let people name the last arguments of a variadic function, I could write:
function pos_then_captures(pattern)
    local function roll(..., pos)
        return pos, (...)
    end
    return (pattern * lpeg.Cp()) / roll
end
Alas. The easy solution is judicious use of lpeg.Ct():
function pos_then_captures(pattern)
    -- exchange the order of two values and unpack the first parameter
    -- (Lua 5.2 moved unpack to table.unpack)
    local function exch(a, b)
        return b, table.unpack(a)
    end
    return (lpeg.Ct(pattern) * lpeg.Cp()) / exch
end
or to have the caller of lpeg.match do a pack/remove/insert/unpack dance. As yucky as the latter sounds, I would probably do that one, because lpeg.Ct() might have some unintended consequences for pathological but "correct" arguments to pos_then_captures.
Either of these creates a new table every time pattern is successfully matched, which admittedly doesn't matter too much in my application, but is there a way to do this without any pack-unpack magic?
I'm not too familiar with the internals of Lua, but it feels like what I really want to do is pop something from Lua's stack and put it back in somewhere else, which doesn't seem like an operation that would be directly or efficiently supported, but maybe something that LPeg can do in this specific case.
Upvotes: 1
Reputation: 11991
You can do it with your original solution, without table captures or match-time captures, like this:
function pos_then_captures(pattern)
    -- The position from lpeg.Cp() arrives as the last argument.
    -- Handle up to nine values without allocating anything.
    local function exch(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, ...)
        if a1 == nil then return end
        if a2 == nil then return a1 end
        if a3 == nil then return a2, a1 end
        if a4 == nil then return a3, a1, a2 end
        if a5 == nil then return a4, a1, a2, a3 end
        if a6 == nil then return a5, a1, a2, a3, a4 end
        if a7 == nil then return a6, a1, a2, a3, a4, a5 end
        if a8 == nil then return a7, a1, a2, a3, a4, a5, a6 end
        if a9 == nil then return a8, a1, a2, a3, a4, a5, a6, a7 end
        if a10 == nil then return a9, a1, a2, a3, a4, a5, a6, a7, a8 end
        -- Ten or more values: fall back to a table for the tail
        -- (Lua 5.2 moved unpack to table.unpack).
        local t = { a10, ... }
        return t[#t], a1, a2, a3, a4, a5, a6, a7, a8, a9, table.unpack(t, 1, #t - 1)
    end
    return (pattern * lpeg.Cp()) / exch
end
The following sample usage returns each matched 'a' with the end position of its match in front of it:
local p = lpeg.P{ (pos_then_captures(lpeg.C'a') + 1) * lpeg.V(1) + -1 }
print(p:match('abababcd'))
-- output: 2 a 4 a 6 a
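A variation on the same idea, sketched here with select and recursion instead of unrolled cases, so that no table is needed for any number of values. The names rotate and all_but_last are hypothetical; the block is plain Lua and does not touch LPeg itself.

```lua
-- Hypothetical helper: returns the first n - 1 of the n values it is given.
local function all_but_last(n, first, ...)
    if n <= 1 then
        return
    end
    return first, all_but_last(n - 1, ...)
end

-- Moves the last value to the front: rotate(v1, v2, j) --> j, v1, v2.
-- Counting with select("#", ...) means nil values are handled too.
local function rotate(...)
    local n = select("#", ...)
    if n == 0 then
        return
    end
    -- select(n, ...) yields only the last value; the parentheses keep
    -- it to a single result.
    return (select(n, ...)), all_but_last(n, ...)
end

print(rotate("a", "b", 7))  --> 7   a   b
```

Under that assumption, (pattern * lpeg.Cp()) / rotate would take the place of / exch. The recursion depth grows with the number of captures, so this trades the table for stack frames.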
Upvotes: 0
Reputation: 326
Match-time captures and upvalues get the job done. This function uses Cmt to ensure pos is set before sticking it in front of pattern's captures in pattern / prepend.
Cmt = lpeg.Cmt
Cp = lpeg.Cp
function prepend_final_pos(pattern)
    -- Each call to prepend_final_pos creates a fresh `pos`
    -- upvalue, shared only by the two closures below.
    local pos
    -- lpeg.Cmt(patt, func) passes the entire subject being
    -- searched to `func` as the first parameter, then the
    -- current position, then any captures. Ignore the subject.
    local function setpos(_, x)
        pos = x
        -- If we return nothing, Cmt will fail every time
        return true
    end
    -- Keep the varargs safe!
    local function prepend(...)
        return pos, ...
    end
    -- setpos returns only `true`, so the Cmt should add no capture
    -- values of its own; the `/ 0` makes sure of that by discarding any.
    return (pattern / prepend) * (Cmt(Cp(), setpos) / 0)
end
Sample session:
> bar = lpeg.C "bar"
> Pbar = prepend_final_pos(bar)
> print(lpeg.match(Pbar, "foobarzok", 4))
7 bar
> foo = lpeg.C "foo" / "zokzokzok"
> Pfoobar = prepend_final_pos(foo * bar)
> print(lpeg.match(Pfoobar, "foobarzok"))
7 zokzokzok bar
As intended, the actual captures have no influence on the position returned by the new pattern; only the length of the text matched by the original pattern does.
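One point about prepend_final_pos that may matter: LPeg normally evaluates captures such as pattern / prepend only after the whole match has succeeded, whereas Cmt runs during matching. If the returned pattern matches more than once within a single lpeg.match call, every prepend call would therefore likely see only the last value stored in pos. Below is a minimal sketch that sidesteps the shared upvalue. It assumes the documented Cmt behaviour: the function receives the subject, the current position, then the nested captures; its first return value is the new position and the rest become capture values. The name pos_first is hypothetical.

```lua
local lpeg = require "lpeg"

-- Hypothetical helper: yields the end position of `pattern`, then its captures.
local function pos_first(pattern)
    -- The leading Cp() guarantees at least one nested capture, so the
    -- function does not depend on what LPeg passes when `pattern`
    -- itself has no captures. Its value (the start position) is dropped.
    return lpeg.Cmt(lpeg.Cp() * pattern, function(_subject, pos, _start, ...)
        -- First `pos`: the new current position (unchanged).
        -- Second `pos` and the varargs: the values the capture produces.
        return pos, pos, ...
    end)
end

print(lpeg.match(pos_first(lpeg.C "bar"), "foobarzok", 4))  --> 7   bar
```

Because Cmt is evaluated immediately, the captures inside pattern are also evaluated during matching, even if a larger enclosing pattern fails later, so this fits best when those captures have no side effects.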
Upvotes: 1