Reputation: 99
How do I remove lines from a string that begin with another string in Lua? For instance, I want to remove all lines from the string result
that begin with the word <Table
. This is the code I've written so far:
for line in result:gmatch"<Table [^\n]*" do line = "" end
Upvotes: 2
Views: 7739
Reputation: 28874
result = result:gsub('%f[^\n%z]<Table [^\n]*', '')
The start of this pattern, '%f[^\n%z]'
, is a frontier pattern. It matches any transition from a newline or zero character to any other character, and for frontier patterns the position before the first character counts as a zero character. In other words, that prefix lets the rest of the pattern match at the first line or at any other start of line.
Reference: the Lua 5.3 manual, section 6.4.1 on string patterns
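As a quick check of the frontier prefix, here is a minimal sketch that runs the pattern over a made-up sample string (the sample text and variable names are assumptions for illustration). It also tries a variant with an optional trailing newline, which is not part of the pattern above; without it, each removed line leaves an empty line behind.

```lua
-- Hypothetical sample: "<Table" starts lines 1 and 3 and appears mid-line in line 2.
local sample = "<Table a>\nkeep <Table b>\n<Table c>\nlast"

-- The pattern from this answer: matches only at the start of a line.
-- gsub returns two values; assigning to one variable keeps only the new string.
local blanked = sample:gsub('%f[^\n%z]<Table [^\n]*', '')

-- Variant (an assumption, not from the answer): also consume the newline.
local removed = sample:gsub('%f[^\n%z]<Table [^\n]*\n?', '')

print(blanked)
print(removed)
```

In both cases the mid-line <Table b> survives, because the character before it is a space rather than a newline.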
Upvotes: 0
Reputation: 43326
The other answers provide good solutions to actually stripping lines from a string, but don't address why your code is failing to do that.
Reformatting for clarity, you wrote:
for line in result:gmatch"<Table [^\n]*" do
    line = ""
end
The first part is a reasonable way to iterate over result
and extract all spans of text that begin with <Table
and continue up to but not including the next newline character. The iterator returned by gmatch
returns a copy of the matching text on each call, and the local variable line
holds that copy for the body of the for
loop.
Since the matching text is copied to line
, changes made to line
do not and cannot modify the actual text stored in result
.
This is due to a more fundamental property of Lua strings. All strings in Lua are immutable. Once created, they cannot be changed. Variables holding strings actually hold references to immutable string values managed by the garbage collector (Lua does not use reference counting), which permits only two operations: creation of a new string, and collection of a string with no remaining references.
So any approach to editing the content of the string stored in result
is going to require the creation of an entirely new string. Where string.gmatch
provides an iteration over the content but cannot allow it to be changed, string.gsub
provides for creation of a new string where all text matching a pattern has been replaced by something new. But even string.gsub
is not changing the immutable source text; it is creating a new immutable string that is a copy of the old with substitutions made.
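To make the distinction concrete, here is a small sketch under assumed sample data (the string contents and the names alias, matches, and count are made up for illustration). It reads the matches with gmatch, then rebinds result to the new string returned by gsub, while a second variable still refers to the untouched original.

```lua
local result = "<Table x>\nkeep me"
local alias = result   -- a second reference to the same string value

-- gmatch only reads: collect the matches without changing result.
local matches = {}
for line in result:gmatch("<Table [^\n]*") do
    matches[#matches + 1] = line
end

-- gsub builds a new string and also returns the number of substitutions.
-- Rebinding result to the new string leaves the original value intact.
local count
result, count = result:gsub("<Table [^\n]*", "")

print(#matches, count)
print(alias)    -- still the original text
print(result)   -- the new text, with the match replaced by ""
```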
Using gsub
could be as simple as this:
result = result:gsub("<Table [^\n]*", "")
but that will disclose other defects in the pattern itself. First, and most obviously, nothing requires that the pattern match at only the beginning of the line. Second, the pattern does not include the newline, so it will leave the line present but empty.
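Both defects show up on a small made-up input (the sample string is an assumption for illustration):

```lua
-- Line 1 contains "<Table" mid-line; line 2 begins with it.
local sample = "keep <Table inline>\n<Table row>\nend"

local out = sample:gsub("<Table [^\n]*", "")

-- Defect 1: the tail of line 1 was removed even though "<Table" was not
--           at the start of the line.
-- Defect 2: line 2 is now empty, but its newline is still there.
print(out == "keep \n\nend")
```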
All of that can be refined by careful and clever use of the pattern library. But it doesn't change the fact that you are starting with XML text and are not handling it with XML aware tools. In that case, any approach based on pattern matching or even regular expressions is likely to end in tears.
Upvotes: 1
Reputation: 526
The LPEG library is perfect for this kind of task. Just write a function to create custom line strippers:
local mk_striplines
do
  local lpeg = require "lpeg"
  local P = lpeg.P
  local Cs = lpeg.Cs
  local lpegmatch = lpeg.match
  -- end of line: any of the common newline conventions
  local eol = P"\n\r" + P"\r\n" + P"\n" + P"\r"
  local eof = P(-1)
  -- rest of a line including its terminator, or an empty line
  local linerest = (1 - eol)^1 * (eol + eof) + eol
  mk_striplines = function (pat)
    pat = P (pat)
    local matchline = pat * linerest
    -- replace matching lines with "", copy all other lines unchanged
    local striplines = Cs (((matchline / "") + linerest)^1)
    return function (str)
      return lpegmatch (striplines, str)
    end
  end
end
Note that the argument to mk_striplines()
may be a string or a
pattern.
Thus the result is very flexible:
mk_striplines (P"<Table" + P"</Table>")
would create a stripper
that drops lines with two different patterns.
mk_striplines (P"x" * P"y"^0)
drops each line starting with an
x
followed by any number of y
’s -- you get the idea.
Usage example:
local linestripper = mk_striplines "foo"
local test = [[
foo lorem ipsum
bar baz
buzz
foo bar
xyzzy
]]
print (linestripper (test))
Upvotes: 1
Reputation: 122383
string.gmatch
is used to get all occurrences of a pattern. To replace the text that matches a pattern, you need to use string.gsub
.
Another problem is that your pattern <Table [^\n]*
will match in any line containing the word <Table
, not just in lines that begin with it.
Lua patterns don't support a beginning-of-line anchor (^ anchors only at the start of the whole string), so this almost works:
local str = result:gsub("\n<Table [^\n]*", "")
except that it will miss a match on the first line. My solution is to use a second pass to handle the first line:
local str1 = result:gsub("\n<Table [^\n]*", "")
local str2 = str1:gsub("^<Table [^\n]*\n", "")
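The two passes can be wrapped in a helper. This is only a sketch: the function name strip_table_lines and the sample strings are made up for illustration. Note that the second pass requires a newline after the first line, so a string consisting of a single <Table line with no trailing newline is left as is.

```lua
-- Hypothetical helper wrapping the two gsub passes shown above.
local function strip_table_lines(text)
    -- Pass 1: remove matching lines that follow a newline
    -- (together with the newline that precedes them).
    local rest = text:gsub("\n<Table [^\n]*", "")
    -- Pass 2: remove a matching first line and its newline.
    rest = rest:gsub("^<Table [^\n]*\n", "")
    return rest
end

print(strip_table_lines("<Table a>\nfoo\n<Table b>\nbar"))
print(strip_table_lines("foo <Table x>\nbar"))
```

The second call leaves its input unchanged, since <Table appears only in the middle of a line.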
Upvotes: 1