theta

Reputation: 25611

Escaping strings for gsub

I read a file:

local logfile = assert(io.open("log.txt", "r"))
local data = logfile:read("*a")
logfile:close()
print(data)

output:

...
"(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S
...

Yes, the log file looks awful, as it's full of various commands.

How can I call gsub to remove, e.g., the line "(\.)\n(\w)", r"\1 \2" from the data variable?

The snippet below does not work:

s='"(\.)\n(\w)", r"\1 \2"'
data=data:gsub(s, '')

I guess some escaping needs to be done. Any easy solution?


Update:

local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]

local s = [["(\.)\n(\w)", r"\1 \2"]]

local function esc(x)
   return (x:gsub('%%', '%%%%')
            :gsub('^%^', '%%^')
            :gsub('%$$', '%%$')
            :gsub('%(', '%%(')
            :gsub('%)', '%%)')
            :gsub('%.', '%%.')
            :gsub('%[', '%%[')
            :gsub('%]', '%%]')
            :gsub('%*', '%%*')
            :gsub('%+', '%%+')
            :gsub('%-', '%%-')
            :gsub('%?', '%%?'))
end

print(data:gsub(esc(s), ''))

This seems to work fine, except that I also need to escape the escape character % itself, since the match fails if % appears in the string being matched. I tried :gsub('%%', '%%%%') and :gsub('\%', '\%\%'), but neither works.


Update 2:

OK, % can be escaped this way as long as it is handled first in the chain of gsub calls above, which I have just corrected.

:terrible experience:

Update 3:

Escaping of ^ and $

As stated in the Lua manual (5.1, 5.2, 5.3):

A caret ^ at the beginning of a pattern anchors the match at the beginning of the subject string. A $ at the end of a pattern anchors the match at the end of the subject string. At other positions, ^ and $ have no special meaning and represent themselves.

So a better idea would be to escape ^ and $ only when they are found, respectively, at the beginning or the end of the string.
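A small sketch of the anchoring rule quoted above, using made-up subject strings: ^ and $ are special only at the very start or end of a pattern and are ordinary characters anywhere else.

```lua
-- '^' and '$' act as anchors only at the very start or end of a pattern.
local mid_caret  = ("a^b"):find("a^b")     -- '^' mid-pattern is a plain character
local mid_dollar = ("a$b"):find("a$b")     -- the same holds for '$'
local anchored   = ("x^abc"):find("^abc")  -- leading '^' anchors, so "abc" must start the subject
local escaped    = ("x^abc"):find("%^abc") -- an escaped '^' matches the literal caret
print(mid_caret, mid_dollar, anchored, escaped)
```

Only the third call fails to match, which is why a caret needs escaping only when it is the first character of the search string.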

Lua 5.1 - 5.2+ incompatibilities

string.gsub now raises an error if the replacement string contains a % followed by a character other than the permitted % or digit.

There is no need to double every % in the replacement string. See lua-users.
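To illustrate the replacement-string rule, here is a minimal sketch. The helper name escape_replacement and the sample strings are made up for this example. It doubles each % so that gsub treats it as a literal percent sign on 5.1 as well as 5.2 and later.

```lua
-- Hypothetical helper: make a string safe to use as a gsub replacement.
local function escape_replacement(repl)
    -- Parentheses keep only the string and drop gsub's match count.
    return (repl:gsub("%%", "%%%%"))
end

-- "100%" becomes "100%%", which gsub then writes out as a literal "100%".
local out = ("rate: N"):gsub("N", escape_replacement("100%"))
print(out)

-- Passing "100%" unescaped is version dependent (an error on 5.2+),
-- so the call is wrapped in pcall here and its result is not relied upon.
local ok = pcall(string.gsub, "rate: N", "N", "100%")
print(ok)
```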

Upvotes: 14

Views: 32866

Answers (5)

jcarballo

Reputation: 29113

Use stringx.replace() from Penlight Lua Libraries instead.

Reference: https://stevedonovan.github.io/Penlight/api/libraries/pl.stringx.html#replace

Implementation (v1.12.0): https://github.com/lunarmodules/Penlight/blob/1.12.0/lua/pl/stringx.lua#L288

Based on their implementation:

function escape(s)
    return (s:gsub('[%-%.%+%[%]%(%)%$%^%%%?%*]','%%%1'))
end

function replace(s,old,new,n)
    -- Escape the search text for use as a pattern; double any % in the replacement.
    return (s:gsub(escape(old),(new:gsub('%%','%%%%')),n))
end

Upvotes: 1

FSMaxB

Reputation: 2490

According to Programming in Lua:

The character `%´ works as an escape for those magic characters. So, '%.' matches a dot; '%%' matches the character `%´ itself. You can use the escape `%´ not only for the magic characters, but also for all other non-alphanumeric characters. When in doubt, play safe and put an escape.

Doesn't this mean that you can simply put % in front of every non-alphanumeric character and be fine? This would also be future-proof (in case new special characters are introduced). Like this:

function escape_pattern(text)
    -- Parentheses drop gsub's second return value (the match count).
    return (text:gsub("([^%w])", "%%%1"))
end

It worked for me on Lua 5.3.2 (only rudimentary testing was performed). Not sure if it will work with older versions.
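As a quick check, the sketch below applies this approach to the sample data from the question. The variable names result and count are chosen for this example only, and the function is repeated so the snippet runs on its own.

```lua
-- Escape every non-alphanumeric character so the text matches literally.
local function escape_pattern(text)
    return (text:gsub("([^%w])", "%%%1"))
end

-- Sample data from the question; long brackets keep the backslashes literal.
local data = [["(\.)\n(\w)", r"\1 \2"
"\n[^\t]", "", x, re.S]]
local s = [["(\.)\n(\w)", r"\1 \2"]]

-- Remove the first line's text; the line break after it is left in place.
local result, count = data:gsub(escape_pattern(s), "")
print(count)
print(result)
```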

Upvotes: 24

Why not:

local quotepattern = '(['..("%^$().[]*+-?"):gsub("(.)", "%%%1")..'])'
string.quote = function(str)
    -- Parentheses keep only the escaped string, not gsub's match count.
    return (str:gsub(quotepattern, "%%%1"))
end

to escape and then gsub it away?

Upvotes: 7

Mike Corcoran

Reputation: 14565

Try:

line = '"(\\.)\\n(\\w)", r"\\1 \\2"'
rx =  '"%(\\%.%)\\n%(\\w%)", r"\\1 \\2"'
print(string.gsub(line, rx, ""))

Escape the pattern's special characters with %, and write each literal backslash as \\ inside a quoted string.

Upvotes: 3

lhf

Reputation: 72312

Try s=[["(\.)\n(\w)", r"\1 \2"]].
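A short sketch of what the long-bracket form changes, using made-up strings: inside [[...]] a backslash is an ordinary character, so \n stays as two characters instead of becoming a line break. This only fixes the string literal. Pattern characters such as ( and . presumably still need % escaping before the string is passed to gsub.

```lua
-- In a quoted string, \n is an escape sequence for one newline character.
local quoted = "a\nb"
-- In a long-bracket string, \n is a backslash followed by the letter n.
local long = [[a\nb]]

print(#quoted, #long)
-- Doubling the backslash in a quoted string yields the same text.
print(long == "a\\nb")
```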

Upvotes: 2
