user2704576

Reputation: 61

Lua string manipulation pattern matching alternative "|"

Is there a way to write a string pattern like "ab|cd" that matches either "ab" or "cd" in the input string? I know that a pattern like "[ab]" matches either "a" or "b", but that only works for single characters.

Note that my actual problem is a lot more complicated, but essentially I just need to know whether Lua's string patterns have an OR operator. I would actually want to put other patterns on each side of the OR, and so on. But if something like "hello|world" matches both "hello" and "world" in "hello, world!", that would be great!

Upvotes: 7

Views: 2836

Answers (3)

Yu Hao

Reputation: 122493

Combining Lua patterns with logical operators can solve most such problems. For instance, for the regular expression (hello|world)\d+, you can use

string.match(str, "hello%d+") or string.match(str, "world%d+")

The short-circuit evaluation of the or operator means the string is matched against hello%d+ first; only if that fails is it matched against world%d+.
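
As a self-contained sketch of the same idea (the helper name match_hello_or_world and the sample strings are made up for illustration, not part of the original answer):

```lua
-- Hypothetical helper: emulates the regex (hello|world)\d+ with two Lua patterns.
local function match_hello_or_world(str)
    -- `or` short-circuits: the second match runs only if the first returns nil.
    return string.match(str, "hello%d+") or string.match(str, "world%d+")
end

print(match_hello_or_world("say hello42"))        -- hello42
print(match_hello_or_world("big world7 here"))    -- world7
print(match_hello_or_world("world1 then hello2")) -- hello2 (pattern order wins, not position)
print(match_hello_or_world("nothing here"))       -- nil
```

Two things to keep in mind: the result follows the order of the patterns rather than their position in the string, and `or` keeps only the first return value, so this works best when each pattern has at most one capture.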

Upvotes: 5

greatwolf

Reputation: 20878

Just to expand on peterm's suggestion, LPeg also provides a re module that exposes an interface similar to Lua's standard string library while preserving the extra power and flexibility of LPeg.

I would say try out the re module first since its syntax is a bit less esoteric compared to lpeg. Here's an example usage that can match your hello world example:

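local dump = require 'pl.pretty'.dump  -- Penlight, used only to print the result
local re = require 're'                -- the re module that ships with LPeg

local subj = "hello, world! padding world1 !hello hello hellonomatch nohello"
-- tok captures 'hello' or 'world' only as whole words; any other word is skipped
local pat = re.compile [[
  toks  <-  tok (%W+ tok)*
  tok   <-  {'hello' / 'world'} !%w / %w+
]]

local res = { re.match(subj, pat) }
dump(res)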

which would output:

{
  "hello",
  "world",
  "hello",
  "hello"
}

If you're interested in capturing the position of the matches just modify the grammar slightly for positional capture:

tok   <-  {}('hello' / 'world') !%w / %w+

Upvotes: 3

Unfortunately, Lua patterns are not regular expressions and are less powerful. In particular, they don't support alternation (the vertical bar | operator of Java or Perl regular expressions), which is what you want here.
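
To make the limitation concrete, here is a small check using only the standard string library (the sample strings are arbitrary):

```lua
-- In a Lua pattern, "|" has no special meaning and matches a literal bar.
print(string.find("ab", "ab|cd"))           -- nil
print(string.find("xx ab|cd xx", "ab|cd"))  -- 4 8
-- A character class such as [ab] only chooses between single characters.
print(string.match("cab", "[ab]"))          -- a
```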

A simple workaround could be the following:

local function MatchAny( str, pattern_list )
    for _, pattern in ipairs( pattern_list ) do
        local w = string.match( str, pattern )
        if w then return w end
    end
end


s = "hello dolly!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "cruel world!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "hello world!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "got 1000 bucks"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

Output:

hello
world
hello
1000

The function MatchAny matches its first argument (a string) against a list of Lua patterns, in list order, and returns the result of the first successful match.
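
If you would rather get the match that starts earliest in the string, which is closer to how a regex alternation behaves, a variant along these lines might work. This is only a sketch: MatchLeftmost is a hypothetical name, and it returns the whole matched substring, ignoring any captures in the patterns.

```lua
-- Hypothetical variant of MatchAny: picks the match that starts earliest
-- in the string instead of the first pattern in the list that succeeds.
local function MatchLeftmost(str, pattern_list)
    local best_start, best_stop
    for _, pattern in ipairs(pattern_list) do
        local start, stop = string.find(str, pattern)
        -- Strict "<" means the earlier pattern in the list wins a tie.
        if start and (not best_start or start < best_start) then
            best_start, best_stop = start, stop
        end
    end
    if best_start then
        return string.sub(str, best_start, best_stop)
    end
    return nil
end

local patterns = { "world", "hello", "%d+" }
print(MatchLeftmost("hello world!", patterns))  -- hello
print(MatchLeftmost("1000 worlds", patterns))   -- 1000
print(MatchLeftmost("no match", patterns))      -- nil
```

With the same pattern list, MatchAny would return "world" for "hello world!" because "world" comes first in the list.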

Upvotes: 4
