Reputation: 709
Given s = "AAABBC", can we extract the first run of the same character using pattern matching in Lua? "AAA" is what I am expecting to get.
Here's what I am thinking.
local s = "AAABBC"
print(s:match("([A-Z])%1*"))
But it returns nil.
Please help! Thanks.
Upvotes: 3
Views: 704
Reputation: 626738
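Lua patterns do support backreferences such as %1, but a backreference cannot be quantified: in ([A-Z])%1* the * is treated as a literal asterisk, so the pattern finds nothing in AAABBC. If you need quantified backreferences, you may want to use an external regex library that supports those constructs, like PCRE.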
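A quick way to see how a backreference followed by * behaves is to run the pattern against a string that really contains an asterisk. This is only an illustrative sketch; the test string "AA*" is made up for the demonstration.

```lua
-- "%1" matches the captured letter exactly once; the "*" after it is
-- taken as a literal asterisk, not as a quantifier.
local pattern = "([A-Z])%1*"

print(("AAABBC"):match(pattern))  -- nil: no doubled letter is followed by "*"
print(("AA*"):match(pattern))     -- A: the match is "AA*", the capture is "A"

-- An unquantified backreference works and finds a doubled letter.
print(("AAABBC"):match("([A-Z])%1"))  -- A
```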
Egor Skriptunoff suggests a workaround that uses a null char as a temporary marker inserted into the string between groups of the same letter:
s:gsub("[A-Z]", "\0%0%0"):gsub("(.)%z%1", "%1"):match"%z.([A-Z]+)"
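The one-liner can be unpacked into separate steps so the intermediate strings are easy to inspect. The sketch below is an adaptation, not Egor's exact code: the function name first_run is hypothetical, and it uses the control character "\1" as the marker instead of a null char, because %z is deprecated since Lua 5.2. It assumes the input never contains "\1".

```lua
-- Hypothetical helper: returns the first run of identical uppercase letters.
-- MARK is an assumed marker character that must not occur in the input.
local MARK = "\1"

local function first_run(s)
  -- Step 1: put a marker before every uppercase letter and double the letter.
  -- The parentheses drop gsub's second return value (the substitution count).
  local doubled = (s:gsub("[A-Z]", MARK .. "%0%0"))

  -- Step 2: collapse "X MARK X" into "X", which fuses neighbouring equal letters
  -- while leaving a marker between different letters.
  local merged = (doubled:gsub("(.)" .. MARK .. "%1", "%1"))

  -- Step 3: skip the first marker and the one surplus letter, capture the run.
  return merged:match(MARK .. ".([A-Z]+)")
end

print(first_run("AAABBC"))  -- AAA
```

If the input may contain arbitrary bytes, a plain loop that compares neighbouring characters would avoid the need for a marker altogether.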
For the AAABBC string, Egor's solution does the following:
gsub("[A-Z]", "\0%0%0") - doubles each uppercase letter and inserts a null before each doubled pair (AAABBC => _AA_AA_AA_BB_BB_CC, where _ represents a null char)
gsub("(.)%z%1", "%1") - replaces each char that is followed by a null and then the same char again with a single copy of that char (_AA_AA_AA_BB_BB_CC => _AAAA_BBB_CC)
match"%z.([A-Z]+)" - matches the first occurrence of the null char, then any char, and then captures one or more uppercase letters into Group 1 (the value returned by string.match)
Upvotes: 1