Harry

Reputation: 709

Lua pattern matching - Series of same character

Given s = "AAABBC", can we extract the first run of the same character using pattern matching in Lua? "AAA" is what I am expecting to get.

Here's what I am thinking.

local s = "AAABBC"
print(s:match("([A-Z])%1*"))

But it returns nil.

Please help! Thanks.

Upvotes: 3

Views: 704

Answers (1)

Wiktor Stribiżew

Reputation: 626738

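Lua patterns do support backreferences such as %1, but a quantifier cannot be applied to one: in ([A-Z])%1*, the * after %1 is treated as a literal asterisk, so the match fails and nil is returned. If you need quantified backreferences, you may want to use an external regex library that supports those constructs, like PCRE.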
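How a backreference followed by * behaves can be checked directly. This is a minimal sketch, assuming the standard string library of Lua 5.1 or later; the variable names are only for illustration.

```lua
-- A backreference without a quantifier matches a repeat of the capture.
local plain = ("AAB"):match("([A-Z])%1")
print(plain)                     -- A (the capture; the whole match was "AA")

-- With "*" after %1, the asterisk is matched as a literal character,
-- so the question's pattern finds nothing in "AAABBC".
local quantified = ("AAABBC"):match("([A-Z])%1*")
print(quantified)                -- nil

-- The same pattern does match when the subject contains a real "*".
local from, to, cap = ("AA*"):find("([A-Z])%1*")
print(from, to, cap)             -- 1   3   A
```

So the pattern from the question compiles and runs without an error; it just does not mean "a letter followed by zero or more copies of itself".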

Egor Skriptunoff suggests a workaround that uses a null char as a temporary marker inside the string between groups of the same letter:

s:gsub("[A-Z]", "\0%0%0"):gsub("(.)%z%1", "%1"):match"%z.([A-Z]+)"

For the string AAABBC, Egor's solution does the following:

  • gsub("[A-Z]", "\0%0%0") doubles each uppercase letter and inserts a null before each pair (AAABBC => _AA_AA_AA_BB_BB_CC, where _ represents a null char)
  • gsub("(.)%z%1", "%1") replaces each char that is followed by a null and then the same char again with a single copy of that char, which removes the nulls inside a run (_AA_AA_AA_BB_BB_CC => _AAAA_BBB_CC)
  • match"%z.([A-Z]+)" matches the first null char, then any one char, and then captures 1+ uppercase letters into Group 1, which is the value string.match returns. Skipping one char after the null compensates for the extra letter the previous step leaves in each run, so the result is AAA.
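The three steps can be run one at a time to see the intermediate strings. This is a sketch, not part of the original answer: the helper show and the names step1, step2 and run are made up for illustration, and it assumes the input contains no null characters of its own. Note that %z is the Lua 5.1 class for the null char; it is deprecated from Lua 5.2 on, where "\0" can be written directly in a pattern.

```lua
local s = "AAABBC"

-- Hypothetical helper: show null characters as "_" so the strings print cleanly.
local function show(str)
  return (str:gsub("%z", "_"))   -- parentheses drop gsub's second result
end

-- Step 1: double each uppercase letter and put a null in front of each pair.
local step1 = s:gsub("[A-Z]", "\0%0%0")
print(show(step1))               -- _AA_AA_AA_BB_BB_CC

-- Step 2: collapse "X<null>X" into "X", removing the nulls inside a run.
local step2 = step1:gsub("(.)%z%1", "%1")
print(show(step2))               -- _AAAA_BBB_CC

-- Step 3: skip the first null and one letter, then capture the rest of the run.
local run = step2:match("%z.([A-Z]+)")
print(run)                       -- AAA
```

Assigning each gsub call to a single local keeps only the resulting string and discards the substitution count that gsub also returns.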

Upvotes: 1
