Reputation: 5107
I need to come up with a pattern to match YYYY-MM-DDTHH:MM:SS.s+Z
with the milliseconds part being optional. The regex is simple and looks like this:
^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$
It matches these strings:
"2022-04-02T11:24:59Z"
"2022-04-02T11:24:59.123Z"
In Lua, this isn't as straightforward as I thought. I've tried a couple of patterns but ultimately only got this one to work:
local pat3 = "^%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d[%.%d+]*Z$"
local dt1 = "2022-04-02T11:24:59Z"
local dt2 = "2022-04-02T11:24:59.123Z"
local dt_invalid = "2022-04-02T11:24:59.123.000.000Z"
print(dt1:match(pat3))
print(dt2:match(pat3))
print(dt_invalid:match(pat3))
That pattern meets most of my needs, but it's bothering me that strings like dt_invalid match too. I've also tried the following patterns with no success:
local pat1 = "^%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d[%.%d+]?Z$"
local pat2 = "^%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d(%.%d+)?Z$"
Lua has simplified pattern-matching functionality, but these patterns look more like the regex pattern. I'm not knowledgeable enough in Lua to know the difference or what I'm missing. Why do pat1 and pat2 not work? Is there a better pattern than pat3?
Upvotes: 1
Views: 172
Reputation: 2813
I strongly suggest opening a standalone Lua interpreter and experimenting yourself.
A very good tool for me is string.gsub(), and every string has all the string functions attached as methods.
That makes things much easier...
> _VERSION
Lua 5.4
> ("2022-04-02T11:24:59.123Z"):gsub('^%d+%-%d+%-%d+%u%d+%:%d+%:%d+%.%d+%u$', 'MATCH ALL')
MATCH ALL 1
> ("2022-04-02T11:24:59.123Z"):gsub('^%d+%-%d+%-%d+%u%d+%:%d+%:%d+%.%d+%u$', 'Replaced with MATCH: %1')
Replaced with MATCH: 2022-04-02T11:24:59.123Z 1
> -- Let's replace "T" with a space
> ("2022-04-02T11:24:59.123Z"):gsub('T', ' ')
2022-04-02 11:24:59.123Z 1
> -- Cut off the last part
> ("2022-04-02T11:24:59.123Z"):gsub('%.%d+%u$', '')
2022-04-02T11:24:59 1
> -- Finally
> do local date, count = ("2022-04-02T11:24:59.123Z"):gsub('T', ' '):gsub('%.%d+%u$', '') print(date) end
2022-04-02 11:24:59
> -- Let's do a gsub() chain for all three cases
> do local date, count = ("2022-04-02T11:24:59.123Z 2022-04-02T11:24:59Z 2022-04-02T11:24:59.123.000.000Z"):gsub('T', ' '):gsub('%.%d+',''):gsub('%u', '') print(date) end
2022-04-02 11:24:59 2022-04-02 11:24:59 2022-04-02 11:24:59
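The chain above could be wrapped in a small helper for reuse. This is only a sketch: the function name to_plain_datetime is hypothetical, and the helper rewrites the string without validating it, so malformed input passes through in altered form.

```lua
-- Hypothetical helper wrapping the gsub chain from the session above.
local function to_plain_datetime(s)
  -- gsub returns (string, count); each chained method call operates
  -- on the string only, and the final count is dropped on assignment.
  local out = s:gsub("T", " "):gsub("%.%d+", ""):gsub("%u", "")
  return out
end

print(to_plain_datetime("2022-04-02T11:24:59.123Z"))
print(to_plain_datetime("2022-04-02T11:24:59Z"))
```

Both calls should print the same date and time with the fractional part and the trailing Z removed.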
Upvotes: 1
Reputation: 15492
The problem here is that Lua quantifiers (*, +, - and ?) apply only to a single character class: a literal character, a class such as %d, or a set enclosed in brackets. They cannot be applied to a group in parentheses.
In your pat1 case, the + sits inside the brackets, so it is treated as a literal character of the set instead of a quantifier; [%.%d+]? therefore matches at most one character (a dot, a digit, or a plus sign). In your pat2 case, the ? follows a capture rather than a character class, so it is not treated as a quantifier at all and matches a literal question mark.
Moreover, in Lua you can't nest sets, so you can't specify a pattern like [%.[%d]+]?: the set ends at the first ], so the + applies to the set [%.[%d] and the trailing ]? is read as an optional literal bracket.
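How pat1 and pat2 behave can be checked directly in the interpreter. A minimal sketch reusing the strings from the question; the variable prefix and the last test string (containing a literal ?) are introduced here purely for illustration.

```lua
-- Shared date/time part of the patterns from the question.
local prefix = "^%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d"
local pat1 = prefix .. "[%.%d+]?Z$"   -- + is a literal member of the set
local pat2 = prefix .. "(%.%d+)?Z$"   -- ? after a capture is a literal ?

local dt1 = "2022-04-02T11:24:59Z"
local dt2 = "2022-04-02T11:24:59.123Z"

print(dt1:match(pat1))  -- matches: the optional set consumes nothing
print(dt2:match(pat1))  -- nil: the set can consume at most one character
print(dt1:match(pat2))  -- nil: the capture requires a dot
print(dt2:match(pat2))  -- nil: a literal ? is expected before Z

-- Illustrative string only: pat2 matches once a literal ? is present.
print(("2022-04-02T11:24:59.123?Z"):match(pat2))
```

The last call returns the captured fraction, which suggests that the ? in pat2 is being matched as an ordinary character.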
My solution would be a workaround that is less restrictive (it may match other strings) but still catches the parts of the time you need:
%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d[%.]?[%d]*Z
Vulnerabilities (lines that shouldn't match but do):
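A few such cases can be checked directly. The sample strings below are illustrative assumptions, not the original list from this answer; they exercise the optional dot, the optional digits, and the lack of ^ and $ anchors.

```lua
-- Pattern from the answer above (unanchored).
local pat = "%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d[%.]?[%d]*Z"

-- Hypothetical inputs that arguably should be rejected.
local samples = {
  "2022-04-02T11:24:59.Z",     -- dot without any digits
  "2022-04-02T11:24:59123Z",   -- digits without a dot
  "xx2022-04-02T11:24:59Zyy",  -- surrounding text, pattern is unanchored
}

for _, s in ipairs(samples) do
  print(s, s:match(pat) ~= nil)
end

-- The dt_invalid string from the question is rejected by this pattern.
print(("2022-04-02T11:24:59.123.000.000Z"):match(pat))
```

Adding ^ and $ would rule out the third sample, but the first two would still match.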
Does this cover your case across the whole set of strings you have?
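If stricter matching is needed, one alternative (not the pattern proposed above) is to test two anchored patterns, one with the fractional part and one without, since Lua cannot make a multi-character group optional. A minimal sketch; the names base, without_fraction, with_fraction and is_timestamp are hypothetical.

```lua
-- Date/time part shared by both variants.
local base = "^%d%d%d%d%-%d%d%-%d%dT%d%d:%d%d:%d%d"
local without_fraction = base .. "Z$"        -- e.g. ...:59Z
local with_fraction    = base .. "%.%d+Z$"   -- e.g. ...:59.123Z

-- Returns true when s has either accepted shape.
local function is_timestamp(s)
  return s:match(without_fraction) ~= nil
      or s:match(with_fraction) ~= nil
end

print(is_timestamp("2022-04-02T11:24:59Z"))
print(is_timestamp("2022-04-02T11:24:59.123Z"))
print(is_timestamp("2022-04-02T11:24:59.123.000.000Z"))
```

This checks only the shape of the string; it does not verify that the month, day, or time values are in range.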
Upvotes: 0