Reputation: 65
I am expecting strings that have month prefixes like JAN, FEB, MAR...
My regex so far: ...(J[AU][NL]|FEB|MA[RY]|APR|AUG|SEP|OCT|NOV|DEC)...
Can anyone make it shorter, or is there a less ugly alternative?
Thanks
Upvotes: 0
Views: 156
Reputation: 13252
The less ugly, and far more efficient, alternative is to use the in operator from expr.
expr {$month in {JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC}}
or
if {$month in {JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC}} {
...
}
This is an order of magnitude faster, clearer to look at, and you don't get any false positives.
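One caveat: in compares the whole value, so when the month is only a prefix of a longer string it has to be cut out first. A minimal sketch, assuming the month always occupies the first three characters (the proc name starts_with_month is made up for illustration):

```tcl
# Hypothetical helper: true if the first three characters of $s
# are an upper-case English month abbreviation.
proc starts_with_month {s} {
    set months {JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC}
    # string range clamps to the string length, so short input is safe
    set prefix [string range $s 0 2]
    expr {$prefix in $months}
}

puts [starts_with_month "JAN2014"]
puts [starts_with_month "XYZ2014"]
```

If the input can be mixed case, wrapping the prefix in string toupper before the test would probably be enough.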
As Donal Fellows notes, if one must use a regexp, it's better to use an explicit one, (JAN|FEB|…|NOV|DEC), since it's clearer. Now, I've never ventured into the regex engine source code to see how it works (nor would I unless one of my kids was lost in there), but I'm pretty sure that the recognition chains that the engine builds for this expression are at least as efficient as whatever clever abbreviation you or I could come up with.
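For comparison, the explicit alternation can be generated from the same month list instead of typed by hand. This is only a sketch: the leading ^ anchor is an assumption about where the month sits in your strings, and the variable names are illustrative.

```tcl
set months {JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC}
# Anchored alternation built from the list: ^(JAN|FEB|MAR|APR|...)
set re "^([join $months |])"

# The compact pattern from the question, anchored the same way
set compact {^(J[AU][NL]|FEB|MA[RY]|APR|AUG|SEP|OCT|NOV|DEC)}

foreach s {JUN15 JAL15} {
    puts "$s explicit=[regexp $re $s] compact=[regexp $compact $s]"
}
```

The second sample shows the cost of the abbreviation: J[AU][NL] accepts JAL as well as JAN, JUN and JUL, while the explicit form rejects it.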
Another thing: is there any chance you might want to internationalize the application? Abbreviated month names are the same in most countries (in the West, at least), but there are some differences. With Tcl it's very easy to get localized lists of abbreviated month names, either by extracting them from clock or by keeping your own lists and using the msgcat package. If you create your regexp like this:
set re ([join [lmap m {0 1 2 3 4 5 6 7 8 9 10 11} {lindex [::msgcat::mc MONTHS_ABBREV] $m}] |])
and later someone wants to change the language of the application, you just re-create it. It's much harder to do this if you want to craft your own regular expressions as in your question above.
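A rough sketch of the "keep your own lists" route, assuming the application registers its own MONTHS_ABBREV entries with msgcat. The two lists and the month_re helper are illustrative only and are not something Tcl ships under that key for application code:

```tcl
package require msgcat

# Assumed application catalog; normally these would live in .msg files.
::msgcat::mcset en MONTHS_ABBREV {JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC}
::msgcat::mcset de MONTHS_ABBREV {JAN FEB MRZ APR MAI JUN JUL AUG SEP OKT NOV DEZ}

# Hypothetical helper: rebuild the alternation for the current locale.
proc month_re {} {
    return "([join [::msgcat::mc MONTHS_ABBREV] |])"
}

::msgcat::mclocale en
set re [month_re]
::msgcat::mclocale de
set re [month_re]   ;# re-created after the language change
```

Because the translated value is already a Tcl list, join can consume it directly; the lmap/lindex form above yields the same pattern.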
Upvotes: 3