Reputation: 133
I'm trying to make sure the input to my shell script follows the format Name_Major_Minor.extension, where:
Name is any number of letters, digits, or "-" followed by "_"
Major is any number of digits followed by "_"
Minor is any number of digits followed by "."
Extension is any number of characters followed by the end of the file name.
I'm fairly certain my regular expression is just slightly messed up. Any file I currently run through it evaluates to "yes", but if I use "[A-Z]$" instead of "*$" it always evaluates to "no". Regular expressions confuse the hell out of me, as you can probably tell.
if echo $1 | egrep -q [A-Z0-9-]+_[0-9]+_[0-9]+\.*$
then
echo "yes"
else
echo "nope"
exit
fi
Edit: I realized I was missing the pattern for "minor". It still doesn't work after adding it, though.
Upvotes: 2
Views: 1630
Reputation: 21492
Bash supports regular expression matching through its =~ operator, so there is no need for egrep in this particular case:
if [[ "$1" =~ ^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$ ]]
The \.*$ sequence in your regular expression means "zero or more dots, then the end of the string". You probably meant "a dot and some characters after it", i.e. \..*$.
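To make the difference concrete, here is a small sketch comparing the two endings on a name that has no extension at all. The variable names and the sample file name are made up for illustration; grep -E is used as the standard spelling of egrep.

```shell
#!/bin/sh
# Both patterns are single-quoted so grep receives the backslash unchanged.
loose='_[0-9]+\.*$'     # zero or more dots at the end
strict='_[0-9]+\..*$'   # exactly one dot, then any characters

name='Tool_3_14'        # hypothetical input with no dot and no extension

# The loose ending is satisfied by zero dots, so this name is accepted.
printf '%s\n' "$name" | grep -Eq "$loose"  && echo "loose matches"
# The strict ending requires a literal dot, so this name is rejected.
printf '%s\n' "$name" | grep -Eq "$strict" || echo "strict rejects"
```

Both messages are printed, which shows that \.* never forces a dot to be present.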
Your regular expression is anchored only at the end of the string ($), so it can succeed by matching just a trailing part of the input. To match the entire string, also add the ^ anchor, which matches the beginning of the line.
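As a rough sketch of how the anchored pattern behaves, the check can be wrapped in a small function. The function name is_valid_name and the sample file names are hypothetical; storing the pattern in a variable is just one common way to avoid quoting surprises inside [[ ]].

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the argument looks like Name_Major_Minor.extension.
is_valid_name() {
    # The pattern lives in a variable and is expanded unquoted so Bash treats it as a regex.
    local pattern='^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'
    [[ "$1" =~ $pattern ]]
}

for name in 'my-app_1_2.tar.gz' 'my-app_1.txt' '!!bad_1_2.txt'; do
    if is_valid_name "$name"; then
        echo "yes: $name"
    else
        echo "nope: $name"
    fi
done
```

Only the first name is accepted. The last one is rejected because of the ^ anchor: without it, the trailing bad_1_2.txt part would be enough for a match.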
If you still want to use egrep, you should quote its pattern, as you should any command-line argument that contains special characters, so that the shell does not reinterpret it. Wrap the argument in single or double quotes, e.g.:
if echo "$1" | egrep -q '^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'
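Also, avoid echo here: its behavior is unreliable, because depending on the shell it may interpret arguments such as -n or backslash sequences instead of printing them. Use printf instead: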
printf '%s\n' "$1"
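Putting the two suggestions together, a minimal sketch of the full check might look like this. The function name check_name and the sample inputs are assumptions for illustration, and grep -E stands in for egrep.

```shell
#!/bin/sh
# Hypothetical helper combining printf with a quoted, anchored pattern.
check_name() {
    # printf prints the argument verbatim; the single quotes keep the shell away from the regex.
    if printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]+_[0-9]+_[0-9]+\..*$'; then
        echo "yes"
    else
        echo "nope"
    fi
}

check_name 'report_2_10.pdf'   # has Name, Major, Minor, and an extension
check_name 'report_2.pdf'      # Minor is missing
```

The first call prints yes and the second prints nope.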
Upvotes: 4
Reputation: 7081
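Try this regex instead: ^[A-Za-z0-9-]+(?:_[0-9]+){2}\..+$. Note that the (?:...) non-capturing group is Perl-style syntax, which POSIX extended regular expressions (egrep, Bash's =~) do not define; with those tools, write a plain group (...) instead.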
[A-Za-z0-9-]+ matches Name
_[0-9]+ matches _ followed by one or more digits
(?:...){2} matches the group two times: _Major_Minor
\..+ matches a period followed by one or more characters
The problem in your regex seems to be at the end with \.*, which matches a period \. any number of times, including zero. Also, [A-Z0-9-] does not match lowercase letters, which might not be what you wanted.
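For use with grep -E, here is a hedged sketch of the same idea with the repeated group written as a plain ERE group, since (?:...) is Perl-style syntax that POSIX extended regular expressions do not define. The function name matches_grouped and the sample names are hypothetical.

```shell
#!/bin/sh
# Sketch: the repeated _digits group written as (…){2}, which ERE supports.
matches_grouped() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]+(_[0-9]+){2}\..+$'
}

for name in 'Tool_3_14.zip' 'Tool_3_14.' 'tool_3.zip'; do
    if matches_grouped "$name"; then
        echo "yes: $name"
    else
        echo "nope: $name"
    fi
done
```

Only Tool_3_14.zip is accepted: the second name has nothing after the dot, which \..+ requires, and the third has only one _digits group.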
Upvotes: 1