Reputation: 607
I need a regular expression to match and extract groups for a file name that will have the following format:
<artifactName>-<version>-<classifier>.<extension>
Where:
<artifactName> can have dashes in it
<version> must be of the format X, X.Y, X.X.Y, or X.X.X.Y, where X is any number of digits and Y is an alphanumeric string that can contain underscores
<classifier> must be one of the following formats:
  <datestring>b<buildNumber>_<branch>
  <branch>
<datestring> is a 14-digit number, <buildNumber> is any number of digits, and <branch> is any alphanumeric string that can contain dashes or periods
<extension> can be any alphanumeric string that can contain underscores
So far I have this regular expression, which works in online regex testers, but it fails when tested in a bash script:
^(.+)-((?:[[:digit:]]+\.){0,3}(?:[[:digit:]]+))-((?:([0-9]{14})b([[:digit:]]+)_([^\.]*))|(?:[^\.]*))\.(.+)$
The script I am using looks like this:
FILE_NAME='some-artifact-1.2.3.4-20180911123456b123_branch.ex.ten.sion'
REGEX='^(.+)-((?:[[:digit:]]+\.){0,3}(?:[[:digit:]]+))-((?:([0-9]{14})b([[:digit:]]+)_([^\.]*))|(?:[^\.]*))\.(.+)$'
if [[ "${FILE_NAME}" =~ ${REGEX} ]]
then
echo "Artifact = ${BASH_REMATCH[1]}"
echo "Version = ${BASH_REMATCH[2]}"
echo "Classifier = ${BASH_REMATCH[3]}"
echo "Build Date = ${BASH_REMATCH[4]}"
echo "Build Number = ${BASH_REMATCH[5]}"
echo "Branch = ${BASH_REMATCH[6]}"
echo "Extension = ${BASH_REMATCH[7]}"
fi
I assume the regex engine that bash uses requires slightly different syntax, but I cannot figure out how to convert the regular expression that works in the online testers into one that works in bash.
Upvotes: 1
Views: 2447
Reputation: 246827
You can do this with shell parameter expansion instead of a regex. It's a bit verbose, but reliable.
FILE_NAME='some-artifact-1.2.3.4-20180911123456b123_branch.ex.ten.sion'
art_ver=${FILE_NAME%-*}
artifact=${art_ver%-*}
version=${art_ver##*-}
class_ext=${FILE_NAME##*-}
classification=${class_ext%%.*}
extension=${class_ext#*.}
printf "%s\n" "$artifact" "$version" "$classification" "$extension"
Output:
some-artifact
1.2.3.4
20180911123456b123_branch
ex.ten.sion
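The four operators used above differ only in which end of the string they trim and whether they remove the shortest or the longest match. A minimal sketch of the difference follows; the sample value and the variable names are illustrative only and do not come from the question.

```bash
#!/usr/bin/env bash
# Illustrative value only; not taken from the question.
s='a-b-c.x.y'

shortest_suffix=${s%-*}    # remove shortest suffix matching '-*'  (from the last dash)
longest_suffix=${s%%-*}    # remove longest suffix matching '-*'   (from the first dash)
shortest_prefix=${s#*.}    # remove shortest prefix matching '*.'  (up to the first dot)
longest_prefix=${s##*.}    # remove longest prefix matching '*.'   (up to the last dot)

printf '%s\n' "$shortest_suffix" "$longest_suffix" "$shortest_prefix" "$longest_prefix"
```

This is why the snippet above uses % and ## around the dashes (split at the last dash) but %% and # around the dots (split at the first dot).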
I just read your requirements more carefully: if both the branch and the extension can contain dots, the split is ambiguous, and it is impossible to determine where the branch stops and the extension begins. The snippet above simply splits at the first dot.
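As for why the original pattern fails: bash's =~ operator uses POSIX extended regular expressions, which have no (?:...) non-capturing groups. Inside a bracket expression a backslash is also literal, so [^\.] excludes both backslash and dot. If you prefer a regex, here is a rough sketch using only plain groups. The function name parse_artifact and the two-step split are my own assumptions, not something from your script, and the classifier is taken to end at the first dot, as in the original pattern.

```bash
#!/usr/bin/env bash
# Sketch only: parse_artifact is a hypothetical helper, not part of the question.

parse_artifact() {
    local name=$1
    # POSIX ERE: every (...) captures, so group numbers shift.
    # 1=artifact 2=version 3=(repeated "digits." part, ignored) 4=classifier 5=extension
    local outer='^(.+)-(([0-9]+\.){0,3}[0-9]+)-([^.]*)\.(.+)$'
    # Dated classifier form: <datestring>b<buildNumber>_<branch>
    local dated='^([0-9]{14})b([0-9]+)_(.*)$'

    # The pattern variable must stay unquoted to be treated as a regex.
    [[ $name =~ $outer ]] || return 1
    local artifact=${BASH_REMATCH[1]} version=${BASH_REMATCH[2]}
    local classifier=${BASH_REMATCH[4]} extension=${BASH_REMATCH[5]}

    # Default: the whole classifier is the branch; refine if it is the dated form.
    local build_date='' build_number='' branch=$classifier
    if [[ $classifier =~ $dated ]]; then
        build_date=${BASH_REMATCH[1]}
        build_number=${BASH_REMATCH[2]}
        branch=${BASH_REMATCH[3]}
    fi

    printf '%s\n' \
        "Artifact = $artifact" \
        "Version = $version" \
        "Classifier = $classifier" \
        "Build Date = $build_date" \
        "Build Number = $build_number" \
        "Branch = $branch" \
        "Extension = $extension"
}

parse_artifact 'some-artifact-1.2.3.4-20180911123456b123_branch.ex.ten.sion'
```

Matching the classifier in a second step keeps the group numbering simple and avoids relying on how the engine resolves the two alternatives. Like the original pattern, this sketch only accepts digits in the last version component, and a branch that contains a dot will still be cut at that dot.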
Upvotes: 1