Reputation: 9357
I have 4 file extensions, the result of earlier work, stored in the SEARCH array as follows:
declare -a SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")
I want to produce one file list for each of the 4 extension patterns above, as shown below. Files with 2 dots and 2 extensions (marked "NO") must not appear in the list for a single extension:
################################################################################
1 - SEARCH FOR toggled in /media
regex : ([^\/]+)(\.)(toggled)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(toggled)$
################################################################################
/media/myfile_1.jtr.toggled --> NO
/media/myfile_1.toggled
/media/myfile_2.jtr.toggled --> NO
/media/myfile_2.toggled
/media/myfile_3.jtr.toggled --> NO
/media/myfile_3.toggled
################################################################################
2 - SEARCH FOR jtr in /media
regex : ([^\/]+)(\.)(jtr)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(jtr)$
################################################################################
/media/myfile_1.jtr
/media/myfile_2.jtr
/media/myfile_3.jtr
################################################################################
3 - SEARCH FOR jtr.toggled in /media
regex : ([^\/]+)(\.)(jtr.toggled)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(jtr.toggled)$
################################################################################
/media/myfile_1.jtr.toggled
/media/myfile_2.jtr.toggled
/media/myfile_3.jtr.toggled
################################################################################
4 - SEARCH FOR cupp in /media
regex : ([^\/]+)(\.)(cupp)$
command : find /media -type f | grep --color -P ([^\/]+)(\.)(cupp)$
################################################################################
/media/myfile_1.cupp
/media/myfile_2.cupp
/media/myfile_3.cupp
Obviously I spent hours on regex101 without success. I also tried other methods to reach my goal, but they do not fit with the rest of the code.
Here is a code extract:
for ext in "${SEARCH[@]}"
do
COUNTi=$((COUNTi+1))
REGEX="([^\/]+)(\.)("$ext")$" #
# Ideally, the Regex should come from a pattern array
printf '%*s' "$len" | tr ' ' "$mychar"
echo -e "\n$COUNTi - SEARCH FOR $ext in $BASEDIR"
echo "regex : $REGEX"
echo "command : find $BASEDIR -type f | grep --color -P $REGEX"
printf '%*s' "$len" | tr ' ' "$mychar" && echo
find $BASEDIR -type f | grep --color -P $REGEX
# the Regex caveats as the double dot extensions are not parsed correctly.
echo -e "\n"
done
So I have 2 questions about the same piece of code:
REGEX: what would be a correct regex to list the files by extension family (please see the 4 SEARCH patterns and the related listings)?
ARRAYS: once the point above is solved, how can I use a pattern array, whose entries contain the $ext placeholder, in the looped REGEX?
PATTERN+=( "([^\/]+)(\.)($ext)$" )
# All of these below : CAVEATS escaping $ or not...
# REGEX=${PATTERN[5]}
# REGEX=$(eval "${PATTERN[5]}" )
# echo "pattern : ${PATTERN[5]}"
# eval "$REGEX=\$REGEX"
# eval "$REGEX=\"\$REGEX\""
# REGEX=$(echo "${REGEX}")
# REGEX=${!PATTERN[5]}
Notes:
I read regex documentation for hours and tried hundreds of patterns without success, as I can't understand the rationale behind them.
I also tried other ways, for example find / -name "sayONEnameinmysearchpattern" ! -iname "theothernamesfromtehsearchpattern". This is not what I'm looking for.
Thx
Upvotes: 0
Views: 93
Reputation: 66
Change the REGEX line in your code to:
REGEX='^(.*\/|)[^\/\.]+\.'"$ext\$"
The Perl-compatible regular expression that matches the basename of the file is in single quotes, which prevents the shell from trying to expand it. The $ext is in double quotes, so it will be expanded by the shell. The trailing $ is escaped with a backslash just for form; inside double quotes a $ directly before the closing quote stays literal anyway.
The leading ^(.*\/|) matches an optional leading directory path (ending with /), and [^\/\.]+ matches one or more characters that are NOT '.' or '/'. That must then be followed by a '.' and your extension, followed by the end of the file name ($), for the line to match.
The key here is to anchor your match at both ends (^ and $) and not allow any dots '.' except the ones you really want.
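As a rough check of that anchoring idea, here is a minimal sketch that runs against sample paths copied from the question instead of a real find. The helper names build_regex and sample_paths are made up for illustration. The pattern is a slightly simplified variant of the one above ((.*/)? in place of (.*\/|), [.] in place of \.), chosen so that grep accepts it with -E as well as -P. It also turns dots inside the extension into [.], so that "jtr.toggled" only matches a literal dot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an anchored pattern for one extension.
# Every "." in the extension becomes "[.]" so it matches a literal dot.
build_regex() {
    local ext=$1
    printf '%s' "^(.*/)?[^/.]+[.]${ext//./[.]}\$"
}

# Sample paths taken from the listings in the question.
sample_paths() {
    printf '%s\n' \
        /media/myfile_1.jtr.toggled \
        /media/myfile_1.toggled \
        /media/myfile_1.jtr \
        /media/myfile_1.cupp
}

SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")
for ext in "${SEARCH[@]}"; do
    regex=$(build_regex "$ext")
    echo "regex : $regex"
    # Quoting "$regex" keeps the shell from splitting or globbing it.
    sample_paths | grep -E -- "$regex"
done
```

With these sample paths, the "toggled" pass should list only /media/myfile_1.toggled, because [^/.]+ cannot cross the dot in front of "jtr". In the real loop, sample_paths would be replaced by find "$BASEDIR" -type f.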
You might also want to put $REGEX in quotes, i.e. write "$REGEX", in the grep command near the end of your code extract, so that the shell does not word-split or glob-expand the pattern.
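For the ARRAYS part of the question, one possible approach (a sketch, not the only way) is to store the patterns in single quotes with a %s placeholder instead of $ext, and fill the placeholder in with printf -v inside the loop, which avoids eval altogether. The PATTERN contents and the index used below are assumptions for illustration. The patterns are written without backslashes because printf would interpret those in its format string.

```shell
#!/usr/bin/env bash
# Hypothetical pattern array: single quotes keep "$" literal,
# and %s marks the spot where the extension is inserted.
# A literal "%" inside a pattern would have to be written "%%".
PATTERN=(
    '([^/]+)[.](%s)$'
    '^(.*/)?[^/.]+[.]%s$'
)

SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")
for ext in "${SEARCH[@]}"; do
    # printf -v stores the formatted result in REGEX; no eval needed.
    # Dots in the extension are made literal with [.] before insertion.
    printf -v REGEX "${PATTERN[1]}" "${ext//./[.]}"
    echo "regex : $REGEX"
done
```

The resulting REGEX can then be passed to grep as "$REGEX", exactly as in the existing loop.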
Upvotes: 2