user9526185

Identifying either of two different substrings in the filename using Regex?

When I run my script, I specify a pattern to look for among the files in the source directory. The pattern can appear anywhere in the filename.

When I do:

sh packageScript.sh -p ".*TOM.*"

The script works as desired and all the files with "TOM" in the name are packaged up.

But if I want the script to package up files with "TOM" or "JER" in the name, the script fails. I tried each of the following:

sh packageScript.sh -p ".*TOM.*||.*_JER_.*"
sh packageScript.sh -p ".*TOM.*|.*_JER_.*"

The for loop that iterates over the files in my script:

for file in $(find -regex "$PATTERN" -type f);
do 
 # things get done here
done

(I assign the value of the -p flag to $PATTERN in a "while getopts" at the top of my script)

Sample file names:

M_V_CHUCK_TOM_20180105.txt
M_V_CHUCK_TOM_20170105.txt
M_V_CHUCK_TOM_20160105.txt
M_V_JONES_OUT_20180105.txt
M_V_JONES_OUT_20170105.txt
M_V_JONES_OUT_20160105.txt

EDIT: JER was corrected to _JER_, as the requirement specifies.

Upvotes: 0

Views: 205

Answers (2)

user9526185

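I escaped the | and that did the trick. GNU find's default -regex syntax is Emacs-style, where alternation is written \| and a bare | is a literal character. Final command: sh packageScript.sh -p ".*TOM.*\|.*_JER_.*"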
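
A quick way to check this behaviour, as a minimal sketch assuming GNU find. The temporary directory and the SMITH_JER file name are made up for illustration; they are not from the question.

```shell
# Throwaway directory with hypothetical sample files.
demo_dir=$(mktemp -d)
touch "$demo_dir/M_V_CHUCK_TOM_20180105.txt" \
      "$demo_dir/M_V_SMITH_JER_20180105.txt" \
      "$demo_dir/M_V_JONES_OUT_20180105.txt"

# Escaped \| : treated as alternation by find's default regex syntax.
# Running find from inside the directory keeps the random temp path out of the match.
matched_default=$(cd "$demo_dir" && find . -type f -regex '.*TOM.*\|.*_JER_.*' | sort)

# Bare | : treated as a literal pipe character, so no file name matches.
matched_literal=$(cd "$demo_dir" && find . -type f -regex '.*TOM.*|.*_JER_.*' | sort)

echo "escaped pipe matches:"
echo "$matched_default"
echo "bare pipe matches: [$matched_literal]"

rm -rf "$demo_dir"
```

Only the first find should list the TOM and _JER_ files; the second list should come back empty.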

Upvotes: 0

anubhava

Reputation: 785156

Replace your find loop with this:

while IFS= read -d '' -r file; do 
   # things get done here
   echo "$file"
done < <(find . -type f -regextype posix-egrep -regex ".*($PATTERN).*" -print0)
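  • This uses GNU find's extended regex support through the -regextype option (posix-egrep), where | means alternation without any escaping.
  • It also uses bash's process substitution, < <(...), so the script has to be run with bash rather than sh.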

Finally, call your script as:

bash packageScript.sh -p 'TOM|JER'
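
To try the loop outside the full script, here is a minimal sketch, assuming bash and GNU find. The function name collect_matches and the sample files (including the one with a space in its name) are hypothetical and only there to show why -print0 with read -d '' is used.

```shell
#!/usr/bin/env bash
# Sketch only: collect_matches and the sample file names are hypothetical.

# Print the base name of every regular file under $2 whose path
# matches the extended-regex alternation given in $1.
collect_matches() {
    local pattern=$1 dir=$2 file
    while IFS= read -d '' -r file; do
        printf '%s\n' "${file##*/}"
    done < <(find "$dir" -type f -regextype posix-egrep -regex ".*($pattern).*" -print0)
}

demo_dir=$(mktemp -d)
touch "$demo_dir/M_V_CHUCK_TOM_20180105.txt" \
      "$demo_dir/M_V_SMITH_JER_2018 0105.txt" \
      "$demo_dir/M_V_JONES_OUT_20180105.txt"

# Run from inside the directory so the random temp path cannot affect the match.
matches=$(cd "$demo_dir" && collect_matches 'TOM|_JER_' . | sort)
echo "$matches"

rm -rf "$demo_dir"
```

Because the names are passed NUL-delimited, the file with a space in its name arrives as a single value, which the original for file in $(find ...) loop would split in two.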

Upvotes: 1
