jiii

Reputation: 61

Using regex to iterate over files that match a certain pattern in bash scripts

I have a regex pattern ([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*; a matching string would be 11.1.1.1_to_21.1.1.1. I want to find all files under a directory whose names match this pattern.

However, I am not able to get it working with the code below. I tried escaping ( and ) by adding \ before them, but that did not work.

dir=$SCRIPT_PATH/oaa_partition/upgrade/([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*.sql 
for FILE in $dir; do
   echo $FILE
done

I was only able to get something like this to work:

dir=$SCRIPT_PATH/oaa_partition/upgrade/[0-9]*_to_*.sql
for FILE in $dir; do
   echo $FILE
done

I need some help on how to use the full regex pattern ([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]* here.

Upvotes: 2

Views: 2147

Answers (3)

anubhava

Reputation: 785246

You cannot use a regular expression in a for loop. It only supports glob patterns, which are not as expressive as a regex.

You will have to use your regex in a GNU find command:

find . -mindepth 1 -maxdepth 1 -regextype egrep -regex '.*/([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*\.sql'

To loop these entries:

while IFS= read -rd '' file; do
   echo "$file"
done < <(find . -mindepth 1 -maxdepth 1 -regextype egrep -regex '.*/([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*\.sql' -print0)
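
A minimal sketch for trying this out against throwaway files, assuming bash and GNU find (needed for -regextype) and GNU sort (for -z). The function name list_upgrade_scripts and the sample file names are hypothetical. The -print0 output pairs with read -d '' so that file names containing whitespace survive intact.

```shell
#!/bin/bash
# Hypothetical helper: print the base names of matching files in directory $1.
list_upgrade_scripts() {
    local dir=$1 file
    while IFS= read -r -d '' file; do
        printf '%s\n' "${file##*/}"
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -regextype egrep \
                  -regex '.*/([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*\.sql' -print0 |
             LC_ALL=C sort -z)
}

# Throwaway directory with made-up file names.
tmp=$(mktemp -d)
touch "$tmp/11.1.1.1_to_21.1.1.1.sql" "$tmp/1.0_to_2.0.sql" \
      "$tmp/readme_to_x.sql" "$tmp/notes.txt"
found=$(list_upgrade_scripts "$tmp")
printf '%s\n' "$found"
rm -rf "$tmp"
```

Only the two version-to-version names are listed; readme_to_x.sql and notes.txt are filtered out by the regex.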

Upvotes: 1

Fravadona

Reputation: 17055

Your regex is simple enough to be replaced with a bash extglob:

#!/bin/bash
shopt -s extglob

glob='+(*([0-9]).)*([0-9])_to_+(*([0-9]).)*([0-9]).sql'

for file in "$SCRIPT_PATH"/oaa_partition/upgrade/$glob
do 
    printf '%q\n' "$file"
done
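
One caveat with a glob-driven loop: if nothing matches, bash leaves the pattern unexpanded and the loop runs once with the literal glob text. A sketch that guards against that with nullglob, using made-up file names in a temporary directory (the array name matches is only for illustration):

```shell
#!/bin/bash
shopt -s extglob nullglob   # nullglob: an unmatched pattern expands to nothing

glob='+(*([0-9]).)*([0-9])_to_+(*([0-9]).)*([0-9]).sql'

# Throwaway directory with made-up file names.
tmp=$(mktemp -d)
touch "$tmp/11.1.1.1_to_21.1.1.1.sql" "$tmp/abc_to_def.sql" "$tmp/1.2_to_x.sql"

# $glob is left unquoted on purpose so that it undergoes pathname expansion.
matches=( "$tmp"/$glob )
for file in "${matches[@]}"; do
    printf '%s\n' "${file##*/}"
done
count=${#matches[@]}
first=${matches[0]##*/}
rm -rf "$tmp"
```

Collecting the expansion into an array first also makes it easy to test ${#matches[@]} and report "no files found" before looping.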

If the regex is too complex to translate into extended globs, you can filter the files with a bash regex inside the for loop:

#!/bin/bash

regex='([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*'

for file in "$SCRIPT_PATH"/oaa_partition/upgrade/*_to_*.sql
do
    [[ $file =~ /$regex\.sql$ ]] || continue
    printf '%q\n' "$file"
done

BTW, as it is, your regex could match a lot of unwanted things, for example: 0._to_..sql.
If a looser match is enough to tell the targeted files apart from the others, you can probably just use the basic glob [0-9]*_to_[0-9]*.sql
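
The over-matching is easy to confirm with bash's =~ operator. A small sketch, where the helper name matches_original and the sample names are made up for illustration, and the pattern is anchored with ^ and $ so that it has to cover the whole file name:

```shell
#!/bin/bash
# The question's regex, plus a literal ".sql" suffix.
regex='^([0-9]*\.)+[0-9]*_to_([0-9]*\.)+[0-9]*\.sql$'

# Hypothetical helper: succeeds when the name matches the regex.
matches_original() {
    [[ $1 =~ $regex ]]
}

for name in 11.1.1.1_to_21.1.1.1.sql 0._to_..sql ._to_..sql 1_to_2.sql; do
    if matches_original "$name"; then
        printf 'match:    %s\n' "$name"
    else
        printf 'no match: %s\n' "$name"
    fi
done
```

The first three names match, including the two degenerate ones, because every [0-9]* may be empty. Only 1_to_2.sql is rejected, since it has no dot before _to_.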

Upvotes: 2

Randommm

Reputation: 621

To fix the regex, you would want to require at least one digit before each dot and, if you keep the extension in the pattern, a literal dot before sql:

([0-9]+\.)+[0-9]*_to_([0-9]+\.)+[0-9]*\.sql

https://regex101.com/r/5xB3Bt/1
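
A quick way to check the tightened pattern from bash, again with made-up names. Note that it still accepts a trailing dot such as 1._to_2..sql, because the last [0-9]* of each version may be empty. The variant named strict in the sketch (using [0-9]+ there) is an assumption about what is wanted, not part of the answer above, and check is a hypothetical helper.

```shell
#!/bin/bash
# Pattern from the answer above, anchored to the whole file name.
fixed='^([0-9]+\.)+[0-9]*_to_([0-9]+\.)+[0-9]*\.sql$'
# Assumed stricter variant: every version must end in a digit, not a dot.
strict='^([0-9]+\.)+[0-9]+_to_([0-9]+\.)+[0-9]+\.sql$'

# Hypothetical helper, usage: check REGEX NAME
check() {
    if [[ $2 =~ $1 ]]; then echo 'match'; else echo 'no match'; fi
}

check "$fixed"  11.1.1.1_to_21.1.1.1.sql   # match
check "$fixed"  0._to_..sql                # no match
check "$fixed"  1._to_2..sql               # match
check "$strict" 1._to_2..sql               # no match
```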

Upvotes: 0
