AndriusWild

Reputation: 81

Extract certain part of filename in Bash

I have many files in a folder:

yyyymmdd_hhmmss.mp4
yyyymmdd_hhmmss_suffix1.mp4
yyyymmdd_hhmmss_suffix1_suffix2.mp4

The following filename formats are also possible (rarely):

yyyymmdd_hhmmss_$$$.mp4
yyyymmdd_hhmmss_$$$_suffix1.mp4
yyyymmdd_hhmmss_$$$_suffix1_suffix2.mp4
yyyymmdd_hhmmss_$$.mp4
yyyymmdd_hhmmss_$$_suffix1.mp4
yyyymmdd_hhmmss_$$_suffix1_suffix2.mp4
yyyymmdd_hhmmss_$.mp4
yyyymmdd_hhmmss_$_suffix1.mp4
yyyymmdd_hhmmss_$_suffix1_suffix2.mp4

where $ is a number 0-9

I am trying to capture "yyyymmdd_hhmmss" and use it as an argument. This is what I do when only one suffix is present:

for file in "$@"; do 
  file_nosuffix="${file%*_suffix1.mp4}.mp4"
  echo "$file and $file_nosuffix"
done

But I get lost when all of the filename formats listed above can occur. Ideally I would like to stick to the current pattern:

for file in "$@"; do 
   #catch "yyyymmdd_hhmmss"
   #do something on files yyyymmdd_hhmmss.mp4
   #do something else on files yyyymmdd_hhmmss_suffix1.mp4
   #etc.
done

Is that possible?

Upvotes: 1

Views: 145

Answers (1)

Charles Duffy

Reputation: 295736

Bash has built-in regex support, if you want to confirm the format:

regex='^[[:digit:]]{8}_[[:digit:]]{6}' # POSIX ERE; can't use PCRE extensions here

for file; do
  if [[ $file =~ $regex ]]; then
    echo "${BASH_REMATCH[0]} is the substring for $file" >&2
  else
    echo "$file does not match the required format" >&2
  fi
done
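If the goal is to branch on what follows the timestamp, the same idea could be extended with capture groups. The following is only a minimal sketch under stated assumptions: the function name `classify`, the labels it prints, and the rule that an optional counter is one to three digits standing alone between underscores are all hypothetical, inferred from the filename shapes listed in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the timestamp, the optional numeric counter,
# and a label describing how many suffixes follow.
classify() {
  local name=${1##*/}                               # drop any leading directory
  local re_name='^([0-9]{8}_[0-9]{6})(_.+)?\.mp4$'  # timestamp, then optional rest
  local re_counter='^([0-9]{1,3})(_(.*))?$'         # assumed: counter is 1-3 digits
  local stamp rest counter='' kind

  [[ $name =~ $re_name ]] || { echo "no match: $1" >&2; return 1; }
  stamp=${BASH_REMATCH[1]}
  rest=${BASH_REMATCH[2]#_}                         # everything after the timestamp

  # Peel off the counter, if the first remaining field is purely numeric.
  if [[ $rest =~ $re_counter ]]; then
    counter=${BASH_REMATCH[1]}
    rest=${BASH_REMATCH[3]}
  fi

  case $rest in
    '')  kind=plain ;;
    *_*) kind=two-suffixes ;;                       # two (or more) suffixes
    *)   kind=one-suffix ;;
  esac
  printf '%s counter=%s %s\n' "$stamp" "${counter:-none}" "$kind"
}

for file; do
  classify "$file"
done
```

In a real script, the `case` branches would run the per-suffix commands instead of setting a label.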

One can also trivially take a fixed-length prefix (the first 15 characters, which is the length of "yyyymmdd_hhmmss"):

for file; do
  prefix=${file:0:15}
  echo "Prefix for $file is $prefix"
done

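...or, to delete the last two underscores and everything after them (this yields the bare timestamp only when exactly two underscore-separated parts follow it; a name with fewer than two underscores is left unchanged):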

prefix=${file%_*_*}
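Because the result of `${file%_*_*}` depends on how many underscores follow the timestamp, a variant that does not depend on the filename shape might split on underscores and keep only the first two fields. This is a sketch, not a definitive implementation; the function name `stamp_of` is hypothetical, and it assumes every name starts with the date and time fields and ends in `.mp4`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the "yyyymmdd_hhmmss" part of a filename,
# however many underscore-separated fields follow it.
stamp_of() {
  local name=${1##*/}      # drop any leading directory
  name=${name%.mp4}        # drop the extension
  local date time rest
  # Split on underscores: first field, second field, and the remainder.
  IFS=_ read -r date time rest <<<"$name"
  printf '%s_%s\n' "$date" "$time"
}

for file; do
  echo "Prefix for $file is $(stamp_of "$file")"
done
```

Unlike the regex approach, this does not validate that the two fields are actually digits.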


Upvotes: 4
