Sim

Reputation: 3

Extract prefix from multiple files and rename by adding as suffix

I currently have multiple files in a folder, named like this:

yyyymmdd_TCT_XPL_PLA_Attribution.tab

Example:

20160301_TCT_XPL_PLA_Attribution.tab

I want to rename all of them to follow this naming convention:

XPL_PNL_Attribution_2016-03-01

The file(s) have to be renamed to meet the general naming convention (XPL_PNL_Attribution_yyyy-mm-dd) :

1) The first part of the filename (up to the first underscore) should be the "class", so the prefix XPL_PNL_Attribution_ needs to be added to the filename

2) The date part of the filename should be yyyy-mm-dd rather than the current yyyymmdd, so the dashes need to be added

Upvotes: 0

Views: 158

Answers (4)

SLePort

Reputation: 15461

You can try this, using Bash parameter expansion (substring extraction):

for f in *.tab; do
    mv "$f" "XPL_PNL_Attribution_${f:0:4}-${f:4:2}-${f:6:2}"
done
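
The loop above assumes every `.tab` file starts with an eight-digit date. A slightly more defensive sketch is shown here; the helper name `dated_name` is made up for illustration and is not part of the original answer. It skips names that lack the date prefix and uses `mv -n` so an existing target is not overwritten:

```shell
#!/usr/bin/env bash
# dated_name is a hypothetical helper: prints the new name for a file
# that starts with yyyymmdd_, and returns 1 for anything else.
dated_name() {
  local f=$1
  [[ $f =~ ^[0-9]{8}_ ]] || return 1
  printf 'XPL_PNL_Attribution_%s-%s-%s\n' "${f:0:4}" "${f:4:2}" "${f:6:2}"
}

for f in *.tab; do
  new=$(dated_name "$f") || continue   # skip names without a date prefix
  mv -n -- "$f" "$new"                 # -n: do not overwrite an existing file
done
```

Like the original, this hardcodes the `XPL_PNL_Attribution_` prefix, so it only suits folders where every dated file belongs to that class.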

Upvotes: 1

Gilbert

Reputation: 3776

Ugly, undebugged, untested, but flexible on incoming file names, assuming the TCT field is always to be discarded.

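# example input:            20160301_TCT_XPL_PLA_Attribution.tab
# target from the question: XPL_PNL_Attribution_2016-03-01
# groups: \1 = yyyy, \2 = mm, \3 = dd, \4 = everything after the discarded TCT field
# (the fields are reordered as-is, so PLA is not changed to PNL here)
file_out="$( echo "$file_in" |
    sed -E -e 's/^([12][0-9]{3})([0-9][0-9])([0-9][0-9])_[A-Za-z]+_(.*)\.tab$/\4_\1-\2-\3/'
   )"
mv "$file_in" "$file_out"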
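
The snippet above never sets `$file_in`. As a rough sketch of how it could be driven over a folder, the following wraps a `sed -E` substitution (extended regex syntax, `\1`-style back-references) in a loop. The helper name `new_name_sed` is hypothetical, and the loop relies on `sed` printing its input unchanged when the pattern does not match:

```shell
#!/usr/bin/env bash
# new_name_sed is a hypothetical helper: reorders yyyymmdd_XXX_rest.tab
# into rest_yyyy-mm-dd, dropping the second field and the extension.
new_name_sed() {
  printf '%s\n' "$1" |
    sed -E 's/^([12][0-9]{3})([0-9]{2})([0-9]{2})_[A-Za-z]+_(.*)\.tab$/\4_\1-\2-\3/'
}

for file_in in *.tab; do
  file_out=$(new_name_sed "$file_in")
  # No match means sed echoed the name back unchanged; skip such files.
  if [ "$file_out" = "$file_in" ]; then
    continue
  fi
  mv -- "$file_in" "$file_out"
done
```

Note that this keeps the remaining fields exactly as they appear in the source name (`XPL_PLA_Attribution`), so a `PLA` to `PNL` change would need a separate step.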

Upvotes: 0

ghoti

Reputation: 46846

You could do this by splitting the filename into fields, separated by underscores.

$ f=20160301_TCT_XPL_PLA_Attribution.tab
$ IFS=_
$ a=( ${f%.*} )
$ declare -p a
declare -a a='([0]="20160301" [1]="TCT" [2]="XPL" [3]="PLA" [4]="Attribution")'
$ new="$(printf '%s_%s_%s_%s' "${a[2]}" "${a[3]}" "${a[4]}" "${a[0]:0:4}-${a[0]:4:2}-${a[0]:6:2}")"

Scripting this to handle a directory of files requires a for loop:

IFS=_
for f in *.tab; do
  a=( ${f%.*} )
  new="$(printf '%s_%s_%s_%s' "${a[2]}" "${a[3]}" "${a[4]}" "${a[0]:0:4}-${a[0]:4:2}-${a[0]:6:2}")"
  mv -v "$f" "$new"
done
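
One caveat with the loop above is that `IFS=_` stays set for the rest of the shell session. A possible variation, sketched here under the assumption that Bash's `read -a` is available, scopes the `IFS` change to a single `read` command. The function name `split_name` is invented for this example:

```shell
#!/usr/bin/env bash
# split_name is a hypothetical helper: splits the name on underscores with
# IFS set only for the read command, then prints the reordered name.
split_name() {
  local f=$1 a
  IFS=_ read -r -a a <<< "${f%.*}"   # ${f%.*} strips the extension first
  printf '%s_%s_%s_%s\n' "${a[2]}" "${a[3]}" "${a[4]}" \
    "${a[0]:0:4}-${a[0]:4:2}-${a[0]:6:2}"
}

for f in *.tab; do
  [ -e "$f" ] || continue            # the glob matched nothing
  mv -v -- "$f" "$(split_name "$f")"
done
```

Because the assignment prefixes `read` rather than standing on its own line, word splitting elsewhere in the script keeps its default behaviour.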

Alternately, you could gather the filename parts using a regex:

$ [[ $f =~ ^([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^.]+) ]]
$ declare -p BASH_REMATCH
declare -ar BASH_REMATCH='([0]="20160301_TCT_XPL_PLA_Attribution" [1]="20160301" [2]="TCT" [3]="XPL" [4]="PLA" [5]="Attribution")'

Scripting this would work the same way, only you'd refer to the $BASH_REMATCH[] array instead of $a[], and you wouldn't need to mess with $IFS.

for f in *.tab; do
  [[ $f =~ ^([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^.]+) ]] || continue
  new="${BASH_REMATCH[3]}_${BASH_REMATCH[4]}_${BASH_REMATCH[5]}_${BASH_REMATCH[1]:0:4}-${BASH_REMATCH[1]:4:2}-${BASH_REMATCH[1]:6:2}"
  mv -v "$f" "$new"
done

Upvotes: 1

Ali ISSA

Reputation: 408

echo "20160301_TCT_XPL_PLA_Attribution.tab" | sed -e "s/^\(....\)\(..\)\(..\)_....\([^.]*\)\.\(.*\)$/\4_\1-\2-\3.\5/g" | sed -e "s/_PLA_/_PNL_/g"

or

echo "20160301_TCT_XPL_PLA_Attribution.tab" | sed -e "s/^\([0-9][0-9][0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\).*_\([^.]*\)\.\(.*\)$/XPL_PNL_\4_\1-\2-\3.\5/g"
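
Both commands only print the new name for one hardcoded string. To apply the second one to real files, something along these lines might work. It is a dry-run sketch, and `pnl_name` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# pnl_name is a hypothetical helper wrapping the second sed command above.
pnl_name() {
  printf '%s\n' "$1" |
    sed -e 's/^\([0-9][0-9][0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\).*_\([^.]*\)\.\(.*\)$/XPL_PNL_\4_\1-\2-\3.\5/'
}

for f in *.tab; do
  new=$(pnl_name "$f")
  if [ "$new" != "$f" ]; then
    echo mv -- "$f" "$new"   # dry run: prints the command; drop "echo" to rename
  fi
done
```

Unlike the target name in the question, this keeps the `.tab` extension on the renamed file.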

Upvotes: 0
