C. Reid

Reputation: 33

Remove middle of filenames

I have a list of filenames like this in bash:

UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R1.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz

And I want them to look like this:

UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz

I do not have the Perl rename command, and sed 's/_Other*160418./_/' *.gz is not doing anything. I've tried other rename scripts from this site, but either nothing happens or my shell starts printing huge amounts of code to the console and freezes.

This post (Removing Middle of Filename) is similar; however, the answers given do not explain what the specific parts of the command are doing, so I could not apply them to my problem.

Upvotes: 3

Views: 2617

Answers (5)

Nar_sys

Reputation: 1

for i in *.gz; do mv "$i" "${i%O*}${i#*.}"; done

# Explanation
${i%O*} removes the shortest suffix matching O*, leaving the part of the filename before the last capital O
${i#*.} removes the shortest prefix matching *., leaving the part of the filename after the first dot
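
To see what each expansion contributes, the two pieces can be printed separately for one sample name. This is only a sketch on a string taken from the question (no files are touched); the variable names prefix and suffix are introduced here for illustration and are not part of the answer.

```shell
#!/usr/bin/env bash
# Sample name taken from the question; nothing is renamed here.
i='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

prefix=${i%O*}   # drop the shortest suffix matching 'O*' (from the last capital O on)
suffix=${i#*.}   # drop the shortest prefix matching '*.' (up to the first dot)

# Prints UTSHoS10_, then R1.fq.gz, then the combined name UTSHoS10_R1.fq.gz
printf '%s\n' "$prefix" "$suffix" "$prefix$suffix"
```

Note that this relies on the O in "Other" being the only capital O in the name; the lowercase o in "HoS" does not match.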


$ ls -v -1
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R1.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz

$ ls -v -1
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz

Upvotes: -2

codeforester

Reputation: 42999

Pure Bash, using substring operation and assuming that all file names have the same length:

for file in UTS*.gz; do
  echo mv -i "$file" "${file:0:9}${file:38:8}"
done

Outputs:

mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R1.fq.gz UTSHoS11_R1.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz

Once verified, remove echo from the line inside the loop and run again.
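
Because the fixed offsets only work when every name has the same length, a length check can guard the cut. The following is a hedged sketch, not part of the answer: short_name is a hypothetical helper, and the expected length of 46 characters is derived from the sample names in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): derive the new name only when the
# input has the expected length of 46 characters.
short_name() {
  local file=$1
  if (( ${#file} != 46 )); then
    return 1   # unexpected length: refuse rather than cut at the wrong offsets
  fi
  # chars 0-8 are the sample ID plus '_'; chars 38-45 are 'R1.fq.gz' or 'R2.fq.gz'
  printf '%s\n' "${file:0:9}${file:38:8}"
}

short_name 'UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'
short_name 'short.gz' || echo 'skipped: short.gz'
```

Inside the loop, this would let names of a different length be skipped instead of being renamed to something truncated.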

Upvotes: 1

mklement0

Reputation: 437090

Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:

for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done

Remove the echo before mv to perform actual renaming.
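
One detail worth seeing in isolation: in ${var/pattern/replacement}, bash replaces the longest match, so the * needs a fixed tail to stop at. The sketch below compares two patterns on a sample name; the shortened pattern _Other_*_160418. and the variable names anchored and greedy are assumptions made for illustration, not the answer's exact pattern.

```shell
#!/usr/bin/env bash
f='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

anchored=${f/_Other_*_160418./_}   # '*' stops at the dot right after the date
greedy=${f/_Other_*./_}            # '*' runs on to the last dot in the name

# Prints UTSHoS10_R1.fq.gz, then UTSHoS10_gz
printf '%s\n' "$anchored" "$greedy"
```

Without the date as an anchor, the read number and most of the extension are swallowed, which is why the pattern spells out the text up to 160418.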

Upvotes: 5

Arjun Mathew Dan

Reputation: 5298

You can do something like this in the directory which contains the files to be renamed:

for file_name in *.gz
do
  new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
  mv "$file_name" "$new_file_name"
done

The pattern (_[^.]*\.) matches from the first _ through the first . that follows it (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.

Example:

AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz  UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz  UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz

AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done

AMD$ ls
UTSHoS10_R1.fq.gz  UTSHoS10_R2.fq.gz  UTSHoS11_R2.fq.gz  UTSHoS12_R1.fq.gz  UTSHoS12_R2.fq.gz
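
Since the loop above calls mv directly, it may help to preview the result first. The following dry-run sketch applies the same sed pattern to two literal names and only echoes the mv command; new_name is a hypothetical helper, notes.gz is a made-up name that the pattern does not match, and the g flag is dropped here on the assumption that only the first match is wanted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the new name with the same sed pattern.
new_name() {
  sed 's/_[^.]*\./_/' <<< "$1"
}

for file_name in 'UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz' 'notes.gz'; do
  target=$(new_name "$file_name")
  if [[ $target == "$file_name" ]]; then
    echo "skip (no match): $file_name"   # the pattern did not apply
  else
    echo mv -- "$file_name" "$target"    # dry run: echo only
  fi
done
```

Skipping unchanged names avoids asking mv to rename a file onto itself, which it reports as an error.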

Upvotes: 2

user707650

Reputation:

Going with your sed command, this can work as a bash one-liner:

for name in UTSH*fq.gz; do newname=$(echo "$name" | sed 's/_Other.*160418\./_/'); echo mv "$name" "$newname"; done

Notes:

  • I've adjusted your sed command: it had a * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the literal dot needs escaping.
  • To check that it works without actually renaming the files, I've left the echo command in. Remove just that echo to perform the renaming.
  • It doesn't have to be a one-liner, obviously, but that sometimes makes editing and browsing your command-line history easier.
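
The regex-versus-glob difference from the first note can be checked on a single name without touching any files. This sketch feeds the name to both expressions; the variable names unchanged and fixed are introduced here for illustration. Separately, sed given *.gz as arguments reads the contents of those files rather than their names, which would account for the console output described in the question.

```shell
#!/usr/bin/env bash
name='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# Original expression: 'r*' means "zero or more r", so '160418' would have to
# follow '_Other' directly. Nothing matches and the name comes back unchanged.
unchanged=$(printf '%s\n' "$name" | sed 's/_Other*160418./_/')

# Corrected expression: '.*' matches any run of characters, '\.' a literal dot.
fixed=$(printf '%s\n' "$name" | sed 's/_Other.*160418\./_/')

# Prints the original name, then UTSHoS10_R1.fq.gz
printf '%s\n' "$unchanged" "$fixed"
```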

Upvotes: 0
