Reputation: 77
I am struggling to rename a bunch of files where the part to be removed varies.
I have:
1B_ACTCGCTA-CCTAGAGT_L001_R1_001.fastq.gz
1B_ACTCGCTA-CCTAGAGT_L001_R2_001.fastq.gz
97C_TAAGGCGA-TTATGCGA_L001_R1_001.fastq.gz
97C_TAAGGCGA-TTATGCGA_L001_R2_001.fastq.gz
98A_S62_L001_R1_001.fastq.gz
98A_S62_L001_R2_001.fastq.gz
and want to have:
1B_R1_001.fastq.gz
1B_R2_001.fastq.gz
97C_R1_001.fastq.gz
97C_R2_001.fastq.gz
98A_R1_001.fastq.gz
98A_R2_001.fastq.gz
As you can see, the pattern that needs to be dropped is variable, so simple matching won't work. A logical workaround would be to drop everything between the first and third underscore, or between the first underscore and the letter "R". Unfortunately I have not been able to come up with code that does that. It can be anything as long as it works: rename, a bash for loop, etc.
Appreciate your help, Deni
EDIT: I was trying to use a for loop but could not come up with complete code that retains the second part of the file name (everything from the letter "R" onward):
for file in *.fastq.gz; do echo mv "${file}" "${file/_*/\/}"; done
Upvotes: 1
Views: 386
Reputation: 971
The following should work:
for f in *.fastq.gz; do echo mv "$f" "${f%%_*}_${f#*_*_*_}"; done
I specifically added echo before mv, so it prints what it would move. If it prints correctly, remove echo and run again.
Here I take the head with %% and the tail with # and concatenate them. See Parameter Expansion in man bash for the meaning of %% and #. The solution relies on the number of _ in the file names being constant: the tail is whatever follows the third _.
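To make the two expansions concrete, here is a small sketch that applies them to one file name from the question. The variable names prefix, suffix and new are only illustrative.

```shell
f=1B_ACTCGCTA-CCTAGAGT_L001_R1_001.fastq.gz

# %% removes the longest suffix matching the pattern '_*',
# i.e. everything from the first underscore onward.
prefix=${f%%_*}        # 1B

# A single # removes the shortest prefix matching '*_*_*_',
# i.e. everything up to and including the third underscore.
suffix=${f#*_*_*_}     # R1_001.fastq.gz

new="${prefix}_${suffix}"
echo "$new"            # 1B_R1_001.fastq.gz
```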
Upvotes: 2
Reputation: 207465
With (Perl) rename:
rename --dry-run 's/_.*_R/_R/' *gz
Sample Output
'1B_ACTCGCTA-CCTAGAGT_L001_R1_001.fastq.gz' would be renamed to '1B_R1_001.fastq.gz'
'1B_ACTCGCTA-CCTAGAGT_L001_R2_001.fastq.gz' would be renamed to '1B_R2_001.fastq.gz'
'97C_TAAGGCGA-TTATGCGA_L001_R1_001.fastq.gz' would be renamed to '97C_R1_001.fastq.gz'
'97C_TAAGGCGA-TTATGCGA_L001_R2_001.fastq.gz' would be renamed to '97C_R2_001.fastq.gz'
'98A_S62_L001_R1_001.fastq.gz' would be renamed to '98A_R1_001.fastq.gz'
'98A_S62_L001_R2_001.fastq.gz' would be renamed to '98A_R2_001.fastq.gz'
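If the Perl rename is not installed, roughly the same substitution can be sketched in plain bash by anchoring on "_R" instead of counting underscores. The function name strip_middle is made up for this example, and the sketch assumes each name contains an underscore followed later by "_R"; names that do not are returned unchanged.

```shell
# Hypothetical helper: keep the text before the first "_" and the text
# after the last "_R", joined by "_R" (the same span s/_.*_R/_R/ replaces).
strip_middle() {
    local f=$1
    case $f in
        *_*_R*) printf '%s\n' "${f%%_*}_R${f##*_R}" ;;
        *)      printf '%s\n' "$f" ;;   # no match: leave the name alone
    esac
}

# Dry run: prints the mv commands instead of executing them.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue             # skip if the glob matched nothing
    echo mv -- "$f" "$(strip_middle "$f")"
done
```

As with the other answers, remove echo once the printed commands look right.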
Upvotes: 2
Reputation: 904
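An answer that doesn't rely on the number of underscores (it keeps the first field and the last two):
for file in *.fastq.gz; do
  # Quote the names and use a glob rather than parsing ls output.
  mv -- "$file" "$(printf '%s\n' "$file" | awk -F _ 'BEGIN {OFS="_"} {print $1, $(NF-1), $NF}')"
done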
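Before renaming anything, the awk field logic can be checked on a single name. This is only a sketch; the variables name and new are illustrative.

```shell
name=97C_TAAGGCGA-TTATGCGA_L001_R2_001.fastq.gz

# Split on "_", then print the first field and the last two,
# joined again with "_" via OFS.
new=$(printf '%s\n' "$name" | awk -F _ 'BEGIN {OFS="_"} {print $1, $(NF-1), $NF}')

echo "$new"   # 97C_R2_001.fastq.gz
```

Note that this keeps exactly two trailing fields, so it assumes the part to keep always looks like R1_001.fastq.gz.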
Upvotes: 1