hdjc90

Reputation: 87

Remove part of file names that are different for each file

This is my first question, so apologies if it is dumb.

I am trying to rename a bunch (100+) of files:

fcon13_03_01_1.fastq.gz  
fcon17_01_02_2.fastq.gz  
fcon19_02_02_1.fastq.gz 

I need them to look like:

fcon13_1.fastq.gz  
fcon17_2.fastq.gz  
fcon19_1.fastq.gz 

I have tried rename and:

`for file in *.gz ; do mv $file ${file//_01_0/} ; done`

but this isn't specific enough. I have looked through numerous threads but have had no luck finding a way to remove a fixed-length substring whose characters differ from one file name to the next.

Please help! Thanks!

Upvotes: 1

Views: 500

Answers (5)

Geralt

Reputation: 36

That should be useful:

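#!/bin/bash

# Save the original names, one per line (assumes no whitespace in the names).
ls fcon*.fastq.gz > list1

# Keep characters 1-6 and 13 onward, dropping the 6-character middle part.
while read -r name; do
  echo "$name" | cut -c 1-6,13-
done <list1 >list2

# Pair each old name with its new name and rename.
paste list1 list2 | (
while read -r f1 f2; do
  mv "$f1" "$f2"
done )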
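
The same fixed-position idea can probably be done without temporary list files, using bash substring expansion. This is only a sketch: `strip_fixed` is a made-up helper name, and it assumes every file name has exactly 6 characters before the part to drop and that the part to drop is exactly 6 characters long, as in the three sample names from the question.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): keep characters 1-6 and
# everything from character 13 on, mirroring `cut -c 1-6,13-`.
strip_fixed() {
  local name=$1
  # ${name:0:6} is the first 6 characters; ${name:12} is the rest from offset 12.
  printf '%s\n' "${name:0:6}${name:12}"
}

# Dry run on the sample names from the question: only prints the mv commands.
for f in fcon13_03_01_1.fastq.gz fcon17_01_02_2.fastq.gz fcon19_02_02_1.fastq.gz; do
  echo mv -- "$f" "$(strip_fixed "$f")"
done
```

To rename for real, loop over `fcon*.fastq.gz` instead of the literal names and drop the `echo`. Names whose prefix is not exactly 6 characters would be cut in the wrong place, so a pattern-based approach is safer if the names vary.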

Upvotes: 0

Matias Barrios

Reputation: 5056

You can do it using find:

find . -type f -name '*.gz' -print0 | xargs -0 -I '@' bash -c 'mv "$1" "$(echo "$1" | sed -E "s/_[0-9]+_[0-9]+//g")"' -- @

This will rename your files to, for example:

./fcon13_1.fastq.gz
./fcon19_1.fastq.gz

Let me know if it helps.

You can do it in Python 3 as well:

#!/usr/local/bin/python3
from pathlib import Path
import re

# Snapshot the matches first so renaming does not disturb the directory walk.
for path in list(Path('.').rglob('*.gz')):
    # Rewrite only the file name, not the directories leading to it.
    newname = re.sub(r'_[0-9]+_[0-9]+', '', path.name)
    path.rename(path.with_name(newname))

Upvotes: 0

drmrgd

Reputation: 733

What about the Perl rename command:

    $ rename -n 's/(.*?)_\d+_\d+(_\d\.fastq\.gz)/$1$2/' *gz
    'fcon13_03_01_1.fastq.gz' would be renamed to 'fcon13_1.fastq.gz'
    'fcon17_01_02_2.fastq.gz' would be renamed to 'fcon17_2.fastq.gz'
    'fcon19_02_02_1.fastq.gz' would be renamed to 'fcon19_1.fastq.gz'

Use the -n option for testing; remove it to commit the renames. I'm not sure how specific the regex needs to be, since I don't know how many variations you'll have, so you may need to tweak it to be more or less specific.
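
If the Perl rename is not installed, a stricter match can be sketched in plain bash with `[[ =~ ]]`. This is an illustration rather than a drop-in replacement: `strict_name` is a hypothetical helper, and the anchored pattern assumes names of the form prefix, two numeric fields, a single-digit read number, then `.fastq.gz`.

```shell
#!/bin/bash
# Hypothetical helper: print the new name only when the old one has the exact
# shape <prefix>_<digits>_<digits>_<digit>.fastq.gz; otherwise return non-zero.
strict_name() {
  local re='^([^_]+)_[0-9]+_[0-9]+(_[0-9]\.fastq\.gz)$'
  if [[ $1 =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Dry run: names that do not fit the pattern are reported and left alone.
for f in fcon13_03_01_1.fastq.gz fcon13_1.fastq.gz notes_01_02.txt.gz; do
  if new=$(strict_name "$f"); then
    echo mv -- "$f" "$new"
  else
    echo "skip $f"
  fi
done
```

Anchoring with `^` and `$` means already-renamed files and unrelated .gz files are skipped instead of being partially rewritten.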

Upvotes: 2

RavinderSingh13

Reputation: 133600

Could you please try the following (written and tested with GNU sed).

for file in *.gz
do
   new_file_name=$(echo "$file" | sed -E 's/([^_]*)_[0-9]+_[0-9]+(.*)/\1\2/')
   echo "mv $file $new_file_name"
done

The above only prints the rename commands to your screen; once you are happy with the results, you can run the following.

for file in *.gz
do
   new_file_name=$(echo "$file" | sed -E 's/([^_]*)_[0-9]+_[0-9]+(.*)/\1\2/')
   mv "$file" "$new_file_name"
done
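
One thing the loop does not guard against is two files mapping to the same new name, in which case the second mv would overwrite the first. The question does not say whether that can happen, so the following is only a precautionary sketch. `safe_rename` is a hypothetical helper, and the two file names in the demo are invented to force a collision inside a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: rename only if the name changes and the target is free.
safe_rename() {
  local old=$1 new
  new=$(printf '%s\n' "$old" | sed -E 's/([^_]*)_[0-9]+_[0-9]+(.*)/\1\2/')
  [ "$old" = "$new" ] && return 0   # pattern did not match; nothing to do
  if [ -e "$new" ]; then
    echo "skipping $old: $new already exists" >&2
    return 1
  fi
  mv -- "$old" "$new"
}

# Demo in a scratch directory so no real data is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch fcon13_03_01_1.fastq.gz fcon13_04_02_1.fastq.gz
for file in *.gz; do
  safe_rename "$file" || true
done
ls
```

In the demo the first file becomes fcon13_1.fastq.gz and the second is left under its old name with a warning, since renaming it would clobber the first.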

Upvotes: 1

oguz ismail

Reputation: 50775

for f in ./*.gz; do
  echo mv "$f" "${f%%_*}_${f##*_}"
done

Remove echo if the output looks ok.


  • ${f%%_*} removes the longest suffix starting with an underscore,
  • and ${f##*_} removes the longest prefix ending with an underscore.
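
To see what each expansion yields on its own, here is a small sketch using one of the sample names from the question. The variable names `head` and `tail` are just for illustration.

```shell
#!/bin/bash
f=./fcon13_03_01_1.fastq.gz

head=${f%%_*}   # strip the longest suffix starting with "_": ./fcon13
tail=${f##*_}   # strip the longest prefix ending with "_":   1.fastq.gz

echo "head=$head"
echo "tail=$tail"
echo "new=${head}_${tail}"
```

This relies on the only underscores in the path being the ones inside the file name; a directory name containing an underscore would throw off `${f%%_*}`.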

Upvotes: 1
