ParanoidOrganoid

Reputation: 45

bash script to create directories named after file names then move and rename files to corresponding directories

I spent all day looking through tutorials and ready-made bash scripts, but couldn't find what I need. I have also never written a bash script before... :/

Here's my problem (to quote from Full Metal Jacket: "there are many like it, but this one is mine"): I have a directory with 120 files that come in triplets. What is unique to every triplet is the last three digits of the GSM number (e.g. '613', '615'):

GSM6025613_SJWLM030043_D1_barcodes.tsv
GSM6025613_SJWLM030043_D1_features.tsv
GSM6025613_SJWLM030043_D1_matrix.mtx

GSM6025615_SJWLM071563_D1_barcodes.tsv
GSM6025615_SJWLM071563_D1_features.tsv
GSM6025615_SJWLM071563_D1_matrix.mtx

...

I would like a bash script that creates new directories named after each triplet's unique string (e.g. '613', '615'), moves the three corresponding files (*barcodes.tsv, *features.tsv, *matrix.mtx) into the matching directory, and renames them so that the files are just called "barcodes.tsv", "features.tsv" and "matrix.mtx".

I tried it with a for loop, but I'm pretty blank on bash scripting, and beyond the second line it's honestly all gibberish to me (for now)... :/

`#!/bin/bash
for f in *.{tsv,mtx}
do
...
done`

I appreciate any help!

Upvotes: 0

Views: 336

Answers (3)

Dudi Boy

Reputation: 4865

I suggest a one-line awk script that prints the mkdir and mv commands without running them:

awk '{d=substr($1,8);printf("mkdir -p %s;mv %-38s %s\n",d,$0,d"/"$4);}' FS="_" <<< "$(ls *.tsv *.mtx)"

If the output commands are satisfactory, run the output:

bash <<< "$(awk '{d=substr($1,8);printf("mkdir -p %s;mv %-38s %s\n",d,$0,d"/"$4);}' FS="_" <<< "$(ls *.tsv *.mtx)")"

awk script explanation:

BEGIN{FS="_"} # set awk field separator to "_"
{ # for each file name input
  dir=substr($1,8); # extract dir name from 1st field
  printf("mkdir -p %s;mv %-38s %s\n", # format output command with printf
    dir, # 1st argument dir value
    $0,  # 2nd argument unchanged input file name
    dir"/"$4); # 3rd argument dir appended with 4th field
}
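
To check the explanation above on a single name, here is a small sketch that feeds one file name from the question to the same awk program and prints the command it would generate. Nothing is created or moved. The `cmd` variable and the use of `printf` instead of `ls` are my own choices for the demonstration, not part of the one-liner above.

```shell
#!/bin/bash
# Feed one sample name to the awk program and capture the generated command.
# The command is only printed, never executed.
cmd=$(printf '%s\n' 'GSM6025613_SJWLM030043_D1_barcodes.tsv' |
  awk -F_ '{
    d = substr($1, 8)                                   # "GSM6025613" -> "613"
    printf("mkdir -p %s;mv %-38s %s\n", d, $0, d "/" $4) # $4 is "barcodes.tsv"
  }')

echo "$cmd"
# prints: mkdir -p 613;mv GSM6025613_SJWLM030043_D1_barcodes.tsv 613/barcodes.tsv
```

Note that substr($1,8) relies on every name starting with exactly seven characters before the unique digits, which holds for the names shown in the question.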

Upvotes: 0

pjh

Reputation: 8084

Try this Shellcheck-clean code (maybe on a copy of your directory first!):

#! /bin/bash -p

shopt -s nullglob

for file in GSM6025[0-9][0-9][0-9]_*_{features.tsv,matrix.mtx,barcodes.tsv}
do 
    dir=${file#GSM6025}
    dir=${dir%%_*}

    newfile=${file##*_}

    mkdir -p -v -- "$dir"
    mv -n -v -- "$file" "$dir/$newfile"
done
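
In case the parameter expansions in the loop look cryptic, here is a minimal sketch that applies the same three expansions to one sample name from the question and only prints the result. It does not touch any files; the sample name is hard-coded purely for illustration.

```shell
#!/bin/bash
# One sample name from the question; no files are created or moved.
file='GSM6025613_SJWLM030043_D1_barcodes.tsv'

dir=${file#GSM6025}   # remove the leading "GSM6025"        -> 613_SJWLM030043_D1_barcodes.tsv
dir=${dir%%_*}        # remove the longest trailing "_*"     -> 613
newfile=${file##*_}   # remove the longest leading "*_"      -> barcodes.tsv

printf '%s -> %s/%s\n' "$file" "$dir" "$newfile"
# prints: GSM6025613_SJWLM030043_D1_barcodes.tsv -> 613/barcodes.tsv
```

A single # or % removes the shortest match and a doubled ## or %% removes the longest, which is why %%_* cuts at the first underscore and ##*_ cuts at the last one.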

Upvotes: 1

Diego Torres Milano

Reputation: 69208

You can use sed to split the file name into its components, something like this:

file='GSM6025615_SJWLM071563_D1_barcodes.tsv'
eval $(sed -E 's/(GSM....)([0-9]+)_(.*)_(.*)_(.*)\.tsv/n=\2;f=\5;/' <<< "$file")
echo $n
615
echo $f
barcodes

eval runs the output of sed (the n and f variable assignments) as shell code, so n and f become variables in the current shell. Note that the pattern above only matches names ending in .tsv.
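
If you would rather not eval text derived from file names, the same split can probably be done with bash's built-in regex matching. This is only a sketch under my own assumptions: the pattern in `re` is mine (it assumes four digits after GSM before the unique part, and accepts both .tsv and .mtx), and the `ext` variable is added for illustration.

```shell
#!/bin/bash
file='GSM6025615_SJWLM071563_D1_barcodes.tsv'

# Assumed pattern: "GSM" + 4 digits, then the unique digits, then the last
# underscore-separated word and a .tsv or .mtx extension.
re='^GSM[0-9]{4}([0-9]+)_.*_([^_.]+)\.(tsv|mtx)$'

if [[ $file =~ $re ]]; then
    n=${BASH_REMATCH[1]}     # unique digits, here 615
    f=${BASH_REMATCH[2]}     # base name, here barcodes
    ext=${BASH_REMATCH[3]}   # extension, here tsv
    echo "$n $f $ext"
else
    echo "no match: $file" >&2
fi
```

The captured groups land in the BASH_REMATCH array, so nothing from the file name is ever executed as code.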

Upvotes: 1
