Peter Chung

Reputation: 1122

Linux: merge multiple BAM files with Picard

I have ten directories, each containing around 10-12 BAM files. I need to merge the files in each directory with Picard, and I am looking for a better way to do it than writing out every input.

basic command:
java -jar picard.jar MergeSamFiles \
  I=input_1.bam \
  I=input_2.bam \
  O=merged_files.bam

directory 1:
java -jar picard.jar MergeSamFiles \
  I=input_16.bam \
  I=input_28.bam \
  I=input_81.bam \
  I=input_34.bam \
  ... \
  ... \
  I=input_10.bam \
  O=merged_files.bam

directory 2:
java -jar picard.jar MergeSamFiles \
  I=input_44.bam \
  I=input_65.bam \
  I=input_181.bam \
  I=input_384.bam \
  ... \
  ... \
  I=input_150.bam \
  O=merged_files.bam

How can I build the I= arguments from a variable when the file names are not sequential? I would also like to loop over the ten directories, but each one contains a different number of BAM files.

Should I use Python or R for this, or stay with a shell script? Please advise.

Upvotes: 0

Views: 769

Answers (1)

Niema Moshiri

Reputation: 937

Why not use samtools?

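for folder in my_bam_folders/*/; do
    folder=${folder%/}    # drop the trailing slash left by the */ glob
    # Quote the variables so directory names with spaces still work.
    samtools merge "$folder.bam" "$folder"/*.bam
done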

In general, samtools merge can merge all the BAM files in the current directory like this:

samtools merge merged.bam *.bam

EDIT: If samtools isn't an option and you have to use Picard, what about something like this?

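for folder in my_bam_folders/*/; do
    folder=${folder%/}    # drop the trailing slash left by the */ glob
    # Collect one I=<file> argument per BAM file in a bash array,
    # so paths with spaces survive intact.
    inputs=()
    for f in "$folder"/*.bam; do inputs+=("I=$f"); done
    java -jar picard.jar MergeSamFiles "${inputs[@]}" O="$folder.bam"
done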
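
If some of the directories might be empty, or you want to look at the command before anything runs, a small helper along these lines could work. This is only a sketch: picard_merge_cmd is a made-up name (not part of Picard), it prints the command instead of running it, and it assumes picard.jar sits in the current directory as in the question.

```shell
# Hypothetical helper: print the MergeSamFiles command for one directory.
# Uses only POSIX sh features, so it should work in bash, dash, etc.
picard_merge_cmd() {
    dir=${1%/}                       # tolerate a trailing slash
    set --                           # reuse "$@" as the list of I= arguments
    for f in "$dir"/*.bam; do
        [ -e "$f" ] || continue      # skip the unexpanded pattern if nothing matched
        set -- "$@" "I=$f"
    done
    if [ "$#" -eq 0 ]; then
        echo "no BAM files in $dir" >&2
        return 1
    fi
    # Printed for inspection only; paths containing spaces are not re-quoted here.
    echo java -jar picard.jar MergeSamFiles "$@" "O=$dir.bam"
}
```

Calling picard_merge_cmd on each directory in the loop above prints one command per directory, and a directory with no BAM files gives a message on stderr instead of a broken Picard call. Once the output looks right, replacing the final echo with the actual java call would run the merge.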

Upvotes: 2
