Reputation: 2189
I have some bam files in my input directory, and for each bam file I want to calculate the number of mapped reads (using the samtools view command) and print that number, along with the name of the bam file, to an output file. The code runs, but I am not getting the output I want.
Here is what my code looks like:
for file in input/*;
do
echo $file >> output;
samtools view -F 4 $file | wc -l >> output;
done
This works, but the problem is that it outputs the name of the file and the number of reads on separate lines. Here is an example:
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam
1784867
sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam
2280544
I tried to convert the newline characters to tabs by doing this:
for file in input/*;
do
echo $file >> output;
samtools view -F 4 $file | wc -l >> output;
tr '\n' '\t' < output > output2
done
Here is the resulting output:
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam 1784867 sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam 2280544
How can I now insert a newline character after each read count? For example:
sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam 1784867
sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam 2280544
Thanks
Upvotes: 1
Views: 84
Reputation: 531798
Just use a command substitution:
for file in input/*
do
printf '%s\t%d\n' "$file" "$(samtools view -F 4 "$file" | wc -l)"
done >> output
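The same idea can be wrapped in a small function. This is only a sketch: the names `count_mapped` and `report_mapped` are made up for illustration, and it assumes the files end in `.bam`. It uses the `-c` option of `samtools view`, which prints the number of matching records, so the pipe to `wc -l` is not needed.

```shell
# Hypothetical helper: print the number of mapped reads in one BAM file.
# -c makes samtools print only the count; -F 4 excludes unmapped reads.
count_mapped() {
    samtools view -c -F 4 "$1"
}

# Hypothetical helper: print "name<TAB>count" for every .bam file in a directory.
report_mapped() {
    for file in "$1"/*.bam
    do
        [ -e "$file" ] || continue   # skip the literal pattern if nothing matched
        printf '%s\t%d\n' "$file" "$(count_mapped "$file")"
    done
}
```

Running `report_mapped input >> output` would then append one tab-separated line per file.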
Upvotes: 1
Reputation: 679
You could get the desired output by writing everything in one line. Something like:
echo -e "$file\t$(samtools view -F 4 "$file" | wc -l)" >> output;
If you want to do it in two pieces, note that echo has a -n option to suppress the trailing newline, and -e to interpret escapes like \t, so you could do:
echo -ne "$file\t" >> output
samtools view -F 4 "$file" | wc -l >> output
Writing what you want the first time is cleaner than trying to post-process your output.
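Not every shell's echo treats -n and -e the same way, so here is a sketch of the two-piece approach using printf instead. To keep it runnable without samtools, `count_lines` is a hypothetical stand-in for the `samtools view -F 4 ... | wc -l` pipeline, and `write_row` is an illustrative name.

```shell
# Hypothetical stand-in for the samtools pipeline: prints the line count of a file.
count_lines() {
    awk 'END { print NR }' "$1"
}

# Write one row in two pieces: the name and a tab first, then the count.
write_row() {
    printf '%s\t' "$1"   # file name plus tab, no newline yet
    count_lines "$1"     # the count and its trailing newline finish the row
}

# Small demonstration on a three-line temporary file.
demo=$(mktemp)
printf 'a\nb\nc\n' > "$demo"
write_row "$demo"
```

Because the first piece ends without a newline, whatever the second command prints lands on the same line.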
Upvotes: 1
Reputation: 97
If the output of every file definitely consists of a filename and a number, I think you can easily change
tr '\n' '\t' < output > output2
to
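tr '\n' '\t' < output | sed -r 's/([0-9]+)\t/\1\n/g' > output2
It will match every number followed by a tab and replace that tab with a newline character. The g flag is needed because after tr the whole file is a single line, and without it only the first match would be changed.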
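If the file strictly alternates between a name line and a count line, paste may be a simpler alternative to tr and sed. This is a sketch of a different technique, not the pipeline above; the demo file names are made up and the sample values are copied from the question.

```shell
# Recreate the two-lines-per-file layout from the question in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' \
    'sample_data/wgEncodeUwRepliSeqBg02esG1bAlnRep1.bam' '1784867' \
    'sample_data/wgEncodeUwRepliSeqBg02esG2AlnRep1.bam' '2280544' > "$demo_dir/output"

# Each "-" tells paste to read one line from standard input, so consecutive
# pairs of lines are joined with a tab (paste's default delimiter).
paste - - < "$demo_dir/output" > "$demo_dir/output2"
cat "$demo_dir/output2"
```

This avoids relying on digits to find the line boundaries, so it also works when a file name happens to end in a number.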
Upvotes: 1