Reputation: 790
I have the commands below, which I would like to make a bit smarter in two respects.
I would like to shorten the for statement, to something like:
for i in seq `1 22` X;
Would that work?
I would also like to make the awk statement a bit smarter, something like:
awk '{print $1,$2,'$i',$4-$10,$12-$21}'
But that subtracts the value of column 10 from column 4, and column 21 from column 12. I want it to print columns 4 through 10 and 12 through 21 instead. How do I do that?
Thanks a lot!
Sander
The original commands are below:
grep 'alternate_ids' 1000g/aegscombo_pp_1000G_sum_stat_chrX.out > 1000g/aegscombo_pp_1000G_sum_stat_allchr.txt
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X;
do
    echo "Grepping data for chromosome: "$i
    tail -n +13 1000g/aegscombo_pp_1000G_sum_stat_chr$i.out | wc -l
    tail -n +13 1000g/aegscombo_pp_1000G_sum_stat_chr$i.out |
        awk '{print $1,$2,'$i',$4,$5,$6,$7,$8,$9,$10,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21}' \
        >> 1000g/aegscombo_pp_1000G_sum_stat_allchr.txt
done
Upvotes: 0
Views: 323
Reputation: 203985
Any time you write a loop in shell just to manipulate text, you have the wrong approach. The shell is just an environment from which to call tools, and the UNIX tool for general-purpose text processing is awk. Your script should look something like this:
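awk '
BEGIN {
    # Queue the per-chromosome files (1-22, then X) as extra input files.
    for (i=1; i<=22; i++) {
        ARGV[ARGC++] = "1000g/aegscombo_pp_1000G_sum_stat_chr" i ".out"
    }
    ARGV[ARGC++] = "1000g/aegscombo_pp_1000G_sum_stat_chrX.out"
}
# First pass over chrX (named on the command line): keep only the header.
NR == FNR {
    if (/alternate_ids/) {
        print
    }
    next
}
# At the start of each queued file, derive the chromosome from its name.
FNR == 1 {
    chr = FILENAME
    gsub(/^.*chr|\.out$/,"",chr)
    print "Grepping data for chromosome:", chr | "cat>&2"
}
# Skip the first 12 lines of each file, like tail -n +13 in the question.
FNR < 13 { next }
# Print fields 1-21 except 11, with the chromosome in place of field 3.
{
    for (i=1; i<=21; i++) {
        if (i == 11) continue
        printf "%s%s", (i==3?chr:$i), (i<21?OFS:ORS)
    }
}
' 1000g/aegscombo_pp_1000G_sum_stat_chrX.out > 1000g/aegscombo_pp_1000G_sum_stat_allchr.txt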
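The field loop in the script above is also what answers the "4 through 10" part of the question: awk has no range syntax for fields, so a loop over field numbers stands in for one. Here is a minimal sketch of just that idea on made-up input. The function name `print_ranges` and the 12-column sample line are assumptions for illustration and are not from the original post.

```shell
# Hypothetical helper: print fields 1-2, a chromosome label in place of
# field 3, then every field from 4 onward except field 11.
print_ranges() {
    # The label arrives as the first argument and is handed to awk with -v,
    # rather than being spliced into the program text between quotes.
    awk -v chr="$1" '{
        out = $1 OFS $2 OFS chr
        for (i = 4; i <= NF; i++) {
            if (i != 11) out = out OFS $i
        }
        print out
    }'
}

# Made-up sample line with 12 columns.
echo "c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12" | print_ranges X
```

On the sample line this prints `c1 c2 X c4 c5 c6 c7 c8 c9 c10 c12`. Building the line in a variable avoids having to decide inside the loop which field is the last one printed.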
Upvotes: 1
Reputation: 80992
for i in {1..22} X; do
For the awk part: if the fields you want to drop are fewer than the fields you want to keep, you could try emptying the unwanted fields and then printing the whole line.
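The field-emptying idea can be sketched roughly as follows. The function name `drop_field_11` and the sample line are made up for illustration, and the sketch assumes single-space-separated input with no legitimately empty fields. Emptying a field leaves its separator behind, so the doubled space is collapsed before printing.

```shell
# Hypothetical helper: overwrite field 3 with a label, empty field 11,
# and print the rebuilt line.
drop_field_11() {
    awk -v chr="$1" '{
        $3 = chr            # assigning to a field rebuilds $0 using OFS
        $11 = ""            # the field is now empty, but its separator stays
        gsub(/  +/, " ")    # collapse the doubled space left behind in $0
        print
    }'
}

# Made-up sample line with 12 columns.
echo "c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12" | drop_field_11 X
```

On the sample line this prints `c1 c2 X c4 c5 c6 c7 c8 c9 c10 c12`. If the emptied field were the last one on the line, a trailing space would remain, so this approach fits best when the dropped fields sit in the middle of the record.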
Upvotes: 1