mirix

Reputation: 523

Add leading zeroes to awk variable

I have the following awk command within a "for" loop in bash:

awk -v pdb="$pdb" 'BEGIN {file = 1; filename = pdb"_" file ".pdb"}
 /ENDMDL/ {getline; file ++; filename = pdb"_" file ".pdb"}
 {print $0 > filename}' < ${pdb}.pdb 

Each iteration reads a file named $pdb.pdb and splits it into files called $pdb_1.pdb, $pdb_2.pdb, ..., $pdb_21.pdb, etc. However, I would like the output files to be named $pdb_01.pdb, $pdb_02.pdb, ..., $pdb_21.pdb, i.e., I want to pad the "file" variable with leading zeros.

I have tried using printf in different ways, without success. Help would be much appreciated.

Upvotes: 28

Views: 62805

Answers (5)

RARE Kpop Manifesto

Reputation: 2811

Here's a VERY unconventional way of leveraging OFS to pad with zeros:

jot 10 1 - 12333337 | 

mawk '(___ = __ - length($_)) <= _ || $++___ = $_ ($_=_)' OFS=0 __=23

00000000000000000000001
00000000000000012333338
00000000000000024666675
00000000000000037000012
00000000000000049333349
00000000000000061666686
00000000000000074000023
00000000000000086333360
00000000000000098666697
00000000000000111000034

The padding doesn't have to be zeros, either. The same approach works just as well for padding with emojis:

jot 10 1 - 12333337 | 

mawk2 '  (___ = __-length($_)) <=_ || 
         $++___ = $_ ($_ = _)' OFS='\360\237\246\201' __=17 |

gawk -e '$++NF = length($1)'

šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦1 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦12333338 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦24666675 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦37000012 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦49333349 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦61666686 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦74000023 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦86333360 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦98666697 17
šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦šŸ¦111000034 17
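
The one-liners above are deliberately terse. A more readable sketch of the same OFS idea might look like the following; the helper name pad_ofs and the variables w, n and v are made up for illustration and are not part of the original answer. It relies on the fact that assigning to a field beyond NF makes awk rebuild the record, joining the (empty) intermediate fields with OFS.

```shell
# pad_ofs WIDTH PAD: hypothetical helper that left-pads each input line
# to WIDTH characters by letting awk insert PAD (as OFS) between empty fields.
pad_ofs() {
    awk -v w="$1" -v OFS="$2" '{
        n = w - length($0)      # number of pad strings still needed
        if (n > 0) {
            v = $0              # remember the original line
            $0 = ""             # reset the record, so NF becomes 0
            $(n + 1) = v        # fields 1..n are empty; rebuilding $0 inserts n copies of OFS
        }
        print
    }'
}

printf '%s\n' 1 12333338 | pad_ofs 23 0
```

Lines that are already WIDTH characters or longer are printed unchanged.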

Upvotes: 0

ThomasMcLeod

Reputation: 7769

This does it without resorting to printf, which is expensive. The first parameter is the string to pad; the second is the total length after padding.

echo 722 8 | awk '{ s = ""; for (c = 0; c < $2; c++) s = s "0"; s = s $1; print substr(s, 1 + length(s) - $2) }'

If you know in advance the length of the result string, you can use a simplified version (say 8 is your limit):

echo 722 | awk '{ s = "00000000"$1; print substr(s, 1 + length(s) - 8); }'

The result in both cases is 00000722.

Upvotes: 3

James Brown

Reputation: 37404

Here is a function that left- or right-pads values with zeroes, depending on the parameters: zeropad(value, count, direction)

function zeropad(s,c,d) {
    if(d!="r")             
        d="l"                # l is the default and fallback value
    return sprintf("%" (d=="l"? "0" c:"") "d" (d=="r"?"%0" c-length(s) "d":""), s,"")
}
{                            # test main
    print zeropad($1,$2,$3)
}

Some test data:

$ cat test
2 3 l
2 4 r
2 5
a 6 r

The result:

$ awk -f program.awk test
002
2000
00002
000000

It's not fully battle-tested, so strange parameters may yield strange results.

Upvotes: 1

JJ.

Reputation: 5475

Here's how to create leading zeros with awk:

# echo 1 | awk '{ printf("%02d\n", $1) }'
01
# echo 21 | awk '{ printf("%02d\n", $1) }'
21

Replace the 2 in %02d with the total number of digits you need (including the leading zeros). The width is a minimum: longer numbers are printed in full, not truncated.
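
If the width is only known at run time, one option is to build the format string by concatenation. This is a minimal sketch; the helper name zero_pad and the variables n and w are hypothetical and only serve the illustration.

```shell
# zero_pad NUMBER WIDTH: hypothetical helper that prints NUMBER
# left-padded with zeros to at least WIDTH digits.
zero_pad() {
    # "%0" w "d\n" concatenates to e.g. "%03d\n" when w is 3
    awk -v n="$1" -v w="$2" 'BEGIN { printf("%0" w "d\n", n) }'
}

zero_pad 7 3      # prints 007
zero_pad 21 2     # prints 21
zero_pad 123 2    # prints 123
```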

Upvotes: 44

glglgl

Reputation: 91049

When building the output filename, replace file with sprintf("%02d", file).

Or even replace the whole assignment with filename = sprintf("%s_%02d.pdb", pdb, file);.
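
Applied to the command from the question, the second suggestion could look roughly like this. The scratch directory and the tiny sample input file are assumptions made only so the sketch runs on its own; the awk logic is otherwise unchanged from the question.

```shell
# Work in a scratch directory so the sketch does not touch real files.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Hypothetical sample input: two models separated by an ENDMDL line.
pdb=sample
printf '%s\n' 'MODEL 1' 'ATOM a' 'ENDMDL' 'MODEL 2' 'ATOM b' > "${pdb}.pdb"

awk -v pdb="$pdb" '
    BEGIN    { file = 1; filename = sprintf("%s_%02d.pdb", pdb, file) }
    # On ENDMDL: skip to the next line and switch to the next output file.
    /ENDMDL/ { getline; file++; filename = sprintf("%s_%02d.pdb", pdb, file) }
             { print $0 > filename }
' < "${pdb}.pdb"

ls    # lists sample.pdb, sample_01.pdb and sample_02.pdb
```

With more than 99 output files the names simply grow to three digits (for example sample_100.pdb); use %03d if every name should have the same length.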

Upvotes: 37
