Reputation: 692
I have this odd condition where I've been given a series of HEX values that represent binary data. The interesting thing is that they are occasionally different lengths, such as:
40000001AA
0000000100
A0000001
000001
20000001B0
40040001B0
I would like to append 0's to the end to make them all the same length, based on the longest entry. In the example above I have four entries that are 10 characters long, each terminated by '\n', and a few short ones (in the actual data, I have 200k entries with about 1k short ones). What I would like to do is figure out the longest string in the file and then go through and pad the short ones; however, I haven't been able to figure out how. Any suggestions would be appreciated.
Upvotes: 1
Views: 303
Reputation: 203324
In general, to zero-pad a string on either or both sides (using 5 as the desired field width, for example):
$ echo '17' | awk '{printf "%0*s\n", 5, $0}'
00017
$ echo '17' | awk '{printf "%s%0*s\n", $0, 5-length(), ""}'
17000
$ echo '17' | awk '{w=int((5+length())/2); printf "%0*s%0*s\n", w, $0, 5-w, ""}'
01700
$ echo '17' | awk '{w=int((5+length()+1)/2); printf "%0*s%0*s\n", w, $0, 5-w, ""}'
00170
so for your example:
$ awk '{cur=length()} NR==FNR{max=(cur>max?cur:max);next} {printf "%s%0*s\n", $0, max-cur, ""}' file file
40000001AA
0000000100
A000000100
0000010000
20000001B0
40040001B0
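One caveat, offered as a hedged aside: POSIX only defines the `0` flag for numeric conversions, so zero-padding a string with `%0*s` depends on the awk implementation in use. A minimal sketch that avoids the flag by appending zeros in a loop is shown here; the function name `pad_hex_portable` is hypothetical and not part of the answer above.

```shell
# pad_hex_portable FILE  (hypothetical helper name)
# Two-pass awk: pass 1 finds the longest line, pass 2 right-pads with "0".
pad_hex_portable() {
    awk '
        # First pass over the file: record the maximum line length.
        NR == FNR { if (length($0) > max) max = length($0); next }
        # Second pass: append zeros until the line reaches that length.
        {
            line = $0
            while (length(line) < max) line = line "0"
            print line
        }
    ' "$1" "$1"
}
```

Usage would be `pad_hex_portable file`. It trades the compact printf call for a loop, which should behave the same way across awk implementations.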
Upvotes: 1
Reputation: 785058
Using standard two-pass awk:
awk 'NR==FNR{if (len < length()) len=length(); next}
{s = sprintf("%-*s", len, $0); gsub(/ /, "0", s); print s}' file file
40000001AA
0000000100
A000000100
0000010000
20000001B0
40040001B0
Or using GNU wc with awk:
awk -v len="$(wc -L < file)" '
{s = sprintf("%-*s", len, $0); gsub(/ /, "0", s); print s}' file
40000001AA
0000000100
A000000100
0000010000
20000001B0
40040001B0
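Both commands above need a regular file, since one reads it twice and the other runs wc on it first. If the data arrives on a pipe instead, a single-pass variant can buffer the lines and pad them in the END block. This is only a sketch, assuming the input fits in memory (200k short lines should); `pad_hex_stream` is a made-up name.

```shell
# pad_hex_stream  (hypothetical helper name): reads stdin, writes padded lines.
pad_hex_stream() {
    awk '
        # Remember every line and track the longest length seen so far.
        { lines[NR] = $0; if (length($0) > max) max = length($0) }
        END {
            # Build a string of max zeros once, then slice off what each line needs.
            zeros = ""
            while (length(zeros) < max) zeros = zeros "0"
            for (i = 1; i <= NR; i++)
                print lines[i] substr(zeros, 1, max - length(lines[i]))
        }
    '
}
```

For example, `some_command | pad_hex_stream` or `pad_hex_stream < file`.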
Upvotes: 3
Reputation: 12383
As you use Bash, there is a good chance that you also use other GNU tools. In that case wc can easily tell you the length of the longest line in the file using its -L option. Example:
$ wc -L /tmp/HEX
10 /tmp/HEX
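Since -L is a GNU extension to wc, here is a hedged sketch of a portable stand-in that uses awk to report the longest line length. The function name `longest_line` is hypothetical. Note that it counts characters, whereas GNU wc -L reports display width; for plain hex digits the two agree.

```shell
# longest_line FILE  (hypothetical helper name)
# Prints the length of the longest line; "max + 0" prints 0 for an empty file.
longest_line() {
    awk 'length($0) > max { max = length($0) } END { print max + 0 }' "$1"
}
```

It can be dropped in wherever the output of `wc -L < file` is used, for example `n=$(longest_line /tmp/HEX)`.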
Padding can be done like this:
$ while read i; do echo $(echo "$i"0000000000 | head -c 10); done < /tmp/HEX
40000001AA
0000000100
A000000100
0000010000
20000001B0
40040001B0
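A one-liner that takes the width from wc -L instead of hard-coding it (the string of zeros is built once, and each padded line is truncated to the target width):
n=$(wc -L < /tmp/HEX); pad=$(printf '%*s' "$n" '' | tr ' ' 0); while read -r i; do printf '%.*s\n' "$n" "$i$pad"; done < /tmp/HEX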
Upvotes: 2
Reputation: 650
Let's suppose you have these values in a file:
file=/tmp/hex.txt
Find the length of the longest number:
longest=$(wc -L < "$file")
Now pad each number in the file with trailing zeroes:
while read -r number; do
    printf "%-${longest}s\n" "$number" | sed 's/ /0/g'
done < "$file"
This is what the script prints to stdout:
40000001AA
0000000100
A000000100
0000010000
20000001B0
40040001B0
Upvotes: 1