lllllllllllll

Reputation: 9110

How to use `awk` to grep certain columns like these?

So basically I have some text like this:

[ 4] .init             PROGBITS        080481c0 0001c0 00002e 00  AX  0   0  4
[ 5] .plt              PROGBITS        080481f0 0001f0 000110 00  AX  0   0 16
[ 6] .text             PROGBITS        08048300 000300 07c95c 00  AX  0   0 16
[ 7] __libc_thread_fre PROGBITS        080c4c60 07cc60 000076 00  AX  0   0 16
[ 8] __libc_freeres_fn PROGBITS        080c4ce0 07cce0 000b2f 00  AX  0   0 16
[ 9] .fini             PROGBITS        080c5810 07d810 00001a 00  AX  0   0  4
[10] .rodata           PROGBITS        080c5840 07d840 019774 00   A  0   0 32
[11] __libc_thread_sub PROGBITS        080defb4 096fb4 000004 00   A  0   0  4
[12] __libc_subfreeres PROGBITS        080defb8 096fb8 00002c 00   A  0   0  4
[13] __libc_atexit     PROGBITS        080defe4 096fe4 000004 00   A  0   0  4

What I am trying to get is this:

.init                    080481c0 0001c0 00002e 
.plt                     080481f0 0001f0 000110 
.text                    08048300 000300 07c95c 
__libc_thread_fre        080c4c60 07cc60 000076 
__libc_freeres_fn        080c4ce0 07cce0 000b2f  
.fini                    080c5810 07d810 00001a 
.rodata                  080c5840 07d840 019774 
__libc_thread_sub        080defb4 096fb4 000004 
__libc_subfreeres        080defb8 096fb8 00002c  
__libc_atexit            080defe4 096fe4 000004 

I tried something like this:

 awk '/PROGBITS/ {print $2,$4,$5,$6} '

but the problem is that there is a space inside the brackets in [ 4] through [ 9], which means that for those lines I have to use

awk '/PROGBITS/ {print $3,$5,$6,$7} '

Is there any way to get all the columns I want with a single command?

Upvotes: 4

Views: 169

Answers (8)

gbrener

Reputation: 5835

You can add a field-separator option with -F:

awk -F'^\\[ *[0-9]+\\] | +' '{printf "%-24s %-8s %-6s %-6s\n", $2, $4, $5, $6}' file

The regular expression passed as the field separator consumes the bracketed index at the start of each line, whether or not it contains a space, so the remaining fields are numbered the same way on every line.
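As a quick check of that claim, here is a minimal sketch that runs the same separator over two rows copied from the question, one with a space inside the brackets and one without. The `sample` variable is only a stand-in for the real file, and the behavior of `^` inside a field separator is assumed to be that of GNU awk.

```shell
# Two rows from the question: "[ 9]" has a space inside the brackets, "[10]" does not.
sample='[ 9] .fini             PROGBITS        080c5810 07d810 00001a 00  AX  0   0  4
[10] .rodata           PROGBITS        080c5840 07d840 019774 00   A  0   0 32'

# Print the field count, the section name ($2) and the address ($4) for each row.
printf '%s\n' "$sample" |
  awk -F'^\\[ *[0-9]+\\] | +' '{print NF, $2, $4}'
```

If the separator behaves as described, both rows report the same field count and the section name lands in $2 either way ($1 is the empty string before the bracketed index).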

Upvotes: 2

jrjc

Reputation: 21873

You can also try:

awk '/PROGBITS/{print $(NF-9),$(NF-7),$(NF-6),$(NF-5)}' file

To keep the output readable, you can set the column widths with printf:

awk '/PROGBITS/{printf "%-18s %-10s %-10s %-10s\n", $(NF-9),$(NF-7),$(NF-6),$(NF-5)}' file
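To see why counting from the end works, the following sketch prints NF next to the selected fields for two rows copied from the question. The `sample` variable is just a stand-in for the file. It assumes every row has the same number of columns after the section name, which would not hold for a row whose flags column is empty.

```shell
# Two rows from the question: "[ 9]" splits into two fields, "[10]" into one.
sample='[ 9] .fini             PROGBITS        080c5810 07d810 00001a 00  AX  0   0  4
[10] .rodata           PROGBITS        080c5840 07d840 019774 00   A  0   0 32'

# NF differs between the rows, but the distance from the last field does not.
printf '%s\n' "$sample" |
  awk '/PROGBITS/ {print NF, $(NF-9), $(NF-7)}'
```

The first row has one more field than the second, yet $(NF-9) is the section name and $(NF-7) the address in both.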

It is also possible that your file uses \t (tabs) as field separators; if so, you may try:

awk -F"\t" '{print $2,$4,$5,$6}' file

Hope this helps.

Upvotes: 3

mklement0

Reputation: 437121

A sed solution (GNU sed and FreeBSD/OS X sed) - tip of the hat to @Tiago's helpful Perl solution:

sed -E 's/^.*\] (.*)PROGBITS( +[^ ]+)( +[^ ]+)( +[^ ]+).*$/\1 \2 \3 \4/' file
  • Uses a regular expression that matches the entire line, with capture groups, written (...), matching the data of interest (including preceding whitespace), and then replaces the line with only that data: \1 refers to the 1st capture group's match, \2 to the 2nd, and so on.

Note that it can be done in a POSIX-compliant manner, but it gets ugly:

sed 's/^.*\] \(.*\)PROGBITS\( \{1,\}[^ ]\{1,\}\)\( \{1,\}[^ ]\{1,\}\)\( \{1,\}[^ ]\{1,\}\).*$/\1 \2 \3 \4/' file

Upvotes: 1

mklement0

Reputation: 437121

If all you need is to extract the columns as specified, cut will do:

cut -c 6-22,32-62 file

Upvotes: 3

Jotne

Reputation: 41446

With GNU awk you can handle fixed-width fields elegantly by setting FIELDWIDTHS. It also preserves the column alignment.

awk -v FIELDWIDTHS="5 18 16 8 7 8" '{print $2,$4,$5,$6}' file
.init              080481c0  0001c0  00002e
.plt               080481f0  0001f0  000110
.text              08048300  000300  07c95c
__libc_thread_fre  080c4c60  07cc60  000076
__libc_freeres_fn  080c4ce0  07cce0  000b2f
.fini              080c5810  07d810  00001a
.rodata            080c5840  07d840  019774
__libc_thread_sub  080defb4  096fb4  000004
__libc_subfreeres  080defb8  096fb8  00002c
__libc_atexit      080defe4  096fe4  000004

Upvotes: 3

Tiago Lopo

Reputation: 7959

If you can use perl:

perl -lne '/\] \K(.*)PROGBITS\s+(\w+)\s+(\w+)\s+(\w+)/ && print "$1 $2 $3 $4" '

In action:

perl -lne '/\] \K(.*)PROGBITS\s+(\w+)\s+(\w+)\s+(\w+)/ && print "$1 $2 $3 $4" ' file
.init              080481c0 0001c0 00002e
.plt               080481f0 0001f0 000110
.text              08048300 000300 07c95c
__libc_thread_fre  080c4c60 07cc60 000076
__libc_freeres_fn  080c4ce0 07cce0 000b2f
.fini              080c5810 07d810 00001a
.rodata            080c5840 07d840 019774
__libc_thread_sub  080defb4 096fb4 000004
__libc_subfreeres  080defb8 096fb8 00002c
__libc_atexit      080defe4 096fe4 000004

Upvotes: 2

Maroun

Reputation: 95948

You can simply remove any whitespace right after the [:

sed 's_\[\s_[_'

For example:

echo '[ 1]' | sed 's_\[\s_[_'

It'll print [1].
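Once the index is normalized this way, the awk command from the question should work unchanged. A minimal sketch of the combined pipeline, using two rows copied from the question as a stand-in for the file; note that it matches a literal space instead of \s, since \s is a GNU sed extension:

```shell
# Two rows from the question: "[ 9]" has a space inside the brackets, "[10]" does not.
sample='[ 9] .fini             PROGBITS        080c5810 07d810 00001a 00  AX  0   0  4
[10] .rodata           PROGBITS        080c5840 07d840 019774 00   A  0   0 32'

# Drop the space after "[" so that "[ 9]" becomes "[9]", then use the original field numbers.
printf '%s\n' "$sample" |
  sed 's_\[ _[_' |
  awk '/PROGBITS/ {print $2, $4, $5, $6}'
```

The same pipeline can read the real input by replacing the printf with the command that produces it.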

Upvotes: 1

ooga

Reputation: 15501

Try this:

awk '/PROGBITS/ {if (NF==12) print $3,$5,$6,$7; else print $2,$4,$5,$6}'

Upvotes: 0
