Zubin

Reputation: 9712

How to format text with multiple separators

I'd like to extract/format some text with awk.

The source text looks like this:

Section 1:
  main_command some_command1 some_subcommand1      # comment 1
  main_command some_command1 some_subcommand2      # comment 2

Section 2:
  main_command some_command2 some_subcommand3      # comment 3
  main_command some_command2 some_subcommand4      # comment 4

Section 3:
  main_command some_command3 some_subcommand5      # comment 5
  main_command some_command3 some_subcommand6      # comment 6

I want to know how to:

  1. filter to the indented lines under Section 2;
  2. specify which column I want (2 or 3); and
  3. extract the comments (after # ).

For example, if I chose column 2 the output would be:

some_command2<tab>'comment 3'
some_command2<tab>'comment 4'

I've used awk to achieve 1 and 2:

  awk -v RS='\n\n' '/^Section 2:/' "$path" | awk '/^  main_command/ {print $2}'

... but I suspect there's a better way to do it all without piping. I'm open to using other tools (e.g. sed).

Upvotes: 1

Views: 46

Answers (3)

markp-fuso

Reputation: 34324

One awk idea:

awk -v sec=1 -v col=3 '                                 # define section and column to process
/^Section/      { process= ($2 == sec":") ? 1 : 0
                  next
                }
process && NF>0 { split($0,arr,"#")
                  sub(/^[[:space:]]+/,"",arr[2])
                  print $(col) "\t\047" arr[2] "\047"
                }
' "${path}"

For sec=1 and col=3 this generates:

some_subcommand1        'comment 1'
some_subcommand2        'comment 2'

For sec=2 and col=2 this generates:

some_command2   'comment 3'
some_command2   'comment 4'
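
To try this without editing the script each time, the same logic can be wrapped in a small shell function. This is only a sketch: the function name extract_section and its argument order are made up for illustration, and it strips the comment with sub() instead of split(), so a second # inside a comment would be kept.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer above):
#   extract_section SECTION_NUMBER COLUMN [FILE...]
# With no FILE arguments, awk reads standard input.
extract_section() {
    sec=$1 col=$2
    shift 2
    awk -v sec="$sec" -v col="$col" '
    # A header line switches processing on or off for the lines that follow.
    /^Section/      { process = ($2 == (sec ":")); next }
    # Non-empty lines inside the wanted section: print the column and the comment.
    process && NF>0 { comment = $0
                      sub(/^[^#]*#[[:space:]]*/, "", comment)   # drop everything up to the first "#"
                      print $(col) "\t\047" comment "\047"      # \047 is a single quote
                    }
    ' "$@"
}

# Sample input copied from the question, fed on standard input.
extract_section 2 2 <<'EOF'
Section 1:
  main_command some_command1 some_subcommand1      # comment 1
  main_command some_command1 some_subcommand2      # comment 2

Section 2:
  main_command some_command2 some_subcommand3      # comment 3
  main_command some_command2 some_subcommand4      # comment 4

Section 3:
  main_command some_command3 some_subcommand5      # comment 5
  main_command some_command3 some_subcommand6      # comment 6
EOF
# prints (with a real tab between the two fields):
# some_command2   'comment 3'
# some_command2   'comment 4'
```

Passing the section number and column as arguments keeps the awk program itself fixed; extract_section 1 3 "$path" would give the sec=1, col=3 output shown earlier.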

Upvotes: 1

anubhava

Reputation: 785128

You may use this awk solution that works with any version of awk:

awk -v sq="'" -v OFS='\t' -v n=1 '
$1 == "Section" {
   p = ($2 == "2:")
   next
}
NF && p {
   s = $0
   sub(/^[^#]*#[[:blank:]]*/, "", s)
   print $n, sq s sq
}' file

main_command    'comment 3'
main_command    'comment 4'

Using n=2 for printing column 2:

awk -v sq="'" -v OFS='\t' -v n=2 '$1 == "Section" {p = ($2 == "2:"); next} NF && p {split($0, a, /#[[:blank:]]*/); print $n, sq a[2] sq}' file

some_command2   'comment 3'
some_command2   'comment 4'

Upvotes: 1

Ed Morton

Reputation: 203483

$ cat tst.awk
BEGIN { OFS="\t" }
/^[^[:space:]]/ {
    this_sect = $0
    next
}
NF && (this_sect == sect) {
    val = $col
    sub(/[^#]*#[[:space:]]*/,"")
    print val, "\047" $0 "\047"
}

$ awk -v sect='Section 2:' -v col=2 -f tst.awk file
some_command2   'comment 3'
some_command2   'comment 4'
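
A variant sketch, not part of the script above: since the sections in the question are separated by blank lines, awk's paragraph mode (an empty RS) can read each section as one record, which is close to what the question's RS='\n\n' attempt was after. The function name section_column is hypothetical, and the sketch assumes every section is a header line followed by its indented lines with no blank lines in between.

```shell
#!/bin/sh
# Hypothetical helper: section_column 'SECTION HEADER' COLUMN [FILE...]
# With no FILE arguments, awk reads standard input.
section_column() {
    sect=$1 col=$2
    shift 2
    # RS= (empty) selects paragraph mode: records are blocks separated by blank lines.
    # -F'\n' makes every line of a block its own field, so $1 is the header line.
    awk -v sect="$sect" -v col="$col" -v RS= -F'\n' '
    $1 == sect {
        for (i = 2; i <= NF; i++) {
            split($i, w, " ")                         # whitespace-split one line into words
            comment = $i
            sub(/^[^#]*#[[:space:]]*/, "", comment)   # keep the text after the first "#"
            print w[col] "\t\047" comment "\047"
        }
    }' "$@"
}

# Assumed usage, with the question's sample saved as "file":
#   section_column 'Section 2:' 2 file
```

Compared with tst.awk, this trades the line-by-line state variable for one comparison per section, at the cost of relying on the blank-line layout.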

Upvotes: 1
