theuniverseisflat

Reputation: 881

Advanced AWK formatting

I am having a problem with this awk command. It does not produce the result I want from the input file below. Can someone help me with this, please?

I am searching for a "Class:" value of "ABC". When I find ABC, I would like to assign the values associated with userName, serviceList, hostList and portNumber to variables (please see the output section). This is the command I tried:

   awk -v q="\"" '/ABC/{f=1;c++} 
   f && /userName|serviceList|hostList|portNumber/
                                 {sub(":",c"=",$1); 
                                 print $1 q $3 q
                                } 
                          /port:/{f=0;print ""}' filename

The file contains the following input:

                      Instance: Ths is a test 
                           Class: ABC
Variables:
          udpRecvBufSize:          Numeric:          8190000
                userName:           String:test1
            pingInterval:          Numeric:                2
      blockedServiceList:           String:
       acceptAllServices:          Boolean:            False
             serviceList:           String:          ABC
                hostList:           String:    159.220.108.3
                protocol:           String:             JJJJ
             portNumber:          Numeric:            20001
                    port:           String:         RTR_LLLL
                             Children:


                          Instance: The First Server in the Loop 
                              Class: Servers
Variables:
                 pendout:          Numeric:                0
               overflows:          Counter:                0
         peakBufferUsage:          Numeric:              100
        bufferPercentage:            Gauge:                1 (0,100)
      currentBufferUsage:          Numeric:                1
         pendingBytesOut:          Numeric:                0
          pendingBytesIn:          Numeric:                1
           pingsReceived:          Counter:            13597
               pingsSent:          Counter:            87350
     clientToServerPings:          Boolean:             True
     serverToClientPings:          Boolean:             True
         numInputBuffers:          Numeric:               10
        maxOutputBuffers:          Numeric:              100
 guaranteedOutputBuffers:          Numeric:              100
      lastOutageDuration:           String:       0:00:00:00
      peakDisconnectTime:           String:
     totalDisconnectTime:           String:       0:00:00:00
          disconnectTime:           String:
       disconnectChannel:          Boolean:            False
      enableDacsPermTest:          Boolean:            False
          enableFirewall:          Boolean:            False
          dacsPermDenied:          Counter:                0
              dacsDomain:           String:
      compressPercentage:            Gauge:                0 (0,100)
     uncompBytesSentRate:            Gauge:                0    (0,9223372036854775807)





                           Instance: Ths is a test 
                               Class: ABC
Variables:
          udpRecvBufSize:          Numeric:          8190000
                userName:           String:test2
            pingInterval:          Numeric:                4
      blockedServiceList:           String:
       acceptAllServices:          Boolean:            False
             serviceList:           String:          DEF
                hostList:           String:    159.220.111.2
                protocol:           String:             ffff
              portNumber:          Numeric:            20004
                    port:           String:         JJJ_LLLL
                             Children:

This is the output I am looking for, written as variable assignments:

userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"

userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"

Upvotes: 0

Views: 168

Answers (4)

Kaz

Reputation: 58666

Solution in TXR:

@(collect)
@(skip)Class: ABC
Variables:
@  (gather)
 userName: String:@user
 serviceList: String: @servicelist
 hostList: String: @hostlist
 portNumber: Numeric: @port
@  (until)
 Children:
@  (end)
@(end)
@(deffilter shell-esc
   ("\"" "\\\"") ("$" "\\$") ("`" "\\`")
   ("\\" "\\\\"))
@(output :filter shell-esc)
@  (repeat :counter i)
userName@(succ i)="@user"
serviceList@(succ i)="@servicelist"
hostList@(succ i)="@hostlist"
portNumber@(succ i)="@port"
@  (end)
@(end)

Run:

$ txr data.txr data
userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"
userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"

Note 1: Escaping is necessary if the data may contain characters which are special between quotes in the target language. The shell-esc filter is based on the assumption that the generated variable assignments are shell syntax. It can easily be replaced.

Note 2: The code assumes that each Class: ABC has all of the required variables present. It will not work right if some are missing, and there are two ways to address it by tweaking the @(gather) line:

  • failure:

    @(gather :vars (user servicelist hostlist port))
    

    Meaning: fail if any of these four variables are not gathered. The consequence is that the entire Class: ABC section with missing variables is skipped.

  • default missing:

    @(gather :vars (user (servicelist "ABC") hostlist port))
    

    Meaning: must gather the four variables user, servicelist, hostlist and port. However, if serviceList is missing, then it gets the default value "ABC" and is treated as if it had been found.

Upvotes: 0

karakfa

Reputation: 67567

$ awk -F: -v q="\"" '/Class: ABC/{f=1;c++;print ""}  \
     f && /userName|serviceList|hostList|portNumber/ \
         {gsub(/ /,"",$1);  \
          gsub(/ /,"",$3);  \
          print $1 c "=" q $3 q} \
        /Children:/{f=0}' vars

userName1="test1"
serviceList1="ABC"
hostList1="159.220.108.3"
portNumber1="20001"

userName2="test2"
serviceList2="DEF"
hostList2="159.220.111.2"
portNumber2="20004"

It increments the counter and sets a flag for each "Class: ABC" line, then formats and prints the selected entries until the pattern that ends the block (Children:) clears the flag. This limits the matching to the lines between the two patterns.
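The same flag-and-counter idea can be sketched with exact name comparisons instead of substring regexes, so that a neighbouring name such as blockedServiceList can never match by accident. This is only a sketch: the function name extract_abc and the trimmed-down sample are made up for illustration, and it assumes the values of interest contain no colon (it reads only the third colon-separated field).

```shell
#!/usr/bin/env bash
# Hypothetical sample: a shortened copy of the input from the question.
sample='   Class: ABC
Variables:
      userName:   String:test1
 blockedServiceList: String:
   serviceList:   String:   ABC
   Children:

   Class: Servers
Variables:
   userName:  String:ignored'

# extract_abc is an illustrative name, not part of the answer above.
extract_abc() {
  awk -F: -v q='"' '
    # A Class line switches the flag on only for ABC, and counts the block.
    /^[ \t]*Class:/ { f = ($2 ~ /^[ \t]*ABC[ \t]*$/); if (f) c++; next }
    # The terminal pattern switches the flag off again.
    /Children:/     { f = 0 }
    f {
      name = $1; gsub(/[ \t]/, "", name)
      # Compare the whole name, not a substring of the line.
      if (name ~ /^(userName|serviceList|hostList|portNumber)$/) {
        val = $3; gsub(/^[ \t]+|[ \t]+$/, "", val)
        print name c "=" q val q
      }
    }'
}

result=$(printf '%s\n' "$sample" | extract_abc)
printf '%s\n' "$result"
```

With the sample above this prints userName1="test1" and serviceList1="ABC"; the userName in the Servers block is skipped because the flag is only set for Class: ABC.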

Upvotes: 0

Charles Duffy

Reputation: 296049

Assuming bash 4.0 or newer, there's no need for awk here at all:

flush() {
  if (( ${#hostvars[@]} )); then
    for varname in userName serviceList hostList portNumber; do
      [[ ${hostvars[$varname]} ]] && {
        printf '%q=%q\n' "$varname" "${hostvars[$varname]}"
      }
    done
    printf '\n'
  fi
  hostvars=( )
}

class=
declare -A hostvars=( )
while read -r line; do
  [[ $line = *"Class: "* ]] && class=${line#*"Class: "} 
  [[ $class = ABC ]] || continue
  case $line in
    *:*:*)
      IFS=$': \t' read -r varName varType value <<<"$line"
      hostvars[$varName]=$value
      ;;
    *"Variables:"*)
      flush
      ;;
  esac
done
flush

Notable points:

  • The full set of defined variables is collected in the hostvars associative array (what other languages might call a "map" or "hash"), even though we're only printing the four names defined to be of interest. More interesting logic could thus be defined that combines multiple variables to decide what to output, etc.
  • The flush function is defined outside the loop so it can be used in multiple places -- both when starting a new block (as detected, here, by seeing Variables:), and when at the end-of-file.
  • The output varies from what you requested in that it includes quotes only if necessary -- but that quoting is guaranteed to be correct and sufficient for bash to parse without room for security holes even if the strings being emitted would otherwise contain security-relevant content. Think about correctly handling a case where serviceList contains $(rm -rf /*)'$(rm -rf /*)' (the duplication being present to escape single quotes); printf %q makes this easy, whereas awk has no equivalent.
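To make the quoting point concrete, here is a small sketch comparing naive double-quoting with printf %q. The value is a made-up, harmless stand-in for hostile data, and the eval calls only simulate what happens when generated assignments are later loaded into a shell; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical hostile value: harmless here, but it stands in for any
# command substitution that could appear in the data.
value='$(echo INJECTED)'

# Naive quoting, as a script that prints name="value" would produce.
naive="serviceList1=\"$value\""

# printf %q (a bash builtin) escapes every shell-significant character.
printf -v safe 'serviceList2=%q' "$value"

eval "$naive"   # the command substitution runs: serviceList1 becomes INJECTED
eval "$safe"    # the text is kept literally: serviceList2 equals $value

printf 'naive: %s\nsafe:  %s\n' "$serviceList1" "$serviceList2"
```

The naive assignment ends up holding the output of the embedded command, while the %q version round-trips the original text unchanged.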

Upvotes: 0

Tom Fenech

Reputation: 74705

If your intention is to assign to a series of variables, then rather than parsing the whole file at once, perhaps you could just extract the specific parts that you're interested in one by one. For example:

$ awk -F'\n' -v RS= -v record=1 -v var=userName 'NR == record { for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i }' file
test1
$ awk -F'\n' -v RS= -v record=1 -v var=serviceList 'NR == record { for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i }' file
ABC

The awk script could be put inside a shell function and used like this:

parse_file() {
    record=$1
    var=$2
    file=$3

    awk -F'\n' -v RS= -v record="$record" -v var="$var" 'NR == record {
        for (i=1; i<=NF; ++i) if (sub("^\\s*" var ".*:\\s*", "", $i)) print $i 
    }' "$file"
}

userName1=$(parse_file 1 userName file)
serviceList1=$(parse_file 1 serviceList file)
# etc.
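One thing to keep in mind with RS= is that the record number counts every blank-line-separated block, not only the Class: ABC ones, so in the question's input the second ABC block is record 3. A possible variant, sketched below, counts only the ABC blocks. The function name parse_abc and the shortened sample file are assumptions for illustration, and [ \t] is used in place of \s, which not every awk supports.

```shell
#!/usr/bin/env bash
# parse_abc is a hypothetical variant of parse_file: its first argument
# counts only blocks whose class is ABC.
parse_abc() {
    local want=$1 var=$2 file=$3
    awk -F'\n' -v RS= -v want="$want" -v var="$var" '
        # n is incremented only for ABC blocks; act on the wanted one.
        /Class: ABC[ \t]*(\n|$)/ && ++n == want {
            for (i = 1; i <= NF; ++i)
                if (sub("^[ \t]*" var ":[^:]*:[ \t]*", "", $i)) print $i
        }' "$file"
}

# Hypothetical sample file, shortened from the input in the question.
file=$(mktemp)
cat >"$file" <<'EOF'
   Class: ABC
Variables:
   userName:   String:test1
   portNumber:  Numeric:   20001

   Class: Servers
Variables:
   pendout:  Numeric:  0

   Class: ABC
Variables:
   userName:   String:test2
   portNumber:  Numeric:   20004
EOF

userName2=$(parse_abc 2 userName "$file")
portNumber2=$(parse_abc 2 portNumber "$file")
printf '%s %s\n' "$userName2" "$portNumber2"
rm -f "$file"
```

Here the second ABC block is reached with index 2 even though a Servers block sits between the two, and the script prints test2 20004.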

Upvotes: 2
