mezda

Reputation: 3647

Parsing using gawk: how to retain a variable's value outside gawk

I have a file f1.txt with the following contents:

192.168.100.253:34611  69.171.228.46:80   5   2426   7    901      12      3327
192.168.100.253:34610  69.171.228.46:80   5   1068   6    626      11      1694
192.168.100.253:46808  69.171.224.24:80   4    470   5    563       9      1033

Then I run a gawk statement as follows:

gawk 'NR==1 {node1 = $1;node2 = $2}' f1.txt
echo "node" $node1

I expect node1 = 192.168.100.253:34611 and node2 = 69.171.228.46:80. I then want to run a second gawk statement with FS set to : to extract the IP address and port number, which I can use later in my script. But the values of node1 and node2 themselves are not stored. Are these like automatic variables in C? How can I parse the file so that the node1 and node2 values are retained?

Any help will be greatly appreciated. Thanks in advance.

Upvotes: 0

Views: 223

Answers (3)

Jonathan Leffler

Reputation: 754650

The values are 'kept' inside the gawk script, but since the script never prints them, nothing reaches the shell. The variables inside the gawk script are completely independent of any variables in the shell: when you run gawk, it is a separate process. You can pass shell variable values to gawk; you can't get the gawk variable values back to the shell by direct assignment.
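A minimal sketch of that separation, for illustration only. It recreates the first line of f1.txt in a temporary file, and the AWK fallback to plain awk and the names tmp and port are assumptions added here so the snippet runs on its own; they are not part of the question.

```shell
#!/usr/bin/env bash
# Recreate the first line of f1.txt in a temporary file (illustrative setup).
tmp=$(mktemp)
printf '%s\n' '192.168.100.253:34611  69.171.228.46:80   5   2426   7    901      12      3327' > "$tmp"

# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

unset node1
# node1 is assigned inside the gawk process only; the shell never sees it.
"$AWK" 'NR == 1 { node1 = $1 }' "$tmp"
echo "node1 in the shell after gawk: '${node1-}'"

# Getting a value out: print it and capture it with command substitution.
node1=$("$AWK" 'NR == 1 { print $1 }' "$tmp")
echo "captured: $node1"

# Getting a value in: -v assigns a gawk variable from a shell value.
port=$("$AWK" -v node="$node1" 'BEGIN { split(node, parts, ":"); print parts[2] }')
echo "port: $port"

rm -f "$tmp"
```

The first echo shows an empty value, which is the behaviour described in the question.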

There is a split() function in gawk that can be used to split each of node1 and node2 within the gawk script, and it places the split fields into an array indexed from 1, but what are you going to do with the values after that? You're pretty much obliged to print them:

array=($(gawk 'NR == 1 {split($1, node1, ":"); split($2, node2, ":");
                        print node1[1], node1[2], node2[1], node2[2]}' f1.txt))

Now you have a shell array:

echo ${array[*]}

From there, you can do as you wish in the shell script:

node1_ipv4=${array[0]}
node1_port=${array[1]}
node2_ipv4=${array[2]}
node2_port=${array[3]}

NB: This answer is explicitly for bash plus gawk; other shells or other variants of awk would probably require different answers.
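If named variables are all that is needed, the array can be skipped. The following is a sketch of one possible variant, not the method above: it keeps the same split() logic but hands the printed words to read. The temporary file, the AWK fallback, and the fields variable are assumptions made so the snippet is self-contained.

```shell
#!/usr/bin/env bash
# Recreate the first line of f1.txt in a temporary file (illustrative setup).
tmp=$(mktemp)
printf '%s\n' '192.168.100.253:34611  69.171.228.46:80   5   2426   7    901      12      3327' > "$tmp"

# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# Same split() logic as in the answer; exit stops reading after line 1.
fields=$("$AWK" 'NR == 1 { split($1, node1, ":"); split($2, node2, ":")
                           print node1[1], node1[2], node2[1], node2[2]; exit }' "$tmp")

# read splits the printed line on whitespace into the four named variables.
read -r node1_ipv4 node1_port node2_ipv4 node2_port <<EOF
$fields
EOF

echo "$node1_ipv4 $node1_port $node2_ipv4 $node2_port"
rm -f "$tmp"
```

Unlike the unquoted $(...) in the array assignment, read does not perform pathname expansion on the words, which makes no difference for this data but can matter for input containing * or ?.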

Upvotes: 2

chepner

Reputation: 532003

# One call to gawk to put the two desired nodes into an array
nodes=( $(gawk 'NR==1 {print $1, $2}' f1.txt) )
# nodes=( 192.168.100.253:34611 69.171.228.46:80 )

# Use % to remove the :port suffix from each array element
addresses=( ${nodes[@]%:*} )
# addresses=( 192.168.100.253 69.171.228.46 )

# Use # to remove the address: prefix from each array element
ports=( ${nodes[@]#*:} )
# ports=( 34611 80 )

# Array subscripting
node1_addr=${addresses[0]};   # 192.168.100.253
node2_port=${ports[1]};       # 80

Upvotes: 2

mpe

Reputation: 1000

Run gawk on the file twice, parse out node1 and node2:

node1=$(gawk 'NR==1 {print $1}' f1.txt)
node2=$(gawk 'NR==1 {print $2}' f1.txt)

Then pry apart IP and port:

echo "$node1" | gawk -F ':' '{printf("ip: %s port: %d\n", $1, $2)}'

Do the same for node2.
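Handling both nodes could be sketched with a loop, as below. The temporary file, the AWK fallback to plain awk, and the out variable are assumptions introduced only so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Recreate the first line of f1.txt in a temporary file (illustrative setup).
tmp=$(mktemp)
printf '%s\n' '192.168.100.253:34611  69.171.228.46:80   5   2426   7    901      12      3327' > "$tmp"

# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# Two passes over the file, one per node, as in the answer.
node1=$("$AWK" 'NR == 1 { print $1 }' "$tmp")
node2=$("$AWK" 'NR == 1 { print $2 }' "$tmp")

# Split each node on ":" and collect the formatted lines.
out=$(for node in "$node1" "$node2"; do
    printf '%s\n' "$node" | "$AWK" -F ':' '{ printf("ip: %s port: %d\n", $1, $2) }'
done)

echo "$out"
rm -f "$tmp"
```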

Upvotes: 2
