Terry Purdon

Reputation: 61

Determining NR within BEGIN section of awk script

Before awk processes the input file, I need to know how many records to expect.

To determine this, I have the following code in the BEGIN section of my awk script:

BEGIN {

    p = ""
    j = 1

    getline             # Activates the FILENAME variable, which is normally not available in the BEGIN section of an awk script.
    n = system("wc -l " FILENAME)       # Assign the result (i.e. number of records in FILENAME) to the n variable.
    gsub(FILENAME, "|", n)      # Remove the input file name appended to the result and replace with "|" just to see what it's done!  
    print n             # See what the hell has happened.

}

I was hoping n would hold the number of records, but my output looks like this:

12 accounts12
0

"accounts12" is the name of my input file.

Upvotes: 3

Views: 745

Answers (4)

Ed Morton

Reputation: 203712

The most efficient and concise way to do this, given that your input is always a file and not a stream, is to call wc outside the script and use its output inside it:

awk -v nr="$(wc -l < file)" '{print nr, NR, $0}' file

e.g.:

$ seq 3 > file
$ awk -v nr="$(wc -l < file)" '{print nr, NR, $0}' file
3 1 1
3 2 2
3 3 3

Upvotes: 0

ctac_

Reputation: 2471

Another way with awk:

  • Set FS to \n (each line is a field)
  • Set RS to \0 (only one record)
  • Work on fields

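    awk -F'\n' -vRS='\0' '
    {
        print NF                            # each line of the file is one field
        for ( i = 1 ; i < NF ; i++ ) {      # stops before NF: the last field is empty when the file ends in a newline
            j = split ( $i , a , " " )      # split the line on whitespace
            print "nb of fields = "j
        }
    }' infile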

Upvotes: 0

karakfa

Reputation: 67497

You can also do this (file{,} is bash brace expansion for "file file", so awk reads the file twice):

$ awk 'NR==FNR{n=NR; next} FNR==1{print n} ...' file{,}

On the first pass it counts the records; on the second pass it prints the count and does the rest of the processing.
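
To make the two-pass idea concrete, here is a minimal sketch that prints each line with its position out of the total. The file name sample.txt and its contents are made up for illustration, and the file is simply named twice on the command line so the example also runs in a plain POSIX shell.

```shell
# Create a small, hypothetical input file with three records.
printf 'a\nb\nc\n' > sample.txt

# Pass 1 (NR==FNR): only count records and skip all other rules.
# Pass 2: n now holds the total, so every line can use it.
result=$(awk 'NR==FNR { n = NR; next } { print FNR "/" n ": " $0 }' sample.txt sample.txt)
printf '%s\n' "$result"
```

The cost of this approach is that the file is read twice, so it only works for regular files, not for piped input.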

Upvotes: 1

jhnc

Reputation: 16771

system() returns the exit status of the command it runs (typically 0 if it completes successfully), not the command's output. So the line:

n = system("wc -l " FILENAME)

will simply result in the output of the wc command being printed on the screen as usual, and then n being set to the exit code 0.

This explains:

12 accounts12
0

The first line is the output of wc, the second the value of n.

You could try instead:

BEGIN {
    "wc -l " ARGV[1] | getline n;
    sub(ARGV[1], "|", n);
    print n;
}

This should give you n. Unlike the bare getline in the question, it does not consume the first line of your file.
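
A variation on the same idea, offered as a sketch rather than a drop-in fix: redirecting the file into wc (wc -l < file) makes wc print only the number, so no sub() call is needed, and close() releases the pipe afterwards. The file name sample.txt is hypothetical, and the simple quoting assumes the name contains no double quotes or other shell-special characters.

```shell
# Create a small, hypothetical input file with three records.
printf 'a\nb\nc\n' > sample.txt

count=$(awk 'BEGIN {
    # Build the command once so the same string can be passed to close().
    cmd = "wc -l < \"" ARGV[1] "\""
    cmd | getline n     # read the first line of wc output into n
    close(cmd)
    print n + 0         # n + 0 forces a number and drops any padding
}' sample.txt)
printf '%s\n' "$count"
```

Because the count is taken in BEGIN from ARGV[1], the main rules still see every record of the file, starting with the first.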

Upvotes: 1
