Awk to read file as a whole

Question

Consider a file with the following content:

abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq

In general, awk reads its input line by line and performs the given action on each line.

For example:

awk '{print substr($0,8,10)}' file

Output:

hijklmn
wxyzabc
klmnopq

I would like to know an approach in which the entire contents of the file are treated as a single variable, so that awk prints just one output.

Example of desired output:

hijklmnpqr

I am not after this specific output as such. In general, I would appreciate it if anyone could suggest an approach for passing the content of a file to awk as a whole.

Juan Diego Godoy Robles · Accepted Answer

This is a `gawk` solution

From the docs:

There are times when you might want to treat an entire data file as a single record. The only way to make this happen is to give RS a value that you know doesn’t occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files.

$ cat file
abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq

RS must be set to a pattern that does not occur in the file. Following Denis Shirokov's suggestion in the docs (thanks @EdMorton):

$ gawk '{print ">>>"$0"<<<<"}' RS='^$' file
>>>abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq
<<<<

The docs explain the trick:

It works by setting RS to ^$, a regular expression that will never match if the file has contents. gawk reads data from the file into tmp, attempting to match RS. The match fails after each read, but fails quickly, such that gawk fills tmp with the entire contents of the file.

So:

$ gawk '{gsub(/\n/,"");print substr($0,8,10)}' RS='^$' file

Returns:

hijklmnpqr
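
Where gawk is not available, one portable sketch of the same idea is to append every line to a single variable and do the work once in the END block. This is a different technique from the RS='^$' trick above, not the method from the docs, and the names `sample.txt` and `whole` are placeholders chosen for illustration:

```shell
# Recreate the three-line sample file from the question (placeholder file name).
printf '%s\n' abcdefghijklmn pqrstuvwxyzabc defghijklmnopq > sample.txt

# Append each line to one variable ("whole" is an illustrative name),
# then operate on the accumulated text once, in the END block.
# Newlines are dropped as the lines are joined, which plays the same
# role as the gsub() call in the gawk version.
result=$(awk '{ whole = whole $0 } END { print substr(whole, 8, 10) }' sample.txt)

echo "$result"
```

If the line breaks matter, keep them while accumulating, for example with `whole = whole $0 "\n"`.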
