pelorus32

Reputation: 23

AWK print based on FILENAME pattern

I have a directory of files with filenames of the form file000.txt to filennn.txt. I would like to be able to specify a range of file names and print the content of those files based on a match. I have achieved it with a single file pattern:

$ gawk 'FILENAME ~/file038.txt/ {print FILENAME, $0}' file*.txt
file038.txt Some 038 text here

But I cannot get a pattern that would allow me to specify a range of file names, for instance

gawk 'FILENAME ~/file[038-040].txt/ {print FILENAME, $0}' file*.txt

I'm sure I'm missing something simple here; I'm an AWK newbie. Any suggestions?

Upvotes: 2

Views: 3722

Answers (4)

user3442743

Reputation:

This should work:

awk '(x=FILENAME)~/(3[8-9]|40)\.txt$/{print x,$0;nextfile}' file*.txt

As nextfile doesn't work (at least with my version of awk), here is another way:

awk 'FNR==((x=FILENAME)~/(3[8-9]|40)\.txt$/){print x,$0}' file*.txt

Upvotes: 0

John1024

Reputation: 113844

Solution using gawk and a recent version of bash

bash's brace expansion can generate the range that file[038-040].txt was meant to express, which makes the code quite simple:

gawk 'FNR==1 {print FILENAME, $0} {nextfile}' file{038..040}.txt

Key points:

  • FNR==1 {print FILENAME, $0}

    This prints the filename and the first line of each file.

  • {nextfile}

    This saves time by skipping directly to the next file.

  • file{038..040}.txt

    The construct {038..040} is a bash feature called brace expansion. bash will replace this with the file names that you want. If you want to test out brace expansion to see how it works, try it on the command line with this simple statement:

    echo file{038..040}.txt
    

UPDATE 1: Mac OSX currently uses bash v3.2 which does not support leading zeros in brace expansion.
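
On a bash without zero-padded brace expansion, the same list of names can be built another way. The following is only a sketch: it assumes seq is installed (it is not required by POSIX, though macOS and GNU coreutils both ship it), and files is just an illustrative variable name.

```shell
# Build the names file038.txt .. file040.txt without brace expansion.
# printf reuses its format string for each number that seq prints.
files=$(printf 'file%03d.txt\n' $(seq 38 40))

# Left unquoted on purpose so the shell splits the list into words.
echo $files
```

The unquoted $files can then stand in for file{038..040}.txt on the gawk command line, as long as the file names contain no whitespace.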

UPDATE 2: If there are missing files and you have a modern gawk (v4.0 or better), use this instead:

gawk 'BEGINFILE{ if (ERRNO) nextfile} FNR==1 {print FILENAME, $0} {nextfile}' file{038..040}.txt

Solution using gawk with a plain POSIX shell

gawk '{n=0+substr(FILENAME,5,3)} FNR==1 && n>=38 && n<=40 {print FILENAME, $0} {nextfile}' file*.txt

Explanation:

  • n=0+substr(FILENAME,5,3)

    Extract the number from the filename. 0+ is a trick to force awk to treat n as numeric.

  • n>=38 && n<=40 {print FILENAME, $0}

    This selects the file based on its number and prints the filename and first line.

  • {nextfile}

    As before, this saves time by stopping awk from reading the rest of each file.

  • file*.txt

    This can be expanded by any POSIX shell to the list of file names.
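
If the bounds should not be hard-coded, they could be passed in with -v. This is a rough sketch rather than part of the answer above: lo and hi are hypothetical variable names, the sample files exist only for the demonstration, and it prints every line of each selected file instead of just the first. It avoids nextfile so that it also runs on awk versions that lack it.

```shell
# Scratch directory with made-up sample files file037.txt .. file041.txt.
dir=$(mktemp -d)
for i in 037 038 039 040 041; do
    echo "Some $i text here" > "$dir/file$i.txt"
done

# lo and hi are hypothetical names for the numeric bounds.
# The first rule recomputes n once per file; the second rule prints
# every line of the files whose number lies within [lo, hi].
out=$(cd "$dir" && awk -v lo=38 -v hi=40 '
    FNR == 1 { n = 0 + substr(FILENAME, 5, 3) }
    n >= lo && n <= hi { print FILENAME, $0 }
' file*.txt)
echo "$out"

rm -rf "$dir"
```

Like the one-liner, this relies on the number occupying characters 5 to 7 of the file name.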

Upvotes: 0

SMA

Reputation: 37023

An odd way, but something along these lines:

awk '{ if (match(FILENAME,/file0(3[89]|40)\.txt/)) { print FILENAME, $0}}' file*.txt

Upvotes: 0

Kent

Reputation: 195059

You can do some substitution on the filename, for example:

awk '{x=FILENAME;gsub(/[^0-9]/,"",x);x+=0}x>10&&x<50{your logic}' file*.txt

This way, files file011.txt through file049.txt are handled by "your logic".

You can adjust the x>10&&x<50 part, for example to handle only files whose number is odd or even; just write a boolean expression there.
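
For instance, the odd-number case might look like the sketch below. It assumes the only digits in each file name are the file number, and the sample files are invented for the demonstration.

```shell
# Scratch directory with made-up sample files file011.txt .. file014.txt.
dir=$(mktemp -d)
for i in 011 012 013 014; do
    echo "line from $i" > "$dir/file$i.txt"
done

# Strip every non-digit from the name, force the result to a number,
# then keep only the files whose number is odd.
out=$(cd "$dir" && awk '
    FNR == 1 { x = FILENAME; gsub(/[^0-9]/, "", x); x += 0 }
    x % 2 == 1 { print FILENAME, $0 }
' file*.txt)
echo "$out"

rm -rf "$dir"
```

Computing x only when FNR == 1 avoids repeating the substitution on every line of a file.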

Upvotes: 2
