Reputation: 651
So my goal is to extract the fifth line of every file in my directory. I have a bunch of .gjf files there, and the fifth line of each is always "1 0" or "1 1" (without the quotes).
So far I can extract those values, but not with the filenames attached to them. This is the code I've been using:
awk 'FNR == 5' *.gjf
1 1
0 1
0 1
1 1
1 1
0 1
I'd like my output to look like this specifically:
FILENAME: 1AH7A_TRP-16-A_GLU-9-A.gjf, 1, 1,
FILENAME: 1AH7A_TRP-198-A_ASP-197-A.gjf, 1, 1,
FILENAME: 1BGFA_TRP-43-A_GLU-44-A.gjf, 0, 1,
FILENAME: CXQA_TRP-61-A_ASP-82-A.gjf, 1, 1,
I'd like the filenames to precede these values because I want to run statistics on the results as comma-separated values in R (which I'm comfortable doing). It's also very important to me that I can prove there are only two patterns in my files, namely the ordered pairs "0 1" and "1 0".
I even tried listing the files first, like this:
grep -l "" *.gjf | awk 'FNR == 5' *.gjf
since I knew that grep -l would confirm the files exist and print the list to the screen. But I think I just passed that list to awk, which then printed:
1 1
1 1
0 1
1 1
etc ...
I think it just passed the files to awk, which printed the fifth line of each. I tried using && instead of |, and that printed a complete list of the files followed by a complete list of the numbers, with nothing pairing them up. Clearly I don't know how to do this.
Upvotes: 2
Views: 1073
Reputation: 246754
With GNU awk
gawk -v OFS=", " 'FNR == 5 {print "FILENAME: " FILENAME, $1, $2; nextfile}' *.gjf
Yes, FILENAME is the awk variable containing the name of the file currently being processed.
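Since the question also asks how to show that only two patterns occur, here is a minimal sketch of the same idea that should work in any POSIX awk (it drops nextfile, so each file is read to the end, which is fine for small files). The sample file names and contents are made up purely so the snippet runs on its own; in practice you would run the two awk commands directly in the directory holding the real .gjf files.

```shell
#!/bin/sh
# Hypothetical sample data: three throwaway files whose 5th line holds the pair.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\nd\n1 1\nrest\n' > "$tmpdir/sample_one.gjf"
printf 'a\nb\nc\nd\n0 1\nrest\n' > "$tmpdir/sample_two.gjf"
printf 'a\nb\nc\nd\n1 1\nrest\n' > "$tmpdir/sample_three.gjf"

cd "$tmpdir" || exit 1

# One comma-separated row per file: filename, then the two fields of line 5.
rows=$(awk -v OFS=", " 'FNR == 5 { print FILENAME, $1, $2 }' *.gjf)
printf '%s\n' "$rows"

# Tally each distinct line-5 value across all files (count first, then value).
tally=$(awk 'FNR == 5 { seen[$0]++ } END { for (p in seen) print seen[p], p }' *.gjf | sort)
printf '%s\n' "$tally"

# Clean up the throwaway directory.
cd / && rm -rf "$tmpdir"
```

If the tally prints exactly two lines, only two distinct patterns exist in the fifth lines; the rows can be redirected to a file and read into R as comma-separated data.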
Upvotes: 4
Reputation: 42999
Use this loop:
for file in *.gjf; do
    # s/ /, / puts a comma after the first value; s/$/,/ appends the trailing comma
    echo "FILENAME: $file, $(sed 's/ /, /;s/$/,/;5q;d' "$file")"
done
sed '5q;d' extracts the 5th line: d deletes every line before it, and 5q prints line 5 and quits.
Upvotes: 1