John

Reputation: 83

Print processing file name in FOR loop inside the result file - BASH

I am running a for loop in a bash script that checks some .ts files for a specific string and prints the matching lines to a result file.

Here is the code:

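#!/bin/bash

# Redirect the whole loop once so result.txt starts empty on each run
# and collects the output for every file, not just the last one.
for file in *.ts; do
    awk -f test_function.awk "$file"
done > result.txt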

And this is the test_function.awk file:

match($0, /<name>(.*)<\/name>/,n){ nm=n[1] }
match($0, /<source>(.*)<\/source>/,s){ src=s[1] }
/unfinished/{ print "name: " nm, "source: " src }

And this is one of the input files that contains "unfinished" and needs to be included in the output:

<context>
    <name>AccuCapacityApp</name>
    <message>
        <source>Capacity</source>
        <translation type="unfinished">Kapazität</translation>
    </message>
    <message>
        <source>Charge Level</source>
        <translation type="unfinished"></translation>
    </message>
    <message>
        <source>Sel (Yes)</source>
        <translation type="unfinished">Sel (Ja)</translation>
    </message>
    <message>
        <source>Esc (No)</source>
        <translation type="unfinished">Esc (Nein)</translation>
    </message>
</context>

It gives output like this:

name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)

And this is one of the input files that doesn't contain "unfinished" and needs to be excluded from the output:

<context>
    <name>ATM FSM state</name>
    <message>
        <source>Hunting</source>
        <translation>Sync-Suche</translation>
    </message>
    <message>
        <source>Pre-Sync</source>
        <translation>Pre-Sync</translation>
    </message>
    <message>
        <source>Sync</source>
        <translation>Sync</translation>
    </message>
</context>

What I want is to print the name of the file being processed at the beginning of each block of matching lines in the result file, ONLY when matching strings are found, like the following:

Processing file: alpha.txt
name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)

Processing file: gamma.txt
name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)

How can I achieve this?

I know I could append the file name and then the matching lines to the result file. However, I want the result file to start empty each time I run the script, and to contain a file name and its matching lines only when the matching string is found, so I don't think simply appending the file name will work. I have tried printing the file name with echo ${file##*/}, echo $file and {print FILENAME};{print "\t" $0}, but none of these produced the desired output.

Upvotes: 2

Views: 382

Answers (1)

Tom Fenech

Reputation: 74615

Based on your update, I think this does what you want:

match($0, /<name>(.*)<\/name>/,m){ nm = m[1] }
match($0, /<source>(.*)<\/source>/,m){ src = m[1] }
/unfinished/ { list[++n] = src }
ENDFILE {
    for (i = 1; i <= n; ++i) {
        print "name:", nm, "source:", list[i]
    }
    n = 0
}

Only save elements when unfinished is found, then loop through the list at the end of each file. n keeps a count of the number of matches in the current file.

Use the script like this (no need for a shell loop):

awk -f test_function.awk *.ts > result.txt
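
If the result file should also carry the "Processing file:" header from the question, one possible variation is sketched below. It is not part of the script above: the function name header_report is hypothetical, the header is printed only for files with at least one match, and each buffered entry stores its own name so that a file with several contexts keeps the right pairing. It assumes GNU awk is installed as gawk.

```shell
#!/bin/bash
# Sketch only: requires GNU awk (gawk) for ENDFILE and the 3-argument match().
# header_report is a hypothetical helper name; pass the .ts files as arguments.
header_report() {
    gawk '
        match($0, /<name>(.*)<\/name>/, m)     { nm  = m[1] }
        match($0, /<source>(.*)<\/source>/, m) { src = m[1] }
        # Buffer the finished output line so each entry keeps its own name.
        /unfinished/ { list[++n] = "name: " nm " source: " src }
        ENDFILE {
            if (n > 0) {
                # Blank line between blocks, but not before the first one.
                if (blocks++) print ""
                print "Processing file: " FILENAME
                for (i = 1; i <= n; ++i) print list[i]
            }
            n = 0   # reset the match counter for the next file
        }
    ' "$@"
}
```

It would be called as header_report *.ts > result.txt. Because the redirection happens once, the result file starts empty on every run, and files without any unfinished entry leave no trace in it.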

Note that ENDFILE is a GNU awk extension, but then so is the third argument to match that you were already using, so I guess that's OK for you.
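
For an awk without those extensions, a rough portable sketch is possible, under a few assumptions: each tag opens and closes on a single line, as in the sample input, and the file boundary is detected with FNR == 1 instead of ENDFILE. The names portable_report and emit_block are made up for this illustration, and the "Processing file:" header follows the format asked for in the question.

```shell
#!/bin/sh
# Sketch only: avoids ENDFILE and the 3-argument match().
# portable_report and emit_block are hypothetical names.
portable_report() {
    awk '
        # Print the buffered block for the file that has just ended.
        function emit_block(   i) {
            if (n > 0) {
                if (blocks++) print ""
                print "Processing file: " current
                for (i = 1; i <= n; ++i) print list[i]
            }
            n = 0
        }
        # The first line of a new file closes the previous file.
        FNR == 1 { emit_block(); current = FILENAME }
        # 2-argument match() sets RSTART and RLENGTH; strip the tags by length:
        # "<name>" is 6 characters and "</name>" is 7, 13 in total.
        match($0, /<name>.*<\/name>/)     { nm  = substr($0, RSTART + 6, RLENGTH - 13) }
        # "<source>" is 8 characters and "</source>" is 9, 17 in total.
        match($0, /<source>.*<\/source>/) { src = substr($0, RSTART + 8, RLENGTH - 17) }
        /unfinished/ { list[++n] = "name: " nm " source: " src }
        # The last file has no following FNR == 1, so flush it here.
        END { emit_block() }
    ' "$@"
}
```

Usage would be portable_report *.ts > result.txt. The trade-off is that the tag lengths are hard-coded, so the extraction has to be adjusted by hand if other tags are needed.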

Upvotes: 1
