Reputation: 83
I am running a for loop inside a bash file which will check some files (.ts) for a specific string and print the matching lines in a result file.
Here is the code:
#!/bin/bash
for file in *.ts; do
    awk -f test_function.awk "$file" > result.txt
done
And this is the test_function.awk file:
match($0, /<name>(.*)<\/name>/,n){ nm=n[1] }
match($0, /<source>(.*)<\/source>/,s){ src=s[1] }
/unfinished/{ print "name: " nm, "source: " src }
And this is one of the input files that contains "unfinished" and needs to be included in the output:
<context>
<name>AccuCapacityApp</name>
<message>
<source>Capacity</source>
<translation type="unfinished">Kapazität</translation>
</message>
<message>
<source>Charge Level</source>
<translation type="unfinished"></translation>
</message>
<message>
<source>Sel (Yes)</source>
<translation type="unfinished">Sel (Ja)</translation>
</message>
<message>
<source>Esc (No)</source>
<translation type="unfinished">Esc (Nein)</translation>
</message>
</context>
It gives output like this:
name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)
And this is one of the input files that doesn't contain "unfinished" and needs to be excluded from the output:
<context>
<name>ATM FSM state</name>
<message>
<source>Hunting</source>
<translation>Sync-Suche</translation>
</message>
<message>
<source>Pre-Sync</source>
<translation>Pre-Sync</translation>
</message>
<message>
<source>Sync</source>
<translation>Sync</translation>
</message>
</context>
What I want to do is print the name of the file being processed at the beginning of each paragraph of matching lines in the result file, ONLY when matching strings are found, like the following:
Processing file: alpha.txt
name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)
Processing file: gamma.txt
name: AccuCapacityApp source: Capacity
name: AccuCapacityApp source: Charge Level
name: AccuCapacityApp source: Sel (Yes)
How can I achieve this?
I know the file name can be appended and then the matching lines can be appended to the result file. But I want to have a blank result file each time I run the bash file, and to write the file name and content only when the matching string is found. So I think appending the file name will not work. I have tried printing the file name with echo ${file##*/}, echo $file and {print FILENAME};{print "\t" $0} but was unable to get the desired output.
Upvotes: 2
Views: 382
Reputation: 74615
Based on your update, I think this does what you want:
match($0, /<name>(.*)<\/name>/,m){ nm = m[1] }
match($0, /<source>(.*)<\/source>/,m){ src = m[1] }
/unfinished/ { list[++n] = src }
ENDFILE {
    for (i = 1; i <= n; ++i) {
        print "name:", nm, "source:", list[i]
    }
    n = 0
}
Only save elements when unfinished is found, then loop through the list at the end of each file. n keeps a count of the number of matches in the current file and is reset to 0 after each file.
Use the script like this (no need for a shell loop):
awk -f test_function.awk *.ts > result.txt
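The script above does not print the "Processing file:" header that the question asks for. The following is a minimal sketch of one way to add it, assuming GNU awk is installed as gawk; the function name report_with_header is hypothetical and introduced only for this example. Because the matches are buffered until ENDFILE, the header can be printed only when the count n is non-zero.

```shell
#!/bin/sh
# Hypothetical helper name; requires GNU awk (gawk) for ENDFILE and
# for the three-argument form of match().
report_with_header() {
    gawk '
        match($0, /<name>(.*)<\/name>/, m)     { nm = m[1] }
        match($0, /<source>(.*)<\/source>/, m) { src = m[1] }
        /unfinished/ { list[++n] = src }

        ENDFILE {
            # Print the header only if this file produced matches.
            if (n > 0) {
                print "Processing file: " FILENAME
                for (i = 1; i <= n; ++i)
                    print "name:", nm, "source:", list[i]
            }
            # Reset the per-file state before the next file starts.
            n = 0
            nm = ""
            src = ""
        }
    ' "$@"
}
```

It could be called as report_with_header *.ts > result.txt. The > redirection truncates result.txt on every run, so the file starts out empty and only files with matches contribute a header.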
Note that ENDFILE is a GNU awk extension, but then so is the third argument to match that you were already using, so I guess that's OK for you.
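If GNU awk is not available, a similar result can be approximated with POSIX awk features only. This is a sketch under assumptions, not a drop-in replacement: the function name report_unfinished is hypothetical, sub() stands in for the three-argument match, and FNR == 1 stands in for ENDFILE by detecting the start of each new file. The header is printed lazily, just before the first match in a file.

```shell
#!/bin/sh
# Hypothetical helper name; uses only POSIX awk features.
report_unfinished() {
    awk '
        # Reset the per-file state whenever a new input file starts.
        FNR == 1 { printed = 0; nm = ""; src = "" }

        # Remember the most recent <name> and <source> values.
        /<name>.*<\/name>/ {
            nm = $0
            sub(/.*<name>/, "", nm)
            sub(/<\/name>.*/, "", nm)
        }
        /<source>.*<\/source>/ {
            src = $0
            sub(/.*<source>/, "", src)
            sub(/<\/source>.*/, "", src)
        }

        # On a match, emit the header once per file, then the line itself.
        /unfinished/ {
            if (!printed) {
                print "Processing file: " FILENAME
                printed = 1
            }
            print "name: " nm, "source: " src
        }
    ' "$@"
}
```

Usage would look like report_unfinished *.ts > result.txt. Files without any "unfinished" line never reach the header branch, so they leave no trace in the output.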
Upvotes: 1