Reputation: 15
I have a directory with multiple files with the extension .failed. These files have the following format:
file1.failed:
FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
THEAD|150001022|20170109012801||
TDETL|4000804525||1|EA|||RETURN|||N
TTAIL|1
FTAIL|6
I need to extract all the text between THEAD| and |2 to an output file. I'm trying the following, and it works only if I have a single file in the directory.
sed -n 's:.*THEAD|\(.*\)|2.*:\1:p' <*.failed >transactions.log
The output is:
transactions.log:
150001021
150001022
Now, how can I do the same for multiple files? Also, is it possible to add the filename to the output file?
Expected output:
file1.failed
150001021
150001022
file2.failed
150001023
150001024
150001025
Upvotes: 1
Views: 74
Reputation: 58473
This might work for you (GNU sed):
sed -sn '1F;s/^THEAD|\([^|]*\)|.*/\1/p' file1 file2 file3 ...
Use the options -n and -s to invoke the grep-like nature and treat each file's addresses separately. The F command displays the current file name on the first line of each file only. The substitution then prints the value between the required strings.
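If GNU sed is not available (the -s option and the F command are GNU extensions), roughly the same per-file behaviour can be sketched with a plain shell loop. The scratch directory and the contents of file2.failed below are made up purely for illustration; only the record layout follows the question.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file1.failed <<'EOF'
FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
THEAD|150001022|20170109012801||
TTAIL|1
FTAIL|6
EOF

cat > file2.failed <<'EOF'
FHEAD|4526|20170109000000|20170125024831
THEAD|150001023|20170109121206||
TTAIL|1
FTAIL|3
EOF

# For each file: print its name, then the second field of every THEAD line.
# Running sed once per file stands in for what -s and 1F do in GNU sed.
for f in *.failed; do
    printf '%s\n' "$f"
    sed -n 's/^THEAD|\([^|]*\)|.*/\1/p' "$f"
done > transactions.log

cat transactions.log
```

The loop starts one sed process per file, which is fine for a handful of files but slower than a single invocation over a large directory.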
Upvotes: 0
Reputation: 37424
In awk:
$ awk -F\| 'FNR==1{print FILENAME} $1=="THEAD"{print $2}' foo foo
foo
150001021
150001022
foo
150001021
150001022
On the first record of each file it prints the filename, and after that it prints the second field of records that start with THEAD. Replace foo with all required files.
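Applied to the question's setup, this would probably look like the sketch below, with the *.failed glob in place of foo and the output redirected to transactions.log. The scratch directory and the contents of file2.failed are invented for the demo.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not from the original post).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file1.failed <<'EOF'
FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
THEAD|150001022|20170109012801||
TTAIL|1
FTAIL|6
EOF

cat > file2.failed <<'EOF'
FHEAD|4526|20170109000000|20170125024831
THEAD|150001023|20170109121206||
TTAIL|1
THEAD|150001024|20170109121207||
TTAIL|1
FTAIL|5
EOF

# FNR resets to 1 at the start of each input file, so the filename is
# printed once per file; THEAD records then contribute their second field.
awk -F'|' 'FNR==1{print FILENAME} $1=="THEAD"{print $2}' *.failed > transactions.log

cat transactions.log
```

Note that an empty file has no first record, so its name would not appear in the output.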
Upvotes: 1