Carlos Pinho

Reputation: 15

How to extract a string from multiple files into one output file, with each file name as a header, in UNIX

I have a directory with multiple files with the extension .failed. These files have the following format:

file1.failed:

FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
THEAD|150001022|20170109012801||
TDETL|4000804525||1|EA|||RETURN|||N
TTAIL|1
FTAIL|6

I need to extract all the text between THEAD| and |2 to an output file. I'm trying the following, and it works only if there is a single file in the directory.

sed -n 's:.*THEAD|\(.*\)|2.*:\1:p' <*.failed >transactions.log

The output is:

transactions.log:

150001021
150001022

Now how can I do the same for multiple files? Also, is it possible to add the file name to the output file?

expected output:

file1.failed
150001021
150001022
file2.failed
150001023
150001024
150001025

Upvotes: 1

Views: 74

Answers (2)

potong

Reputation: 58473

This might work for you (GNU sed):

sed -sn '1F;s/^THEAD|\([^|]*\)|.*/\1/p' file1 file2 file3 ...

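The -n option suppresses automatic printing (giving sed a grep-like behavior), and -s treats each file separately, so that line addresses such as 1 restart for every file. The F command prints the current file name, here on the first line of each file only. The substitution then captures the value between THEAD| and the next |, and the p flag prints it.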
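The F command and the -s option are GNU extensions, so where GNU sed is not available the same idea can be sketched with a shell loop and plain sed. This is a minimal sketch, not a definitive solution: the scratch directory and the contents of file2.failed are made up for illustration, and transactions.log is the output name taken from the question.

```shell
# Scratch directory with sample data (file2.failed is hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file1.failed <<'EOF'
FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
THEAD|150001022|20170109012801||
TDETL|4000804525||1|EA|||RETURN|||N
TTAIL|1
FTAIL|6
EOF

cat > file2.failed <<'EOF'
FHEAD|4526|20170110000000|20170125024831
THEAD|150001023|20170110121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
FTAIL|3
EOF

# Print each file name, then the field that follows THEAD| in that file.
# The redirection sits on the loop, so all files go into one log.
for f in *.failed; do
  printf '%s\n' "$f"
  sed -n 's/^THEAD|\([^|]*\)|.*/\1/p' "$f"
done > transactions.log

cat transactions.log
# file1.failed
# 150001021
# 150001022
# file2.failed
# 150001023
```

Because the log does not end in .failed, it is not picked up by the glob on a later run.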

Upvotes: 0

James Brown

Reputation: 37424

In awk:

$ awk -F\| 'FNR==1{print FILENAME} $1=="THEAD"{print $2}' foo foo
foo
150001021
150001022
foo
150001021
150001022

On the first record of each file (FNR==1) it prints the file name; after that it prints the second field of every record whose first field is THEAD. Replace foo with the required files, for example *.failed.
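Applied to the question's setup, the same awk program could be run over the *.failed glob and redirected to the log file. The following is a minimal sketch under assumptions: the scratch directory and the sample file contents are made up for illustration, and transactions.log is the output name from the question.

```shell
# Scratch directory with two small sample files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file1.failed <<'EOF'
FHEAD|4525|20170109000000|20170125024831
THEAD|150001021|20170109121206||
TDETL|4000785067||1|EA|||RETURN|||N
TTAIL|1
FTAIL|3
EOF

cat > file2.failed <<'EOF'
FHEAD|4526|20170110000000|20170125024831
THEAD|150001023|20170110121206||
TTAIL|1
THEAD|150001024|20170110121306||
TTAIL|1
FTAIL|4
EOF

# FNR resets to 1 at the start of every input file, so the file name is
# printed once per file; THEAD records then contribute their second field.
awk -F'|' 'FNR==1{print FILENAME} $1=="THEAD"{print $2}' *.failed > transactions.log

cat transactions.log
# file1.failed
# 150001021
# file2.failed
# 150001023
# 150001024
```

Comparing $1 with "THEAD" is an exact match on the first field, so lines such as TDETL or TTAIL are skipped without needing a regular expression.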

Upvotes: 1
