Reputation: 190
Hi All can you please help me I am trying to get the following output so basically I have 2 input files and we need only the common :names from both the input files along with there the lines below them the .name/of/file lines.
Till now I have tried:
awk 'FNR==NR { a[$1]; next }NF<=1 { flag=0 } $1 in a { flag=1 }flag' file1 file2
Output:
:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name1
./name/of/file [logfile] [ error in file coming since Day : 40]
./name/of/file [logfile] [ error in file coming since Day : 40 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
But its printing all the occrance of :name1 and :name4 we only need the first occurrence of :name1 and :name4 from input file 1
Input file1:
:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name2
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name1
./name/of/file [logfile] [ error in file coming since Day : 40]
./name/of/file [logfile] [ error in file coming since Day : 40 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
:name5
./name/of/file [logfile] [ error in file coming since Day : 6 ]
./name/of/file [logfile] [ error in file coming since Day : 6 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
./name/of/file [logfile] [ error in file coming since Day : 10 ]
Input file2:
:name1
:name3
:name4
Required Outputfile:
:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
Upvotes: 1
Views: 227
Reputation: 34054
Since an 'occurrence' is based on the name existing as an index in the a[]
array, removal of that entry from the array should disable any further 'occurrences'.
We can accomplish this removal by adding a delete
to the current awk
code:
awk '
FNR==NR { a[$1]; next }
NF<=1 { flag=0 } # clear flag if zero or one non-white-space fields is present in the current line
$1 in a { flag=1; delete a[$1] } # set flag if 1st field is an index in the a[] array; delete entry from array to insure no more occurrences of this name ($1)
flag # if flag == 1 then print current line to stdout
' file2 file1
This generates:
:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
If the logfile lines need to be indented (eg, 4 spaces as in the question), one idea:
awk '
FNR==NR { a[$1]; next }
NF<=1 { flag=0 }
$1 in a { print; flag=1; delete a[$1]; next }
flag { printf " %s\n",$0 }
' file2 file1
This generates:
:name1
./name/of/file [logfile] [ error in file coming since Day : 1 ]
./name/of/file [logfile] [ error in file coming since Day : 1 ]
:name3
./name/of/file [logfile] [ error in file coming since Day : 24 ]
./name/of/file [logfile] [ error in file coming since Day : 24 ]
:name4
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
./name/of/file [logfile] [ error in file coming since Day : 3 ]
Upvotes: 1