Reputation: 690
I have an output file containing thousands of lines of information. Every so often, the output file contains information of the following form:
Input orientation:
...
content
...
Distance matrix (angstroms):
I now want to print the content and save it to filename. However, the above occurs at several places in the output file, and I only want the last entry in the output file. Here's what I've tried so far:
tac output | sed -n -e '/Distance matrix/,/Input orientation/p' > filename
However, this prints all instances of the matched pattern to filename.
Then I read that with GNU sed, of which I have version 4.2.1 installed, the following should work:
tac output | sed -n -e '0,/Distance matrix/,/Input orientation/p' > filename
But this gives me an error:
sed: -e expression #1, char 20: unknown command: `,'
Then I tried to ask sed to quit after matching the pattern Input orientation:
tac output | sed -n -e '/Distance matrix/,/Input orientation/{p;q}' > filename
But now it ends up printing only Distance matrix (angstroms): to filename.
I'm sure it is possible; I'm just not able to figure it out! I have no experience with awk, so I would prefer answers using sed.
Sample output file for testing:
Item Value Threshold Converged?
Maximum Force 0.005032 0.000450 NO
RMS Force 0.001066 0.000300 NO
Maximum Displacement 0.027438 0.001800 NO
RMS Displacement 0.007282 0.001200 NO
Predicted change in Energy=-8.909077D-05
GradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGrad
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Incorrect Incorrect Incorrect
2 1 0 Incorrect Incorrect Incorrect
3 1 0 Incorrect Incorrect Incorrect
4 1 0 Incorrect Incorrect Incorrect
5 17 0 Incorrect Incorrect Incorrect
6 9 0 Incorrect Incorrect Incorrect
---------------------------------------------------------------------
Distance matrix (angstroms):
1 2 3 4 5
1 C 0.000000
2 H 1.080163 0.000000
3 H 1.080326 1.809416 0.000000
4 H 1.080621 1.810236 1.810685 0.000000
5 Cl 1.962171 2.470702 2.468769 2.465270 0.000000
6 F 2.390537 2.343910 2.357275 2.380515 4.352568
6
6 F 0.000000
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Correct Correct Correct
2 1 0 Correct Correct Correct
3 1 0 Correct Correct Correct
4 1 0 Correct Correct Correct
5 17 0 Correct Correct Correct
6 9 0 Correct Correct Correct
---------------------------------------------------------------------
Distance matrix (angstroms):
1 2 3 4 5
1 C 0.000000
2 H 1.080516 0.000000
3 H 1.080587 1.801890 0.000000
4 H 1.080473 1.801427 1.801478 0.000000
5 Cl 1.936014 2.458132 2.459437 2.460630 0.000000
6 F 2.414588 2.368281 2.365651 2.355690 4.350586
Upvotes: 2
Views: 173
Reputation: 58351
This might work for you (GNU sed):
sed '/Input orientation/h;//!H;$!d;x;s/^\(Input orientation.*Distance matrix[^\n]*\).*/\1/p;d' file
At each occurrence of Input orientation, overwrite the hold space (HS) with the current line; append every following line to the HS and delete it from the pattern space. At the end of the file, swap to the HS, remove the lines following the Distance matrix line, and print.
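The same hold-space idea can be written out one command per line with comments. This is only a sketch, assuming GNU sed; the file names sample.txt and last_block.txt and the shortened sample content are made up for illustration.

```shell
# Hypothetical miniature of the output file: two blocks, the second is wanted.
printf '%s\n' \
  'noise' \
  'Input orientation:' \
  'old 1' \
  'Distance matrix (angstroms):' \
  'old matrix row' \
  'Input orientation:' \
  'new 1' \
  'new 2' \
  'Distance matrix (angstroms):' \
  'new matrix row' > sample.txt

sed '
  # Start line: overwrite the hold space with it.
  /Input orientation/h
  # Any other line (empty regex reuses the last one): append to the hold space.
  //!H
  # Not the last line yet: discard it and read the next line.
  $!d
  # Last line: bring the collected block into the pattern space.
  x
  # Keep everything up to the end of the Distance matrix line and print it.
  s/^\(Input orientation.*Distance matrix[^\n]*\).*/\1/p
  # Suppress the automatic print of the pattern space.
  d
' sample.txt > last_block.txt

cat last_block.txt
```

Because every new Input orientation line overwrites the hold space, only the lines from the last block onward are still held when the end of the file is reached.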
An alternative along the same lines, but perhaps less memory intensive:
sed '/Input orientation/h;//!{x;/./G;x};$!d;x;s/\(Distance matrix[^\n]*\).*/\1/p;d' file
Upvotes: 0
Reputation: 67467
An alternative with awk, without tac:
$ awk '/Input orientation/ {f=1}
f {a=a sep $0; sep=ORS}
/Distance matrix/ {f=0; b=a; a=sep=""}
END {print b}' file
This transfers the cache to b and resets it after each end tag, then prints the last saved block at the end.
Upvotes: 0
Reputation: 2471
Another solution with sed, without tac:
sed ':B;$x;/Input/!d;x;s/.*//;;x;:A;/Distance/!{N;bA};h;N;s/.*\n//;bB' infile
Keep the current block in the hold space and replace it when a new one is found.
Upvotes: 0
Reputation: 23667
That is because sed quits as soon as it reaches q, which happens on the first line of the range. You need to qualify q with the end pattern:
$ tac ip.txt | sed -n '/Distance matrix/,/Input orientation/{p;/Input orientation/q}' | tac
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Correct Correct Correct
2 1 0 Correct Correct Correct
3 1 0 Correct Correct Correct
4 1 0 Correct Correct Correct
5 17 0 Correct Correct Correct
6 9 0 Correct Correct Correct
---------------------------------------------------------------------
Distance matrix (angstroms):
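To see the difference between the unqualified and the qualified q, both variants can be run on a tiny sample. This is a sketch, assuming GNU sed and tac; demo.txt, early.txt and full.txt are made-up file names, and the sample lines are shortened stand-ins for the real output.

```shell
# Hypothetical miniature input with two blocks.
printf '%s\n' \
  'Input orientation:' 'old' 'Distance matrix (angstroms):' 'row' \
  'Input orientation:' 'new' 'Distance matrix (angstroms):' 'row' > demo.txt

# Unqualified q: sed prints the first line of the range and quits immediately.
tac demo.txt | sed -n '/Distance matrix/,/Input orientation/{p;q}' > early.txt

# Qualified q: sed quits only on the line that closes the range;
# the final tac restores the original line order.
tac demo.txt \
  | sed -n '/Distance matrix/,/Input orientation/{p;/Input orientation/q}' \
  | tac > full.txt

echo '--- early.txt'; cat early.txt
echo '--- full.txt';  cat full.txt
```

The first variant leaves only the Distance matrix line in early.txt, matching the behaviour described in the question, while full.txt holds the last block in its original order.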
With awk:
tac ip.txt | awk '/Distance matrix/{f=1} f; /Input orientation/{exit}' | tac
See also: How to select lines between two patterns?
Upvotes: 1