Jamin

Reputation: 11

How to assign number for a repeating pattern

I am doing some calculations using Gaussian. From the Gaussian output file, I need to extract the input structure information. The output file contains more than 800 sets of structure coordinates. So far, I have collected all the input coordinates using a combination of grep, awk and sed commands, like so:

grep -A 7 "Input orientation:" test.log | grep -A 5 "C" | awk '/C/{print "structure number"}1' | sed '/--/d' > test.out

This extracts all the input coordinates and inserts a "structure number" line before each block. So now I have a file in which the same pattern repeats regularly. The file looks like this:

structure Number

4.176801 -0.044096 2.253823

2.994556 0.097622 2.356678

5.060174 -0.115257 3.342200

structure Number

4.180919 -0.044664 2.251182

3.002927 0.098946 2.359346

5.037811 -0.103410 3.389953

Here, the line "structure number" is repeated for every block. I want to append a number to each occurrence in increasing order, like "structure number 1", "structure number 2", and so on.

How can I solve this problem?

Thanks for your help in advance.

Upvotes: 1

Views: 134

Answers (1)

mschilli

Reputation: 1894

I am not familiar at all with a program called Gaussian, so I have no clue what the original input looks like. If someone posts an example, I might be able to give an even shorter solution.

However, as far as I understand, the OP is content with the output of his/her code, except that he/she wants to append an increasing number to the lines inserted with awk.

This can be achieved with the following line (adjusting the OP's code):

grep -A 7 "Input orientation:" test.log | grep -A 5 "C" | awk '/C/{print "structure number " ++i}1' | sed '/--/d' > test.out
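To see the counter in isolation, here is a minimal sketch that numbers the marker lines of an already extracted file. The sample text is made up for illustration (it is not real Gaussian output), and the variable names `sample` and `numbered` are just placeholders:

```shell
# Hypothetical stand-in for the contents of test.out.
sample='structure number
4.176801 -0.044096 2.253823
2.994556 0.097622 2.356678
structure number
4.180919 -0.044664 2.251182'

# On each marker line, print the line followed by an incrementing counter
# (i starts out empty, so the first ++i yields 1) and skip to the next line;
# the trailing 1 prints every other line unchanged.
numbered=$(printf '%s\n' "$sample" |
  awk '/^structure number$/{print $0, ++i; next}1')

printf '%s\n' "$numbered"
```

The comma in `print $0, ++i` inserts awk's output field separator (a space by default) between the text and the number.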

Addendum:

Even without knowing the actual input, I am sure that one can at least get rid of the sed command, leaving that piece of work to awk. Also, there is no need to quote a single-character grep pattern:

grep -A 7 "Input orientation:" test.log | grep -A 5 C | awk '/C/{print "structure number " ++i}!/--/' > test.out
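The trailing `!/--/` is a pattern without an action, so awk prints every line that does not contain `--`, which is the job `sed '/--/d'` did before. A small sketch with made-up lines (the variable names are placeholders) to compare the two:

```shell
# Hypothetical input: two data lines separated by a grep-style "--" line.
lines='1.0 -2.0 3.0
--
4.0 5.0 -6.0'

via_sed=$(printf '%s\n' "$lines" | sed '/--/d')
via_awk=$(printf '%s\n' "$lines" | awk '!/--/')

# Both filters should drop only the separator line.
[ "$via_sed" = "$via_awk" ] && echo "same output"
```

Single minus signs in front of negative coordinates are not affected, since the pattern needs two consecutive dashes.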

I am not sure since I cannot test, but it should be possible to let awk do grep's work, too. As a first guess I would try the following (the counters start at 8 and 6 rather than 7 and 5 because each one also counts the matching line itself):

awk '/Input orientation:/{li=8}!li{next}{--li}/C/{print "structure number " ++i;lc=6}!lc{next}{--lc}!/--/' test.log > test.out

While this is a bit longer, it is an awk-only solution that does all the work in one process. If I had input to test with, I might come up with a shorter solution.
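The countdown idea behind the awk-only version can be checked on its own. The sketch below uses made-up input and placeholder variable names, and assumes a grep that supports `-A` (GNU and BSD grep do). To reproduce `grep -A N` with this construct, the counter has to start at N+1: one for the matching line plus N for the lines after it.

```shell
# Hypothetical input: eight lines, "line 3" plays the role of the match.
input=$(printf 'line %s\n' 1 2 3 4 5 6 7 8)

via_grep=$(printf '%s\n' "$input" | grep -A 2 'line 3')

# n=3 on the match; while n is zero, lines are skipped; otherwise the
# counter is decremented and the line is printed by the trailing 1.
via_awk=$(printf '%s\n' "$input" | awk '/line 3/{n=3} !n{next} {--n} 1')

printf '%s\n' "$via_awk"
[ "$via_grep" = "$via_awk" ] && echo "matches grep -A 2"
```

Unlike grep, awk does not emit `--` between separate groups of matches, so in an awk-only pipeline the final `!/--/` only removes input lines that themselves contain two dashes.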

Upvotes: 2
