Reputation: 129
I want to parse a file of names in which some blocks of names are preceded by comments. Suppose I have a file like:
Art
Boat
Road
Tree
Street
# Blah
Star
Car
Sun
Sock
# Comm1
# Comm2
Stop
Stick
# Comm
Stock
Dock
I want to extract all names starting with 'S' together with their corresponding comments. A name's corresponding comments are the comment block (one or more comment lines) immediately preceding it, going back until a blank line is encountered. A comment block applies to all entries that follow it, until a blank line or another comment block is encountered. So the output for the above input should be something like:
**Name Comments**
Street
Star # Blah
Sun # Blah
Sock
Stop # Comm1 # Comm2
Stick # Comm1 # Comm2
Stock # Comm
Can anyone suggest a good way to go about doing this (preferably using shell)? Would really appreciate it. Thanks!
PS: I apologize if my description is unclear; I am still new at this.
Upvotes: 0
Views: 445
Reputation: 212248
Assuming your blank lines contain no whitespace:
sed -n '/^#/H; /^S/{G; y/\n/ /; p}; /^$/h' input
The first command (`/^#/H`) appends the current line (a comment) to the hold space.
The next command (`/^S/{G; y/\n/ /; p}`) runs on each line starting with S: it appends the hold space (containing all the accumulated comments) to the pattern space, replaces every newline with a space, and then prints the line. The final command (`/^$/h`) clears the hold space whenever a blank line is encountered, by overwriting it with the empty pattern space.
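Spelled out one command per line, the same script may be easier to follow. This is only a sketch: the wrapper function name `names_with_comments` is made up for illustration, and it keeps the assumption that blank lines contain no whitespace.

```shell
# Hypothetical wrapper; the sed commands are the ones from the one-liner above.
names_with_comments() {
    sed -n '
        # Comment line: append it to the hold space.
        /^#/H
        # Name starting with S: append the held comments, join with spaces, print.
        /^S/{
            G
            y/\n/ /
            p
        }
        # Blank line: overwrite the hold space with the empty pattern space.
        /^$/h
    ' "$@"
}
```

Call it as `names_with_comments input`, or pipe the file into it. Because `H` and `G` each add a newline, the name and its first comment end up separated by two spaces, and names without comments get a trailing space.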
EDIT (thanks blahdiblah)
The above does not reset the accumulator correctly when a new comment block is detected without a preceding blank line. This is ugly, but accounts for that:
sed -n '/^#/{h; bk}; :j /^S/{G; y/\n/ /; p}; /^$/h; d; :k n; /^#/{ H; bk}; bj;' input
Upvotes: 1
Reputation: 33991
Here's some slightly inelegant awk that does the job:
awk '/^$/ {ca=""; cp=""} /^#/ {ca=ca " " $0} /^S/ && ca {cp=ca; ca=""} /^S/ {print $0 " " cp}' < input.txt > output.txt
There are two stores: the comment accumulator, `ca`, and the comment print buffer, `cp`.
There's probably a more elegant way to do this, and this doubtless has problems (e.g., putting a space at the end of lines with no comments), but it'll get you started.
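For what it's worth, here is a sketch of a variant that avoids the trailing space and treats a comment that follows a name as the start of a new block. The function name `s_names_with_comments` and the variables `comments` and `in_block` are my own, not anything from the question, and it treats lines containing only spaces or tabs as blank.

```shell
# Hypothetical helper: print each name starting with "S", followed by its comment block.
s_names_with_comments() {
    awk '
        # Blank line: forget any comments seen so far.
        /^[ \t]*$/ { comments = ""; in_block = 0; next }
        /^#/ {
            # First comment after a name or blank line starts a fresh block.
            if (!in_block) comments = ""
            comments = comments (comments == "" ? "" : " ") $0
            in_block = 1
            next
        }
        # Any name line ends the comment block currently being collected.
        { in_block = 0 }
        # Print matching names, adding the comments only when there are some.
        /^S/ { print (comments == "" ? $0 : $0 " " comments) }
    ' "$@"
}
```

Usage would be `s_names_with_comments input.txt > output.txt`. A single `comments` string is enough here because the block is reset explicitly at blank lines and at the start of each new comment block.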
Upvotes: 1