Parse comments and values selectively from a text file using linux

Question

I want to parse a file containing names, where some blocks of names have comments on top of them. Say I have a file like:

Art
Boat
Road
Tree
Street

# Blah
Star
Car
Sun

Sock

# Comm1
# Comm2
Stop
Stick
# Comm
Stock
Dock

I want to parse this file so as to extract all names starting with 'S', together with their corresponding comments. The corresponding comments are the immediately preceding comment block (one or more comment lines), looking back no further than the nearest blank line. A comment block applies to all entries following it until a blank line or another comment block is encountered. So the output for the above input should be something like:

**Name      Comments**

Street
Star        # Blah
Sun         # Blah
Sock
Stop        # Comm1 # Comm2
Stick       # Comm1 # Comm2
Stock       # Comm

Can anyone suggest a good way to go about doing this (preferably using shell)? Would really appreciate it. Thanks!

PS: I apologize if I am not clear in my description, still new at this.

William Pursell · Accepted Answer

Assuming your blank lines contain no whitespace:

sed -n '/^#/H; /^S/{G; y/\n/ /; p}; /^$/h' input

The first command (/^#/H) appends the current line (a comment) to the hold space. The next command applies to lines starting with S: G appends the hold space (containing all the accumulated comments) to the pattern space, y replaces each newline with a space, and p prints the result. The final command (/^$/h) clears the hold space whenever a blank line is encountered, by copying the empty pattern space over it.
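For readability, the same one-liner can be spread over several lines with a comment per command. This is only a sketch of the first version above: the wrapper name s_with_comments is made up for illustration, and the file name is passed as the first argument.

```shell
# s_with_comments is a hypothetical wrapper name; $1 is the file to read.
s_with_comments() {
  sed -n '
    # Comment line: append it to the hold space (the accumulator).
    /^#/H
    # Name starting with S: append the hold space, turn each newline
    # into a space, and print the result.
    /^S/{
      G
      y/\n/ /
      p
    }
    # Blank line: copy the empty pattern space over the hold space.
    /^$/h
  ' "$1"
}
```

Call it as s_with_comments input. Since G and H each insert a newline before the text they append, expect an extra space between the name and its comments, and a trailing space on names that have no comments.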

EDIT (thanks blahdiblah)

The above does not reset the accumulator correctly when a new comment block is detected without a preceding blank line. This is ugly, but accounts for that:

sed -n '/^#/{h; bk}; :j /^S/{G; y/\n/ /; p}; /^$/h; d; :k n; /^#/{ H; bk}; bj;' input
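If the branching gets hard to follow, the same rules can also be written in awk. Note this is an alternative sketch, not the sed approach above: the function name s_names_awk and the variables comments and in_block are made up for illustration, and the 12-column padding is an assumption taken from the sample output in the question.

```shell
# s_names_awk is a hypothetical helper name; $1 is the file to read.
s_names_awk() {
  awk '
    # Blank line: forget the current comment block.
    /^$/ { comments = ""; in_block = 0; next }
    # Comment line: extend the block if the previous line was also a
    # comment, otherwise start a new block.
    /^#/ {
      if (in_block) comments = comments " " $0
      else comments = $0
      in_block = 1
      next
    }
    # Any other line ends the current run of comment lines.
    { in_block = 0 }
    # Name starting with S: print it, padded when comments apply.
    /^S/ {
      if (comments == "") print $0
      else printf "%-12s%s\n", $0, comments
    }
  ' "$1"
}
```

Call it as s_names_awk input. Like the sed version, it assumes blank lines contain no whitespace.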
