Reputation: 3
I have a .txt file with two types of paragraphs:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Then a single line paragraph that is followed by two returns

Along with some more double line text return
some more text.
I want to remove all single-line paragraphs from the text file, so that the result is:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Along with some more double line text return
some more text
I have been attempting to do this with sed and awk, but I keep running into problems coming up with a regex that will look for a newline followed by some characters and ending in two consecutive newlines \n\n.
Is there any way to do this with a one-liner, or am I going to have to write a script that reads the file line by line, determines the length of each paragraph, and strips it out that way?
Thanks.
Upvotes: 0
Views: 707
Reputation: 161834
awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' input.txt
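A quick way to check the command is to run it against a small sample. This is only a sketch: the sample text and the temporary file are made up for illustration and are not from the question.

```shell
# Build a small sample file with three paragraphs separated by blank lines.
# (The contents and the temp file are hypothetical, for illustration only.)
sample=$(mktemp)
printf '%s\n' \
  'first line of a two-line paragraph' \
  'second line of the same paragraph' \
  '' \
  'a single-line paragraph' \
  '' \
  'another two-line paragraph' \
  'its second line' > "$sample"

# With RS='' each paragraph is one record; with -F '\n' each line is one field.
# Print the field count (NF) next to the first line of every paragraph.
awk -F '\n' -v RS='' '{ print NF ": " $1 }' "$sample"

# Keep only the paragraphs that have more than one line.
result=$(awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The first awk call should report an NF of 1 for the single-line paragraph, which is why the `NF>1` pattern drops it.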
When RS is set to the empty string, each record always ends at the first blank line encountered.
When RS is set to the empty string, and FS is set to a single character, the newline character always acts as a field separator.
Upvotes: 1
Reputation: 247012
I tend to reach for Perl for paragraph-oriented parsing:
perl -00 -lne 'print if tr/\n/\n/ > 0'
Upvotes: 1