command line method to remove single line paragraphs from text file

Question

I have a .txt file with two types of paragraphs:

Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Then a single line paragraph that is followed by two returns

Along with some more double line text return
some more text.

I want to remove all single-line paragraphs from the text file, so that the result is:

Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Along with some more double line text return
some more text.

I have been attempting to do this with sed and awk, but I keep running into problems coming up with a regex that will look for a newline followed by some characters and ending in two consecutive newlines.

Is there any way to do this with a one-liner, or am I going to have to write a script that reads the file line by line, determines the length of each paragraph, and strips it out that way?

Thanks.

kev · Accepted Answer

awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' input.txt

When RS is set to the empty string, each record always ends at the first blank line encountered.
When RS is set to the empty string, and FS is set to a single character, the newline character always acts as a field separator.
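
To see the two rules above at work, here is a minimal sketch that runs the answer's command on the sample from the question. The temporary file name sample.txt and the variable names are assumptions made for illustration; mktemp -d is assumed to be available, as it is on most systems.

```shell
#!/bin/sh
# Hypothetical scratch location for the sample input.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"

# Recreate the question's sample: two two-line paragraphs around a one-line paragraph.
printf '%s\n' \
  'Some statements and numbers (02) and such followed by a return' \
  'With some more stuff followed by two returns' \
  '' \
  'Then a single line paragraph that is followed by two returns' \
  '' \
  'Along with some more double line text return' \
  'some more text.' > "$sample"

# RS=''       paragraph mode: records are separated by blank lines
# -F '\n'     each line of a paragraph becomes one field
# NF>1        keep only paragraphs with more than one line
# ORS='\n\n'  put a blank line back between the paragraphs that are kept
result=$(awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' "$sample")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because ORS is written after every record, awk also emits a blank line after the last paragraph; the command substitution strips it here. Changing the pattern to 'NF==1' would keep only the single-line paragraphs instead.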
