Reputation: 1628
I have a file that looks like this:
This is a line which is a continuation from above .......
This is line I want to match ....
This is another line I want to match ....
This is yet another line I want to match ....
This is some regular text. Blah ...
Continuation of the regular text above ...
I want to "compact" lines that are preceded and followed by blank lines, like this:
This is line I want to match ....
This is another line I want to match ....
This is yet another line I want to match ....This is some regular text. Blah ...
Continuation of the regular text above
I tried to match the lines that are preceded and followed by a blank line using
re.findall(r'\n\n[\w ]+\n\n')
but that failed. Any suggestions?
Upvotes: 0
Views: 309
Reputation: 16273
Python's re module does not support PCRE's \R escape (which matches any line break sequence), so you'd have to spell out the alternatives with something like the following:
/(?=\r?\n|\x0b|\f|\r|\x85)(\r?\n|\x0b|\f|\r|\x85)(.+(\r?\n|\x0b|\f|\r|\x85))(?=\r?\n|\x0b|\f|\r|\x85)/g
Python Live Demo: http://regex101.com/r/xL8bF1 (see the pcrepattern specification for the full list of line break sequences)
PCRE regular expression that should do what you want:
/(?=\R)\R(.+\R)(?=\R)/g
PCRE (PHP) Live Demo: http://regex101.com/r/aO8yA7
PS: Make use of the visualize whitespace feature over at regex101 for better understanding of the substitution result.
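To try the explicit line-break alternation directly in Python, here is a minimal sketch. The names NL, pattern and sample are made up for illustration, and the leading lookahead is dropped because the group that follows it consumes the same line break anyway.

```python
import re

# Explicit stand-in for PCRE's \R, which Python's re module does not accept.
NL = r'(?:\r?\n|\x0b|\f|\r|\x85)'

# A line break, then a non-empty line with its own line break (captured),
# which must be followed by yet another line break (i.e. a blank line).
pattern = re.compile(NL + r'(.+' + NL + r')(?=' + NL + r')')

# Hypothetical input: "second" has a blank line on both sides.
sample = "first\n\nsecond\n\nthird\nfourth\n"

print(pattern.findall(sample))  # ['second\n']
```

Because the pattern has a single capturing group, findall returns only the captured line (including its line break) rather than the whole match.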
Upvotes: 4
Reputation: 439377
Building on @Fleshgrinder's excellent approach to perform the substitution desired:
re.sub(r'(?=\n)\n(.+)\n(?=\n)', r'\1\n', inputString)
If you also need to make it work with input that has \r\n line endings:
re.sub(r'(?=\r?\n)\r?\n(.+)(\r?\n)(?=\r?\n)', r'\1\2', inputString)
Assuming a Unix system and an input file named in.txt, you can test it from the command line as follows:
python -c \
"import re,sys; print(re.sub(r'(?=\n)\n(.+)\n(?=\n)', r'\1\n', sys.argv[1]))" \
"$(<in.txt)"
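For experimenting without a shell, the same substitution can be wrapped in a function and run on an inline string. This is only a sketch: compact, compact_strict and sample are illustrative names, and compact_strict is my own variation (not part of the answer) that uses a lookbehind so that a line is only pulled up when a blank line really precedes it.

```python
import re

def compact(text):
    """Substitution from the answer above (LF line endings only)."""
    return re.sub(r'(?=\n)\n(.+)\n(?=\n)', r'\1\n', text)

def compact_strict(text):
    """Assumed variant: additionally require a blank line before the match."""
    return re.sub(r'(?<=\n)\n(.+)\n(?=\n)', r'\1\n', text)

# Hypothetical input modelled on the question's file.
sample = "intro\n\none\n\ntwo\n\nthree\n\nregular\ncontinued\n"

print(compact(sample))
# Both functions give "intro\none\ntwo\nthree\n\nregular\ncontinued\n" here.

# They differ when a line is followed, but not preceded, by a blank line:
print(repr(compact("x\ny\n\nz")))         # 'xy\n\nz'
print(repr(compact_strict("x\ny\n\nz")))  # 'x\ny\n\nz'
```

Note that with either function one blank line remains before the regular text, since only the newline in front of each matched line is removed.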
Upvotes: 1
Reputation: 1070
A simple solution using Perl (assuming the file in question is named "in.txt"):
perl -e 'undef $/; while ($file=<>) {$file=~s/\n\n(.*)(\n\n)/\n$1\n/g; print $file}' in.txt
Basically, undefining $/ makes Perl read the whole file as a single string, and the substitution is then applied globally to that string.
(Note: I have assumed that this is a Unix system. On Windows you might want to add an optional check for carriage returns, as in @Fleshgrinder's answer.)
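Since the question is about Python, the same whole-file approach can be sketched there too. This is a rough port under the same Unix line-ending assumption; compact_text, compact_file and sample are hypothetical names, not anything from the Perl answer.

```python
import re

def compact_text(text):
    # Same substitution as the Perl one-liner: a line with a blank line on
    # each side keeps only a single newline before and after it.
    return re.sub(r'\n\n(.*)\n\n', r'\n\1\n', text)

def compact_file(path):
    # Read the whole file into one string, as `undef $/` does in Perl.
    with open(path) as handle:
        return compact_text(handle.read())

# Hypothetical input modelled on the question's file.
sample = "intro\n\none\n\ntwo\n\nthree\n\nregular\ncontinued\n"
print(compact_text(sample), end="")
```

Each match consumes the blank line after it, so the following line is never matched itself. In this sample that still removes every blank line, including the one before the regular text.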
Upvotes: 0