Amelio Vazquez-Reina

Reputation: 96378

Remove single line breaks, keep "empty" lines

Say I have text like the following selected with the cursor:

This is a test. 
This 
is a test.

This is a test. 
This is a 
test.

I would like to transform it into:

This is a test. This is a test.

This is a test. This is a test.

In other words, I would like to replace single line breaks by spaces, leaving empty lines alone.

I thought something like the following would work:

RemoveSingleLineBreaks()
{
  ClipSaved := ClipboardAll
  Clipboard =
  send ^c
  Clipboard := RegExReplace(Clipboard, "([^(\R)])(\R)([^(\R)])", "$1$3")    
  send ^v
  Clipboard := ClipSaved
  ClipSaved = 
}

But it doesn't. If I apply it to the text above, it yields:

This is a test. This is a test.
This is a test. This is a test.

which also removed the "empty line" in the middle. This is not what I want.

To clarify: By an empty line I mean any line with "white" characters (e.g. tabs or white spaces)

Any thoughts on how to do this?

Upvotes: 8

Views: 3984

Answers (4)

Bob

Reputation: 16581

RegExReplace(Clipboard, "([^\r\n])\R(?=[^\r\n])", "$1")

This strips single line breaks, assuming the newline token ends in either a CR or an LF (e.g. CR, LF, CR+LF, LF+CR). It does not treat whitespace-only lines as empty.
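To try the pattern without touching the clipboard, here is a minimal sketch in AutoHotkey v1 syntax. The variable name sample and its contents are made up for illustration, and the string is assumed to use CR+LF line endings:

```autohotkey
; Hypothetical test string: "one", "two", an empty line, then "three".
sample := "one`r`ntwo`r`n`r`nthree"

; The break between "one" and "two" sits between two non-newline characters,
; so it is removed. The break after "two" is followed by another line break,
; so the lookahead fails and the empty line stays in place.
MsgBox % RegExReplace(sample, "([^\r\n])\R(?=[^\r\n])", "$1")
```

The message box should show "onetwo", an empty line, and then "three".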

Your main problem was the use of \R:

\R inside a character class is merely the letter "R" [source]

The solution is to use the CR and LF characters directly.
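The difference can be checked with a small experiment. This is only a sketch in AutoHotkey v1 syntax; the variable name probe is hypothetical, and the behavior described in the comments is the one stated in the quoted note:

```autohotkey
; Hypothetical probe string: the letter R, a CR+LF line break, then x.
probe := "R`r`nx"

; Inside a character class: per the note above, this should target the letter R
; and leave the line break alone.
MsgBox % RegExReplace(probe, "[\R]", "#")

; Outside a character class: \R targets the line break as a whole.
MsgBox % RegExReplace(probe, "\R", "#")
```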


To clarify: By an empty line I mean any line with "white" characters (e.g. tabs or white spaces)

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1")

This is the same as the one above, but treats whitespace-only lines as empty. It works because it non-greedily (*?) accepts any characters other than line breaks between the line break and the nearest non-whitespace character, both before and after it. This relies on the . not matching line breaks by default.

A lookahead is used to avoid 'eating' (matching) the next character, which can break on single-character lines. Note that since it is not matched, it is not replaced and we can leave it out of the replacement string. A lookbehind cannot be used because PCRE does not support variable-length lookbehinds, so a normal capture group and backreference are used there instead.


I would like to replace single line breaks by spaces, leaving empty lines alone.

If you want to replace each single line break with a space, this is more appropriate:

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")

This will replace single line breaks with a space.
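Put together with the clipboard handling from the question, the whole thing might look like the following. This is a sketch in AutoHotkey v1 syntax, not a tested drop-in: the ClipWait timeout and the Sleep delay are assumed values, and the selected text is assumed to use CR+LF line endings, as is usual for text copied on Windows.

```autohotkey
RemoveSingleLineBreaks()
{
    ClipSaved := ClipboardAll   ; back up the entire clipboard
    Clipboard := ""             ; empty it so ClipWait can detect the copy
    Send ^c
    ClipWait, 1                 ; wait up to 1 second (assumed timeout) for text
    if ErrorLevel               ; nothing was copied: restore and give up
    {
        Clipboard := ClipSaved
        return
    }
    ; Replace each single line break with a space; empty and
    ; whitespace-only lines are left alone.
    Clipboard := RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")
    Send ^v
    Sleep, 200                  ; assumed delay so the paste finishes first
    Clipboard := ClipSaved      ; restore the original clipboard
    ClipSaved := ""             ; free the backup
}
```

If a line in the selection already ends with a space, the joined text will contain two spaces at that point, because the pattern keeps everything before the line break and then adds one more space.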


And if you wanted to use lookbehinds and lookaheads:


Strip single line breaks:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", "")


Replace single line breaks with spaces:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", " ")

For some reason, \S doesn't seem to work in lookbehinds and lookaheads. At least, not with my testing.

Upvotes: 6

tatoosh

Reputation: 21

#SingleInstance force

#v::
    Send ^c
    ClipWait
    ClipSaved = %clipboard%

    Loop
    {
        StringReplace, ClipSaved, ClipSaved, `r`n`r`n, `r`n, UseErrorLevel
        if ErrorLevel = 0  ; No more replacements needed.
            break
    }
    Clipboard := ClipSaved
    return

Upvotes: 1

SouthStExit

Reputation: 201

I believe this will work:

text=
(
This is a test. 
This 
is a test.

This is a test. 
This is a 
test.
)
MsgBox % RegExReplace(text, "\S\K\v(?=\S)", A_Space)

Upvotes: 2

mihai

Reputation: 38573

Clipboard := RegExReplace(Clipboard, "(\S+)\R", "$1 ")

Upvotes: 1
