Reputation: 1769
I'd like regular expressions to use in a Visual Studio 2013 extension written in C#.
I'm trying to remove trailing whitespace from each line while preserving empty lines. I'd also like to collapse multiple consecutive empty lines. The existing line endings should be preserved (generally carriage return + line feed).
So the following text (spaces shown as underscores):
hello_world__
___hello_world_
__
__
hello_world
Would become:
hello_world
___hello_world

hello_world
I've tried a number of different patterns to remove the trailing spaces but I either end up not matching the trailing spaces or losing the carriage returns. I haven't yet tried to remove the multiple empty lines.
Here are a couple of the patterns I've tried so far:
\s+$
(?<=\S)\s+$
Upvotes: 3
Views: 2930
Reputation:
Just as a punt, without using Regex, you could always split the document on its end-of-line markers and then write it back out, calling TrimEnd on each line (as highlighted by Anton Semenov)...
(Assuming a text document read into a string...)
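// Sample text containing all three line-ending styles (\r\n, \r and \n).
string str = "This is a test \r\nto see if I can force \ra string to be broken \non multiple lines \r\n into an array.";
// Split on any line ending. Note: RemoveEmptyEntries drops blank lines entirely.
string[] t = str.Split(new string[] { "\r\n", "\r", "\n" }, StringSplitOptions.RemoveEmptyEntries);
// thediv is an HTML container control defined elsewhere on the page.
thediv.InnerHtml = str + "<br /><br />";
foreach (string s in t)
{
    // TrimEnd strips the trailing whitespace from each line.
    thediv.InnerHtml += s.TrimEnd() + "<br />";
}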
I haven't timed this at all, but if you prefer to avoid the complications of Regex (which I do if I can - see below*), you should find this fast enough to do what you want.
* I avoid Regex if I can. That doesn't mean that I don't use it. Regex has its place, but I believe it to be a last resort tool for involved jobs, for instance complex flexible strings that adhere to a format - something where the alternative will generate large amounts of code. Keeping Regex to an absolute minimum aids the readability of your code.
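For what it's worth, the split-and-TrimEnd idea can be adapted so that it keeps single empty lines and the original line endings, which is what the question asks for. The following is only a rough sketch under my own assumptions: the class and method names (LineTrimmer, Clean) are made up for illustration, "whitespace" is taken to mean spaces and tabs, and the text is assumed to fit in a single string.

```csharp
using System.Text;

public static class LineTrimmer
{
    // Hypothetical helper: trims trailing spaces/tabs from every line,
    // collapses runs of blank lines into one, and keeps the original
    // terminator ("\r\n", "\r" or "\n") of every line that is kept.
    public static string Clean(string text)
    {
        var result = new StringBuilder(text.Length);
        bool previousWasBlank = false;
        int pos = 0;

        while (pos < text.Length)
        {
            // Find where the content of the current line ends.
            int end = text.IndexOfAny(new[] { '\r', '\n' }, pos);
            if (end < 0) end = text.Length;

            // Step over the terminator: "\r\n", "\r", "\n", or nothing at end of text.
            int next = end;
            if (next < text.Length && text[next] == '\r') next++;
            if (next < text.Length && text[next] == '\n') next++;

            string content = text.Substring(pos, end - pos).TrimEnd(' ', '\t');
            string terminator = text.Substring(end, next - end);
            bool isBlank = content.Length == 0;

            // Skip a blank line that directly follows another blank line.
            if (!(isBlank && previousWasBlank))
            {
                result.Append(content).Append(terminator);
            }

            previousWasBlank = isBlank;
            pos = next;
        }

        return result.ToString();
    }
}
```

Calling LineTrimmer.Clean on the text from the question should leave the leading spaces alone, strip the trailing ones, and keep one empty line between the second and last lines.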
Upvotes: 1
Reputation: 7530
The \s class includes the line feed, so I would search for just multiple blanks instead. I do not know the specifics of VS, but this should hopefully do it:
[" "]*?$
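To illustrate the point about \s when the pattern is run through .NET's Regex class (rather than the Find/Replace dialog), here is a small comparison. It is only a sketch: the class and method names are invented for the example, and the second pattern is one possible way of restricting the match to spaces and tabs, not necessarily the best one. In .NET, $ with RegexOptions.Multiline matches before '\n' but does not treat '\r' specially, which is why the lookahead allows an optional '\r'.

```csharp
using System.Text.RegularExpressions;

public static class TrailingSpaceDemo
{
    // \s also matches '\r' and '\n', so this pattern can swallow
    // parts of the line endings along with the trailing spaces.
    public static string TrimWithBackslashS(string text)
    {
        return Regex.Replace(text, @"\s+$", "", RegexOptions.Multiline);
    }

    // Only spaces and tabs are consumed. The lookahead checks for an
    // optional '\r' before the end of the line without consuming it.
    public static string TrimHorizontalOnly(string text)
    {
        return Regex.Replace(text, @"[ \t]+(?=\r?$)", "", RegexOptions.Multiline);
    }
}
```

On CRLF text the first method removes carriage returns together with the spaces, while the second leaves every line ending as it was.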
Upvotes: 0
Reputation: 1769
Thanks for the answers so far. None of them are quite right for what I need, but they've helped me come up with what I needed. I think the issue is that there are some oddities with regex in VS2013 (see Using Regular Expressions in Visual Studio). These two operations work for me:
Replace \ +(?=(\n|\r?$)) with nothing.
Replace ^\r?$(\n|\r\n){2,} with \r\n.
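In case it helps anyone doing this from code, the two replacements can be chained in C# roughly as follows. This is a sketch, assuming the patterns are passed to System.Text.RegularExpressions with RegexOptions.Multiline instead of the Find/Replace dialog; the WhitespaceCleaner class and its Clean method are names I made up for the example.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceCleaner
{
    // Hypothetical wrapper around the two replace operations above.
    public static string Clean(string text)
    {
        // Step 1: delete spaces that are followed by '\n',
        // or by an optional '\r' at the end of a line.
        string trimmed = Regex.Replace(
            text, @"\ +(?=(\n|\r?$))", "", RegexOptions.Multiline);

        // Step 2: collapse a run of two or more empty lines
        // into a single empty line terminated by "\r\n".
        return Regex.Replace(
            trimmed, @"^\r?$(\n|\r\n){2,}", "\r\n", RegexOptions.Multiline);
    }
}
```

Step 1 has to run first so that lines containing only spaces are already empty by the time step 2 looks for runs of empty lines. Note that step 1 only handles space characters, not tabs.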
Upvotes: 3
Reputation: 469
\ +(?=(\n|$))
This matches one or more spaces and then checks, without consuming anything, that they are followed by a newline or by the end of the line (including the very end of your string/text). Multiline mode and global matching need to be enabled, of course.
Upvotes: 1
Reputation:
As separate operations:
Remove trailing whitespace from any line: (?m)[^\S\r\n]+$
Remove trailing whitespace only from lines that contain text: (?m)(?<=\S)[^\S\r\n]+$
Remove duplicate blank lines (along with whitespace trim):
# Find: (?>\A(?:[^\S\r\n]*\r\n)+)|(?>\r\n(?:[^\S\r\n]*(\r\n)){2,})
# Replace: $1\r\n
(?>
     \A
     (?: [^\S\r\n]* \r \n )+
)
|
(?>
     \r \n
     (?:
          [^\S\r\n]*
          ( \r \n )                    # (1)
     ){2,}
)
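As a rough illustration, the duplicate-blank-line find/replace pair could be applied from C# like this. It is a sketch only: BlankLineCollapser and Collapse are hypothetical names, and the input is assumed to use \r\n line endings throughout, as the pattern requires.

```csharp
using System.Text.RegularExpressions;

public static class BlankLineCollapser
{
    // The compact "Find" pattern from above. No options are needed,
    // since it uses \A and literal \r\n rather than ^ or $.
    private static readonly Regex DuplicateBlankLines = new Regex(
        @"(?>\A(?:[^\S\r\n]*\r\n)+)|(?>\r\n(?:[^\S\r\n]*(\r\n)){2,})");

    public static string Collapse(string text)
    {
        // $1 is the last "\r\n" captured by the second alternative and is
        // empty when the first alternative (leading blank lines) matched.
        return DuplicateBlankLines.Replace(text, "$1\r\n");
    }
}
```

A run of two or more blank (or whitespace-only) lines ends up as a single empty line, while a lone blank line is not matched and is left untouched, including any spaces on it.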
Upvotes: 0
Reputation: 626851
To remove multiple blank lines and trailing whitespace, search for
(?:\r\n[\s-[\rn]]*){3,}
and replace with \r\n\r\n.
See demo
And to remove the remaining whitespace, you can use
(?m)[\s-[\r]]+\r?$
See demo 2
Upvotes: 1