Reputation:
My understanding is that /[^\A] +/mg will globally match one or more spaces occurring anywhere other than at the beginning of the string or just after a newline.
Apparently, I'm wrong.
#!/usr/bin/env perl
use strict;
use warnings;
my $str = " word1 word2\n word3 word4 word5\n";
print "str before = $str\n";
$str =~ s/[^\A] +/ /mg;
print "str after = $str\n";
Output:
str before = word1 word2
word3 word4 word5
str after = word word2 word word word5
The desired output is:
str before = word1 word2
word3 word4 word5
str after = word1 word2
word3 word4 word5
So the leading spaces are preserved in number but multiple spaces occurring after the beginning of each line are reduced to a single space.
I'm not finding what I'm looking for in perldoc perlretut or perldoc perlre (even after searching through all the instances of "[^" with /\[\^). Many thanks in advance.
Upvotes: 0
Views: 2920
Reputation: 126732
As m.buettner says, a regex like [...] is a character class and can contain only characters, not patterns. In fact, your code generates the warning
Unrecognized escape \A in character class
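But a run of spaces that is not at the start of a line is a run of spaces preceded by a character that is neither a space nor a newline, so all you need to write is this. The \K keeps that preceding character out of the match, so only the spaces are replaced.
use strict;
use warnings;
my $str = " word1 word2\n word3 word4 word5\n";
print qq(String before = "$str"\n);
# Excluding \n from the class keeps the leading spaces of later lines intact
$str =~ s/[^ \n]\K +/ /g;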
print qq(String after = "$str"\n);
output
String before = " word1 word2
word3 word4 word5
"
String after = " word1 word2
word3 word4 word5
"
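To see why the character class version eats a character while the \K version does not, here is a minimal sketch. The three-space gaps in the sample string are an assumption, since the exact spacing of the original input is not shown, and the class is written as [^A] on the assumption that the unrecognized \A is passed through as a literal A, as the warning suggests.

```perl
use strict;
use warnings;

# Assumed sample input: three spaces before and between the words.
my $str = "   word1   word2\n   word3   word4\n";

# A negated class always consumes one character, so whatever precedes
# the spaces ("1", "3", even "\n") is replaced together with them.
(my $class = $str) =~ s/[^A] +/ /g;

# \K drops the preceding character from the match, so only the spaces
# after a non-space, non-newline character are replaced.
(my $kept = $str) =~ s/[^ \n]\K +/ /g;

print qq(class: "$class"\n);
print qq(kept:  "$kept"\n);
```

The first substitution loses the digits and joins the two lines, which matches the symptom in the question. The second leaves the indentation of both lines untouched.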
Upvotes: 0
Reputation: 44269
I think you cannot use \A in a character class, since it is not a character. You could go with two negative lookbehinds though:
$str =~ s/(?<!^)(?<! ) +/ /mg;
That makes sure that the match can start neither at the beginning of a line nor after another space. The latter condition is important: otherwise, if you have multiple spaces at the beginning of a line, the regex would simply start matching from the second one.
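A small sketch of the difference the second lookbehind makes, using an assumed sample string with three spaces of indentation (the variable names are only for illustration):

```perl
use strict;
use warnings;

# Assumed sample input: three spaces before and between the words.
my $str = "   word1   word2\n   word3   word4\n";

# Only the line-start lookbehind: the match can begin at the second
# leading space, so the indentation shrinks from three spaces to two.
(my $one = $str) =~ s/(?<!^) +/ /mg;

# Both lookbehinds: a match can no longer begin in the middle of a run
# of spaces, so the indentation is left alone.
(my $both = $str) =~ s/(?<!^)(?<! ) +/ /mg;

print qq(one:  "$one"\n);
print qq(both: "$both"\n);
```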
By the way, to increase readability when using literal space characters in regular expressions, a neat trick is to wrap them in a character class:
$str =~ s/(?<!^)(?<![ ])[ ]+/ /mg;
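The bracketed form also pays off if you add the /x modifier, which ignores unescaped whitespace in the pattern but still treats a space inside a bracketed class as literal. A sketch, again with an assumed sample string:

```perl
use strict;
use warnings;

# Assumed sample input: three spaces before and between the words.
my $str = "   word1   word2\n   word3   word4\n";

(my $out = $str) =~ s{
    (?<! ^ )     # not at the start of a line
    (?<! [ ] )   # not directly after another space
    [ ]+         # one or more literal spaces
}{ }mgx;

print qq("$out"\n);
```

The replacement part is not affected by /x, so the single space there is still inserted as written.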
Upvotes: 2