Reputation: 932
I'm trying to count the number of extra spaces, including trailing and leading spaces in a string. There are a lot of suggestions out there, but none of them get the count exactly right.
Example ( _ indicates space)
__this is a string__with extra spaces__
should match 5 extra spaces.
Here's my code:
if (my @matches = $_[0] =~ m/(\s(?=\s)|(?<=\s)\s)|^\s|\s$/g) {
    push @errors, {
        "error_count" => scalar @matches,
        "error_type"  => "extra spaces",
    };
}
The problem with this regex is that it counts spaces in the middle twice. However, if I take out one of the look-ahead/look-behind matches, like so:
$_[0] =~ m/\s(?=\s)|^\s|\s$/g
It won't count two extra spaces at the beginning of a string. (My test string would only match 4 spaces.)
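A minimal sketch that reproduces both counts on the example string. The `count_matches` helper is a hypothetical name introduced here for illustration, and the capture group from the first pattern is dropped so that the list returned by the match holds exactly one element per match:

```perl
use strict;
use warnings;

# Hypothetical helper: number of non-overlapping matches of $re in $str.
sub count_matches {
    my ($str, $re) = @_;
    # Assigning the match list to () in scalar context yields its element count.
    my $count = () = $str =~ /$re/g;
    return $count;
}

my $str = "  this is a string  with extra spaces  ";

# Both look-arounds: each space of the middle pair is matched.
print count_matches($str, qr/\s(?=\s)|(?<=\s)\s|^\s|\s$/), "\n";   # 6

# Look-ahead only: the second leading space is missed.
print count_matches($str, qr/\s(?=\s)|^\s|\s$/), "\n";             # 4
```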
Upvotes: 1
Views: 280
Reputation: 14047
With three simple regular expressions (and replacing spaces with underscores for clarity) you could use:
use strict;
use warnings;
my $str = "__this_is_a_string__with_extra_underscores__";
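my $temp = $str;
$temp =~ s/^_+//;      # strip leading underscores
$temp =~ s/_+$//;      # strip trailing underscores
$temp =~ s/__+/_/g;    # collapse each inner run to a single underscore
my $num_extra_underscores = (length $str) - (length $temp);
print "The string '$str' has $num_extra_underscores extra underscores\n";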
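The same length-difference idea can be applied to real spaces. This is a rough sketch, assuming only the literal space character should be counted; `count_extra_spaces` is a hypothetical name, not something from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: extra spaces = original length minus normalized length.
sub count_extra_spaces {
    my ($str) = @_;
    my $temp = $str;
    $temp =~ s/^ +//;        # drop leading spaces
    $temp =~ s/ +$//;        # drop trailing spaces
    $temp =~ s/ {2,}/ /g;    # collapse each inner run to a single space
    return length($str) - length($temp);
}

print count_extra_spaces("  this is a string  with extra spaces  "), "\n";   # 5
print count_extra_spaces("     "), "\n";                                     # 5
```

Working on a copy leaves the caller's string untouched, so the count can be reported without changing the input.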
Upvotes: 0
Reputation: 30273
Try
$_[0] =~ m/^\s|(?<=\s)\s|\s(?=\s*$)/g
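This should match three cases:
1. a whitespace character at the very start of the string (^\s),
2. any whitespace character that follows another whitespace character ((?<=\s)\s), and
3. a whitespace character followed only by whitespace up to the end of the string (\s(?=\s*$)).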
In other words, for your example, here's what each of the three cases would match:
__this is a string__with extra spaces__
12                 2                 32
This also works for the edge case of all spaces:
_____
12222
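To turn the matches into a count, one option is the `() =` list-assignment idiom. This is a minimal sketch, with `count_extra_whitespace` as a hypothetical helper name; because the pattern has no capture groups, each element of the match list is exactly one matched character:

```perl
use strict;
use warnings;

# Hypothetical helper: count matches of the three-case pattern.
sub count_extra_whitespace {
    my ($str) = @_;
    # List assignment evaluated in scalar context gives the number of matches.
    my $count = () = $str =~ /^\s|(?<=\s)\s|\s(?=\s*$)/g;
    return $count;
}

print count_extra_whitespace("  this is a string  with extra spaces  "), "\n";   # 5
print count_extra_whitespace("     "), "\n";                                     # 5
print count_extra_whitespace("no extras here"), "\n";                            # 0
```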
Upvotes: 2
Reputation: 4795
This regex should match all unnecessary spaces; note that a leading or trailing run is matched as a single block rather than one space at a time:
^( )+|( )(?= )|( )+$
or
$_[0] =~ m/^( )+|( )(?= )|( )+$/g
You could change the spaces to \s, but then it will also count tabs and other whitespace such as newlines.
Breakdown:
^( )+
Match any spaces connected to the start of the line
( )(?= )
Match any spaces that are immediately followed by another space
( )+$
Match any spaces connected to the end of the line
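Because `^( )+` and `( )+$` each consume a whole run in one match, counting matches would undercount any leading or trailing run longer than one space. One way around that, sketched here under the assumption that a total count is wanted, is to sum the lengths of the matches instead. The `count_extra_by_length` name is hypothetical, and the three inner groups are folded into one outer capture so that `$1` always holds the full match:

```perl
use strict;
use warnings;

# Hypothetical helper: sum the lengths of all matched runs.
sub count_extra_by_length {
    my ($str) = @_;
    my $count = 0;
    # Scalar-context /g match advances through the string one match at a time.
    while ($str =~ /(^ +| (?= )| +$)/g) {
        $count += length $1;
    }
    return $count;
}

print count_extra_by_length("  this is a string  with extra spaces  "), "\n";   # 5
print count_extra_by_length("     "), "\n";                                     # 5
```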
Upvotes: 0