Reputation: 2030
I'm trying to write a regex that highlights every pair of consecutive letters in which the second letter is the uppercase version of the first (which must be lowercase).
For example, in the string
aSsdDsaAdfF
I want dD, aA and fF to match my regex. To put it another way, the string with the matches highlighted (marked here with square brackets) should be
aSs[dD]s[aA]d[fF]
I think I need to use backreferences, but I don't know how.
Could anybody please give me a way to solve this issue?
Upvotes: 4
Views: 509
Reputation: 63
Ahoy!
I didn't check whether the previous solution works, though it looks like a nice one-liner. Strangely, when I looked at this problem I didn't think of regular expressions; I thought of C-style character arrays. My solution is not a one-line regular expression but a longer, more verbose script: it splits the string into a character array and, for each pair of adjacent characters, looks for a case-insensitive match combined with a case-sensitive mismatch.
It might be a little more descriptive and easier to build on.
#!/usr/bin/perl
use strict;
use warnings;

# Check whether each character is the capitalized version of the one before it
my $s = "aSsdDsaAdfF";
my @s = split(//, $s);    # split the string into an old-fashioned C-style character array
my @matchPos;             # positions of all matches (index of the second character)
my @matchChar;            # the matched (uppercase) characters
my $count = 0;            # index of the current character

print "String is \"$s\"\n";
foreach (@s) {
    if ($count == 0) {    # the first character has no predecessor, so skip it
        $count++;
        next;
    }
    # the character just before the current one (a manual "lookbehind")
    my $previousLetter = $s[$count - 1];

    # A match must satisfy all three conditions below.
    # \Q...\E quotes regex metacharacters in case the string contains punctuation.
    # Case-insensitive compare matches AND case-sensitive compare fails:
    # same letter, but one is upper and one is lower
    my $caseInsensitiveMatch = $_ =~ /^\Q$previousLetter\E$/i;
    my $caseSensitiveFail    = $_ !~ /^\Q$previousLetter\E$/;
    # The current character must be the uppercase one, and not vice versa,
    # i.e. "sS" will match and "Ss" will fail
    my $currentCharacterIsUpper = $_ =~ /^[A-Z]$/;

    if ($caseInsensitiveMatch && $caseSensitiveFail && $currentCharacterIsUpper) {
        # the recorded position is that of the second character of the pair
        print "match at position $count characters $previousLetter and $_\n";
        push(@matchPos,  $count);
        push(@matchChar, $_);
    }
    $count++;
}
print "Matches found in position: \t\t\t@matchPos\n";
print "Characters matched are as follows: \t\t@matchChar\n";
The output looks like this:
$ perl consecutiveCharacters.pl
String is "aSsdDsaAdfF"
match at position 4 characters d and D
match at position 7 characters a and A
match at position 10 characters f and F
Matches found in position: 4 7 10
Characters matched are as follows: D A F
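The same three conditions can also be checked without any regex comparison between the two characters, by using Perl's built-in uc. The following is only a minimal sketch under the same ASCII-letters assumption as the script above; the helper name find_pairs is made up for illustration and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the index of the second character of every
# pair where a lowercase ASCII letter is followed by its uppercase form.
sub find_pairs {
    my ($s) = @_;
    my @chars = split //, $s;
    my @pos;
    for my $i (1 .. $#chars) {
        my ($prev, $cur) = @chars[$i - 1, $i];
        next unless $prev =~ /^[a-z]$/;       # the former must be lowercase
        push @pos, $i if $cur eq uc $prev;    # the latter must be its uppercase form
    }
    return @pos;
}

my @pos = find_pairs("aSsdDsaAdfF");
print "@pos\n";    # prints "4 7 10", the same positions as the script above
```

Requiring the first character to be in [a-z] and the second to equal its uc form covers the case-insensitive match, the case-sensitive mismatch, and the "current character is upper" check in one step.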
Upvotes: 1
Reputation:
One way is this:
(?-i:([a-z])(?=[A-Z]))(?i:\1)
It uses only locally scoped case modifiers, so it doesn't affect the case sensitivity of anything else in a larger pattern.
Explanation
(?-i:               # Cluster group with 'case sensitive' scoped modifier
  ( [a-z] )         # (1), Lower-case
  (?= [A-Z] )       # Lookahead, Upper-case
)                   # End cluster
(?i:                # Cluster group with 'case insensitive' scoped modifier
  \1                # Backreference to group 1
                    # ( previous assertion guarantees this
                    #   can only be the Upper-Cased version of group 1)
)                   # End cluster
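As a quick check, here is a minimal sketch of the pattern applied to the string from the question, assuming Perl as the regex engine (the question does not say which one is in use). The variable names and the square-bracket "highlighting" are illustrative choices only.

```perl
use strict;
use warnings;

my $s  = "aSsdDsaAdfF";
my $re = qr/(?-i:([a-z])(?=[A-Z]))(?i:\1)/;

# Collect every matched pair and the offset where it starts
my (@pairs, @starts);
while ($s =~ /$re/g) {
    push @pairs,  $&;        # the whole match, e.g. "dD"
    push @starts, $-[0];     # 0-based offset of the match start
}
print "pairs:  @pairs\n";    # pairs:  dD aA fF
print "starts: @starts\n";   # starts: 3 6 9

# Mark the matches with brackets. $& is used instead of wrapping $re in a
# new capture group, because an extra group would renumber \1.
(my $highlighted = $s) =~ s/$re/[$&]/g;
print "$highlighted\n";      # aSs[dD]s[aA]d[fF]
```

Note that "Ss" near the start is not matched: the first letter of a pair has to be the lowercase one.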
Upvotes: 5