Reputation: 995
I have a string and need to check whether it contains a run of consecutive alphabet characters, e.g. abcde or abcd.
Let's say I need to flag strings that contain such a run of length 3 or more.
In other words, I need to flag abcpa, but not abpqx.
Can I do this using RegEx?
Thanks
Upvotes: 2
Views: 7986
Reputation: 4209
This regexp matches any string that contains at least 3 consecutive letters of the alphabet:
/(?:abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz)/i
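As a quick check against the two strings from the question, here is a minimal Python sketch that applies the same alternation. The name `has_run_of_three` is made up for illustration.

```python
import re

# Same alternation as above: every 3-letter window of the alphabet.
RUN_OF_THREE = re.compile(
    r"(?:abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr"
    r"|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz)",
    re.IGNORECASE,
)

def has_run_of_three(text):
    """Return True if text contains three consecutive alphabet letters."""
    return RUN_OF_THREE.search(text) is not None

print(has_run_of_three("abcpa"))  # True
print(has_run_of_three("abpqx"))  # False
```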
The following Perl script checks for a run of consecutive letters of a specified minimum length:
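#!/usr/bin/perl
use strict;
use warnings;

# Usage: perl script.pl <min length of sequence> <string to test>
my ($len, $test) = @ARGV;
my $s = "abcdefghijklmnopqrstuvwxyz";

# Collect every $len-letter window of the alphabet (abc, bcd, ... for 3).
my @runs;
for (0 .. length($s) - $len) {
    push @runs, substr($s, $_, $len);
}
# No window exists if $len exceeds 26; an empty pattern would match anything.
exit 1 unless @runs;

my $re = join "|", @runs;
# Exit status 0 on a match, 1 otherwise.
exit 1 unless $test =~ m/(?:$re)/i;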
The script exits with error code 1 if no match was found and with error code 0 otherwise.
Call it like this: perl script.pl <min length of sequence> <string to test>
Examples:
% perl script.pl 5 aaaabbbbeeeeehijklllmnppp && echo "match" || echo "no match"
match
% perl script.pl 6 aaaabbbbeeeeehijklllmnppp && echo "match" || echo "no match"
no match
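The same window-building idea carries over to other languages. Below is a rough Python sketch, assuming "sequence" means ascending letters a to z. `build_run_pattern` is a hypothetical helper name, and the range check is an addition of this sketch, since a pattern with no real alternatives would match any string.

```python
import re
import string

def build_run_pattern(min_len):
    """Compile an alternation of every min_len-letter window of a-z."""
    letters = string.ascii_lowercase
    if not 1 <= min_len <= len(letters):
        raise ValueError("min_len must be between 1 and 26")
    windows = [letters[i:i + min_len]
               for i in range(len(letters) - min_len + 1)]
    return re.compile("|".join(windows), re.IGNORECASE)

sample = "aaaabbbbeeeeehijklllmnppp"
print(bool(build_run_pattern(5).search(sample)))  # True (matches "hijkl")
print(bool(build_run_pattern(6).search(sample)))  # False
```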
Upvotes: 1
Reputation: 11188
I think there is a way you could do this with a regex. I have assumed that the sequences you are looking for must start at a. The PowerShell example below uses a cut-down regex covering only the first 7 letters (a to g), for speed and clarity, and would need to be extended to the rest of the alphabet:
$re = "a(?:b(?:c(?:d(?:e(?:f(?:g)?)?)?)?)?)?"
"abcpa" -match $re   # => True
$matches[0]          # => "abc"
$matches[0].length   # => 3
Not fully tested but I think it is OK.
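To cover the full alphabet, the pattern can be generated rather than typed out by hand. Here is a minimal sketch in Python rather than PowerShell. It uses nested optional groups and keeps the assumption that the run must start at a. `run_from_a_pattern` is a made-up name.

```python
import re
import string

def run_from_a_pattern(max_len=26):
    """Build a(?:b(?:c(...)?)?)? covering the first max_len letters."""
    if not 1 <= max_len <= 26:
        raise ValueError("max_len must be between 1 and 26")
    letters = string.ascii_lowercase[:max_len]
    nested = ""
    # Wrap from the innermost (last) letter outwards.
    for ch in reversed(letters[1:]):
        nested = "(?:" + ch + nested + ")?"
    return re.compile(letters[0] + nested, re.IGNORECASE)

m = run_from_a_pattern().search("abcpa")
print(m.group(0), len(m.group(0)))  # abc 3
```

The length of the match can then be compared against whatever minimum run length is required.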
Upvotes: 0