navinpai

Reputation: 995

How to identify sequential characters using regex?

I have a string and need to check whether it contains a run of consecutive alphabetical characters, e.g. abcde or abcd.

Let's say I need to flag strings that contain such a sequence of length 3 or more.

In other words, I need to flag abcpa (it contains abc), but not abpqx.

Can I do this using RegEx?

Thanks

Upvotes: 2

Views: 7986

Answers (2)

speakr

Reputation: 4209

This regexp matches any string that contains at least 3 consecutive letters of the alphabet:

/(?:abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz)/i
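As a quick check against the strings from the question, here is a minimal Python sketch that applies the same alternation. The names `TRIPLES` and `has_sequence` are made up for this illustration and are not part of the answer.

```python
import re

# The same alternation as the regexp above, compiled case-insensitively
# (the equivalent of the /i flag).
TRIPLES = re.compile(
    r"(?:abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn"
    r"|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz)",
    re.IGNORECASE,
)


def has_sequence(text: str) -> bool:
    """Return True if text contains 3 consecutive letters of the alphabet."""
    return TRIPLES.search(text) is not None


print(has_sequence("abcpa"))  # True  (contains "abc")
print(has_sequence("abpqx"))  # False (longest runs are "ab" and "pq")
```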

The following Perl script checks for a sequence of at least a specified number of consecutive characters:

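#!/usr/bin/perl
use strict;
use warnings;

# Arguments: <min length of sequence> <string to test>
my ($len, $test) = @ARGV;
my $s = "abcdefghijklmnopqrstuvwxyz";

# Collect every run of $len consecutive letters (abc, bcd, ... for $len = 3).
my @runs = map { substr($s, $_, $len) } 0 .. length($s) - $len;

# Join the runs into one alternation and exit with status 1 if none occurs in $test.
my $re = join "|", @runs;
exit 1 unless $test =~ m/(?:$re)/i;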

The script exits with status 1 if no match is found and with status 0 otherwise.

Call it like perl script.pl <min length of sequence> <string to test>.

Examples:

% perl script.pl 5 aaaabbbbeeeeehijklllmnppp && echo "match" || echo "no match"
match

% perl script.pl 6 aaaabbbbeeeeehijklllmnppp && echo "match" || echo "no match" 
no match

Upvotes: 1

Dave Sexton

Reputation: 11188

I think there is a way you could do this with a regex. I have assumed that the sequences you are looking for must start at A. The PowerShell example below uses a cut-down regex that covers only the first 7 letters (a to g), for speed and clarity, and would need to be extended:

# Nested optional groups: match "a", then as much of "bcdefg" as follows in order.
$re = "a(?:b(?:c(?:d(?:e(?:f(?:g)?)?)?)?)?)?"
"abcpa" -match $re # => True
$matches[0] # => "abc"
$matches[0].length # => 3

Not fully tested but I think it is OK.
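The answer notes that the cut-down pattern would need to be extended. One possible way to do that, sketched here in Python, is to generate a pattern of nested optional groups for the whole alphabet instead of typing it out. This is an illustration that keeps the "sequence starts at a" assumption and does not use lookbehinds; `nested_run_pattern` and `longest_run_from_a` are hypothetical helper names.

```python
import re
import string


def nested_run_pattern(letters: str) -> str:
    """Build a(?:b(?:c...)?)? for the given letters, innermost group first."""
    pattern = ""
    for ch in reversed(letters[1:]):
        pattern = f"(?:{ch}{pattern})?"
    return letters[0] + pattern


# Assumption carried over from the answer: a run always starts at "a".
RUN_FROM_A = re.compile(nested_run_pattern(string.ascii_lowercase), re.IGNORECASE)


def longest_run_from_a(text: str) -> str:
    """Return the longest alphabetical run in text that starts at 'a', or ''."""
    runs = (m.group(0) for m in RUN_FROM_A.finditer(text))
    return max(runs, key=len, default="")


print(nested_run_pattern("abc"))         # a(?:b(?:c)?)?
print(longest_run_from_a("abcpa"))       # abc
print(len(longest_run_from_a("abcpa")))  # 3
```

Comparing the length of the returned run with the required minimum then gives the flag the question asks for, for runs that begin at a.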

Upvotes: 0
