user2465164

Reputation: 937

Length of a Regex Match

I have an array of data that is being filtered into different arrays via regular expressions. One of these arrays holds data that is considered "too long" for my program. Not all of these "too long" instances are the same length, but I would like to shorten them.

I want something like DRB1*01:02.

Too long is anything like DRB1*01:02:03 or longer, including things like DRB1*01:02:03:abc:29

However, the letters at the front will not always be the same length. I will be dealing with things such as A*1:01:02 or TIM*01:02. So I am specifically looking at the sets of two integers and their preceding colon, and perhaps any letters that may follow in data that is "too long". I want the letters out front, the star, and 2 sets of numbers and the colon between them.

I want to use a regular expression to find pieces of data that are "too long", and then measure the length of the data it matches, and slice backward to remove it.

I want something that will tell me that DRB1*01:02:03 matches *01:02:03 and that the length of that match is 9. The same goes for anything like DRB1*01:02:03:abc:29, where it matches *01:02:03:abc:29 and tells me the length is 16. I am NOT trying to match a word by its length.

Is there any way to find the length of the part of the data that the regular expression has matched, including cases where the regular expression does not mark a definite end?

I am using JavaScript.

Upvotes: 0

Views: 245

Answers (2)

Barmar

Reputation: 781004

Use a capture group to get the part that matches after the *:

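// The prefix can contain digits (e.g. DRB1), so allow 0-9 as well as A-Z.
var matches = str.match(/^[A-Z0-9]+(\*.*)$/);
if (matches) {
    // matches[1] is the captured text from the * to the end of the string
    var len = matches[1].length;
    alert("It's " + len + " characters long");
}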
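To go from a measured length to the shortened value, one option is to capture only the excess part and slice that many characters off the end. This is a minimal sketch, not part of the original answer: the function name `trimTooLong` is hypothetical, and it assumes the part to keep is always letters/digits, a star, and two colon-separated number groups, as described in the question.

```javascript
// Hypothetical helper: remove everything after the second number group.
function trimTooLong(str) {
    // Group 1 captures only the excess, starting at the third colon.
    var m = str.match(/^[A-Z0-9]+\*\d+:\d+(:.+)$/);
    if (!m) {
        return str; // not "too long", leave it unchanged
    }
    // A negative end index in slice() counts back from the end of the string.
    return str.slice(0, -m[1].length);
}
```

With this sketch, `trimTooLong("DRB1*01:02:03:abc:29")` gives `"DRB1*01:02"`, while `"TIM*01:02"` is returned as-is because it has no third colon.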

Upvotes: 1

Mark Lakata

Reputation: 20838

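A Perl-style regex with two capture groups: the first is the prefix to keep, the second is the extra part. The second group must start with a colon; with a bare (.+), the regex could backtrack and report a value like TIM*01:02 as too long.

 # Matches against $_; $1 is the part to keep, $2 is the excess.
 if (/^([A-Z0-9]+\*\d+:\d+)(:.+)$/) {
    print "too long, prefix:$1 extra stuff:$2 length:".length($2)."\n";
 }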
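Since the question is about JavaScript, the same two-group idea could be sketched there as follows. The name `splitAllele` and the shape of the returned object are assumptions made for illustration. The second group is required to start with a colon so that a value that is already short enough is not reported.

```javascript
// Hypothetical helper: split a value into the part to keep and the extra part.
function splitAllele(str) {
    // Group 1: letters/digits, a star, and two colon-separated number groups.
    // Group 2: everything from the third colon to the end of the string.
    var m = /^([A-Z0-9]+\*\d+:\d+)(:.+)$/.exec(str);
    if (!m) {
        return null; // the value is not "too long"
    }
    return { prefix: m[1], extra: m[2], extraLength: m[2].length };
}
```

For `"DRB1*01:02:03:abc:29"` this returns the prefix `"DRB1*01:02"` and the extra part `":03:abc:29"`, whose length is 10; for `"TIM*01:02"` it returns `null`.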

Upvotes: 0
