Reputation: 937
I have an array of data that is being filtered into different arrays via regular expressions. One of these arrays holds data that is considered "too long" for my program. Not all of these "too long" instances are the same length, but I would like to shorten them.
I want something like DRB1*01:02.
Too long is anything like DRB1*01:02:03 or longer, including things like DRB1*01:02:03:abc:29
However, the letters at the front will not always be the same length. I will be dealing with things such as A*1:01:02 or TIM*01:02. So I am specifically looking at the sets of two integers and their preceding colon, and perhaps any letters that may follow in data that is "too long". I want the letters out front, the star, and 2 sets of numbers and the colon between them.
I want to use a regular expression to find pieces of data that are "too long", and then measure the length of the data it matches, and slice backward to remove it.
I want something that will tell me that DRB1*01:02:03 matches *01:02:03 and that the length of the match is 9. The same goes for anything like DRB1*01:02:03:abc:29, where it matches *01:02:03:abc:29 and tells me the length is 16. I am NOT trying to match a word by its length.
Is there any way to find the length of the part of the data that the regular expression matched, including cases where the regular expression does not mark a definite end?
I am using JavaScript.
Upvotes: 0
Views: 245
Reputation: 781004
Use a capture group to get the part that matches from the * onward:
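// str is the value being tested, e.g. "DRB1*01:02:03".
// The prefix class includes digits so that names like DRB1 match.
var matches = str.match(/^[A-Z0-9]+(\*.*)$/);
if (matches) {
    // matches[1] is the captured part, starting at the *
    var len = matches[1].length;
    alert("It's " + len + " characters long");
}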
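If you also want to know where the matched part starts, or the pattern has no definite end, the match object already carries that information. Here is a minimal sketch, assuming a single-line value; the function name describeStarPart is made up for illustration:

```javascript
// Hypothetical helper: reports the text matched from the first * to the end
// of the string, its length, and the position where it starts.
function describeStarPart(str) {
  // exec returns null when there is no match; otherwise m[0] is the full
  // matched text and m.index is its starting offset in str.
  var m = /\*.*$/.exec(str);
  if (m === null) {
    return null;
  }
  return { text: m[0], length: m[0].length, start: m.index };
}
```

For example, describeStarPart("DRB1*01:02:03") gives the text "*01:02:03" with length 9, and describeStarPart("DRB1*01:02:03:abc:29") gives a length of 16. The length comes from the matched string itself, so it works no matter how open-ended the pattern is.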
Upvotes: 1
Reputation: 20838
A Perl-style regex (matched against $_), with the part to keep in $1 and the extra in $2:
if (/([A-Z0-9]+\*\d+:\d+)(.+)/) {
print "too long, prefix:$1 extra stuff:$2 length:".length($2)."\n";
}
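Since the question is about JavaScript, the same two-group idea could be sketched roughly as below. This is an adaptation rather than a direct port: the pattern is anchored, the second group is required to start with a colon, and lowercase letters are allowed in the prefix. The name trimExtra is hypothetical.

```javascript
// Hypothetical helper: keeps "PREFIX*NN:NN" and measures whatever follows.
function trimExtra(str) {
  // Group 1: the part to keep. Group 2: everything from the next colon on.
  var m = /^([A-Za-z0-9]+\*\d+:\d+)(:.*)$/.exec(str);
  if (!m) {
    // No extra fields, so the value is not "too long"; return it unchanged.
    return { value: str, extraLength: 0 };
  }
  var extraLength = m[2].length;
  // Slice backward from the end by the length of the extra part.
  return {
    value: str.slice(0, str.length - extraLength),
    extraLength: extraLength
  };
}
```

With this, trimExtra("DRB1*01:02:03:abc:29").value is "DRB1*01:02", and a value such as "TIM*01:02" comes back unchanged with an extraLength of 0.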
Upvotes: 0