kgb0716
Reputation: 3

Regex Expression on IDs of two lengths

I'm running a regex over a large block of text to extract several IDs. Here are some examples:

476iDD5100A9E110A2FA
155i6F1388BE08C6940D
3155i6F1388BE08C6940D

An "i" is always present as either the 4th or the 5th character. The string is 20 characters long when the "i" is the 4th character and 21 characters long when it is the 5th. Exactly 16 characters always follow the "i".

Here is how an ID appears in a full line of text:

id="833i8E8BBB9BB1DA748D" size="large" sourcetype="new"

I wrote the following expression in .NET:

([0-9]{3,4}[i][0-Z]{16})+

It works well for the 20-character IDs, but the 21-character IDs lose their first digit and come back as 20 characters. How do I modify my expression to grab both the 20- and 21-character versions of these IDs?

Upvotes: 0

Views: 176

Answers (2)

user7571182
Reputation:

You may try the regex below:

\b\d{3,4}i[0-9A-Za-z]{16}\b

Explanation of the above regex:

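\b - Asserts a word boundary, so the match cannot start or end in the middle of a longer run of letters, digits, or underscores.

\d{3,4} - Matches 3 or 4 digits.

i - Matches the letter i literally.

[0-9A-Za-z]{16} - Matches an ASCII letter or digit exactly 16 times.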

You can find a demo of the above regex here.
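
As a rough check of the pattern above, here is a minimal sketch in Python rather than .NET (the pattern syntax used here is common to both). The sample text is assembled from the IDs in the question; the second `id=` attribute and the variable names are illustrative assumptions, not part of the original data.

```python
import re

# Pattern from this answer: 3 or 4 digits, a literal "i",
# then exactly 16 ASCII letters or digits, bounded on both sides.
ID_PATTERN = re.compile(r"\b\d{3,4}i[0-9A-Za-z]{16}\b")

# Illustrative input: one 20-character ID and one 21-character ID
# embedded in attribute-style text like the line shown in the question.
text = (
    'id="833i8E8BBB9BB1DA748D" size="large" sourcetype="new" '
    'id="3155i6F1388BE08C6940D" size="large" sourcetype="new"'
)

ids = ID_PATTERN.findall(text)
lengths = [len(s) for s in ids]

print(ids)      # both IDs, untruncated
print(lengths)  # [20, 21]
```

Because of the leading `\b`, a match cannot begin at the second digit of a 4-digit prefix, which is one way to guard against the truncation described in the question.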

Upvotes: 1

Kao
Reputation: 106

Change the {16} at the end to {16,17}, which will let you capture both.

If you want to be stricter, use an alternation that covers both cases, one with the i at the 4th position and one with the i at the 5th position, and vary the length of the trailing part accordingly.

([0-9]{3}[i][0-Z]{16,17}|[0-9]{4}[i][0-Z]{15,16})+
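
To see which inputs this alternation accepts, here is a small sketch in Python (not .NET; the syntax of this pattern is the same in both). The `candidates` list and the 17-character-tail entry are hypothetical test inputs added for illustration.

```python
import re

# The alternation from this answer, copied unchanged.
ALT = re.compile(r"([0-9]{3}[i][0-Z]{16,17}|[0-9]{4}[i][0-Z]{15,16})+")

candidates = [
    "476iDD5100A9E110A2FA",    # 3 digits + i + 16 characters (from the question)
    "3155i6F1388BE08C6940D",   # 4 digits + i + 16 characters (from the question)
    "155i6F1388BE08C6940DA",   # hypothetical: 3 digits + i + 17 characters
]

# fullmatch requires the whole candidate to be consumed by the pattern.
accepted = [ALT.fullmatch(c) is not None for c in candidates]
print(accepted)  # [True, True, True]

# Extracting from the line of text shown in the question.
line = 'id="833i8E8BBB9BB1DA748D" size="large" sourcetype="new"'
found = ALT.search(line).group(0)
print(found)  # 833i8E8BBB9BB1DA748D
```

Both sample lengths from the question match. The widened quantifiers also admit a 17-character tail after a 3-digit prefix, and the `[0-Z]` range covers the ASCII characters `:;<=>?@` that sit between `9` and `A`. If exactly 16 letters or digits always follow the i, this may be looser than intended.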

Upvotes: 0
