Exclude leading characters within regex group

I want to extract a group with a fixed length from a string, but then ignore leading zeroes.

Example:

String: 1a2300245filler060403105543a
            ^^^^^      ^^^^^^

Current regex: .{4}(?<part_x>[\d]{5})filler(?<part_y>[\d]{6})

This gives me:

part_x = 00245

part_y = 060403

Is there some way to remove the leading zeroes from the grouping to get this?

part_x = 245

part_y = 60403

Note that the initial length of part_x and part_y is fixed (5 and 6 respectively). I just want to trim the leading zeroes somehow in the regex.

Upvotes: 3

Views: 141

Answers (1)

jaytea

Reputation: 1959

It's a little awkward to selectively match and capture overlapping subexpressions like this, but here's a trick you can use in this particular case and cases like it:

.{4}(?=\d{5}(.++))0{0,4}(?<part_x>\d+(?=\1))filler(?=\d{6}(.*+))0{0,5}(?<part_y>\d+(?=\3))

The trick is that (?=\d{5}(.++)) peeks ahead of the current matching point to ensure 5 digits are present (as you mandated), and then (.++) goes further and captures the rest of the subject string for later testing. Potential leading '0's are then consumed outside of the capture by 0{0,4}, at most four so that at least one digit is left to capture. That leaves (?<part_x>\d+(?=\1)) to match the remaining digits, looking ahead once more to verify that it stops at the position where \1, captured earlier, follows. The part_y half works the same way with 6 digits; it refers to \3 because the named group part_x counts as group 2 in PCRE-style numbering.

part_x and part_y should then be populated as required.
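To see the lookahead trick run end to end, here is a minimal Python sketch. It is an adaptation, not the exact pattern above: it assumes plain greedy (.+) and (.*) can stand in for the possessive (.++) and (.*+), since possessive quantifiers only arrived in Python 3.11, and it uses Python's (?P<name>...) spelling for named groups. The helper name extract_parts is made up for illustration.

```python
import re

# Adapted from the pattern above. Assumption: greedy (.+)/(.*) replace the
# possessive (.++)/(.*+) so this also runs on Python versions before 3.11.
# Group numbers: 1 = rest after part_x, 2 = part_x, 3 = rest after part_y, 4 = part_y.
PATTERN = re.compile(
    r".{4}"
    r"(?=\d{5}(.+))0{0,4}(?P<part_x>\d+(?=\1))"
    r"filler"
    r"(?=\d{6}(.*))0{0,5}(?P<part_y>\d+(?=\3))"
)


def extract_parts(subject):
    """Return (part_x, part_y) without leading zeroes, or None if no match."""
    m = PATTERN.search(subject)
    if m is None:
        return None
    return m.group("part_x"), m.group("part_y")


print(extract_parts("1a2300245filler060403105543a"))  # ('245', '60403')
```

Capping the zero runs at 0{0,4} and 0{0,5} means an all-zero field still yields a single "0" instead of an empty capture.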

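If you want something that's conceptually easier to understand, you can spell out each possible number of leading zeroes inside a branch reset group, (?|...), where every alternative captures into the same group number. For example, to match 5 digits and capture them without leading zeroes:

(?|([1-9]\d{4})|0([1-9]\d{3})|00([1-9]\d\d)|000([1-9]\d)|0000(\d))

The first alternative covers values with no leading zero, and the last one keeps a single 0 when all five digits are zero.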
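Python's re module has no branch reset group, so the sketch below emulates the idea with ordinary alternation: each branch gets its own capture group, and the code picks whichever one participated in the match. It includes a branch for values with no leading zero. The names FIVE_DIGITS and trim_five are hypothetical, chosen only for this example.

```python
import re

# One alternative per possible count of leading zeroes (0 to 4).
# Without branch reset, each alternative has its own group number (1..5).
FIVE_DIGITS = re.compile(
    r"([1-9]\d{4})|0([1-9]\d{3})|00([1-9]\d\d)|000([1-9]\d)|0000(\d)"
)


def trim_five(digits):
    """Return a 5-digit field without leading zeroes, or None if it is not 5 digits."""
    m = FIVE_DIGITS.fullmatch(digits)
    if m is None:
        return None
    # Exactly one alternative matched, so exactly one group is not None.
    return next(g for g in m.groups() if g is not None)


print(trim_five("00245"))  # 245
print(trim_five("00000"))  # 0
```

In an engine that does support (?|...), the extra step of picking the non-empty group goes away, because every branch writes to group 1.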

Upvotes: 3
