Reputation: 41
I need to extract a fixed number of characters before and after a regex match.
For example:
Input : ABCDEFGHIJK//MNOPQRST
Output : IJK//MNOPQ
Input : zzzABCDEFGHIJK//MNOPQRST
Output :
I want only the 3 characters immediately before "//" and the 5 characters immediately after "//". I also want to exclude any line that starts with zzz.
The regexes I am currently using to search for //:
^(?!.*zzz)?=.{0,3}//.{0,100}[a-zA-Z0-9])(?=\S+$).{2,5000} --- Not working
(?=.{0,3}//.{0,100}[a-zA-Z0-9])(?=\S+$).{2,5000} --- Working
https://regex101.com/r/ry6Y09/1 --- Regex demo
I need to specify a limit on the number of characters matched.
Upvotes: 4
Views: 4569
Reputation: 626853
To get three chars before // and five chars after //, you can use
.{0,3}//.{0,5}
.{3}//.{5}
See the regex demo #1 and regex demo #2.
Note that .{0,3}//.{0,5} is the better choice when you expect matches that have fewer chars before // or after // simply because they sit close to the start or end of the string.
The .{3}//.{5} regex will not match in an ab//abcde string, for example, as it requires exactly three chars before // and exactly five chars after //.
Depending on how you declare the regex, you might need to escape /.
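As a rough illustration of the difference between the two patterns, here is a minimal Python sketch. The helper name `first_match` and the choice of Python are assumptions made for this example only; the patterns and sample strings come from the question and answer above. In a Python raw string, / needs no escaping.

```python
import re

# Patterns quoted from the answer; "/" is not special in a Python raw string.
BOUNDED = re.compile(r".{0,3}//.{0,5}")  # up to 3 chars before, up to 5 after
EXACT = re.compile(r".{3}//.{5}")        # exactly 3 chars before, exactly 5 after


def first_match(pattern, text):
    """Hypothetical helper: return the first matched substring, or None."""
    found = pattern.search(text)
    return found.group(0) if found else None


# Enough context on both sides of "//": both patterns return the same text.
print(first_match(BOUNDED, "ABCDEFGHIJK//MNOPQRST"))  # IJK//MNOPQ
print(first_match(EXACT, "ABCDEFGHIJK//MNOPQRST"))    # IJK//MNOPQ

# Only two chars before "//": the exact-count pattern finds nothing.
print(first_match(BOUNDED, "ab//abcde"))  # ab//abcde
print(first_match(EXACT, "ab//abcde"))    # None
```

Neither pattern here filters out lines that start with zzz; that check needs the anchored lookahead version.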
More details:
.{0,3} - zero to three chars other than line break chars
.{3} - three chars other than line break chars
// - a // string
.{5} - five chars other than line break chars
.{0,5} - zero to five chars other than line break chars
Now, answering your edit and comment: if you want to extract a .{3}//.{5} substring from a string that does not start with zzz and consists only of 2 to 5000 non-whitespace chars, you can use
^(?!zzz)(?=\S{2,5000}$).*(.{3}//.{0,100})(?!\w)
^(?!zzz)(?=\S{2,5000}$).*?(.{3}//.{0,100})(?!\w)
Grab Group 1. See the regex demo. Details:
^ - start of string
(?!zzz) - no zzz allowed at the start of a string
(?=\S{2,5000}$) - the string must only consist of two to 5000 non-whitespace chars
.*? - match/consume any zero or more chars other than line break chars, as few as possible (.* consumes as many as possible)
(.{3}//.{0,100}) - any 3 chars other than line break chars, //, and any 0 to 100 chars other than line break chars
(?!\w) - not followed with a word char. Remove if this check is not required.
Upvotes: 3
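To see how the anchored patterns from this answer behave, here is a hedged Python sketch. The names `LAZY`, `GREEDY`, `FIVE_AFTER` and `group_one` are hypothetical, introduced only for this example. `FIVE_AFTER` is not one of the answer's regexes: it is an assumed variant that swaps .{0,100} for .{5} and drops the (?!\w) check, so that the output matches the exact example in the question.

```python
import re

# The two patterns quoted in the answer (lazy and greedy prefix).
LAZY = re.compile(r"^(?!zzz)(?=\S{2,5000}$).*?(.{3}//.{0,100})(?!\w)")
GREEDY = re.compile(r"^(?!zzz)(?=\S{2,5000}$).*(.{3}//.{0,100})(?!\w)")

# Assumed variant, NOT from the answer: exactly five chars after "//"
# and no trailing word-char check.
FIVE_AFTER = re.compile(r"^(?!zzz)(?=\S{2,5000}$).*?(.{3}//.{5})")


def group_one(pattern, line):
    """Hypothetical helper: return capture group 1, or None if no match."""
    found = pattern.search(line)
    return found.group(1) if found else None


# Group 1 of the answer's pattern keeps up to 100 chars after "//".
print(group_one(LAZY, "ABCDEFGHIJK//MNOPQRST"))        # IJK//MNOPQRST
# The assumed variant trims the tail to five chars.
print(group_one(FIVE_AFTER, "ABCDEFGHIJK//MNOPQRST"))  # IJK//MNOPQ
# Lines starting with zzz are rejected by the negative lookahead.
print(group_one(LAZY, "zzzABCDEFGHIJK//MNOPQRST"))     # None
# With two "//" in a line, lazy anchors on the first, greedy on the last.
print(group_one(LAZY, "abc//def//ghi"))                # abc//def//ghi
print(group_one(GREEDY, "abc//def//ghi"))              # def//ghi
```

Whether the lazy or greedy prefix is the right choice depends on which // occurrence should anchor the extraction when a line contains more than one.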