Reputation: 873
I am trying to figure out the syntax for a regular expression that would match 4 alphanumeric characters, at least one of which is a letter. Each match should be wrapped by > and <, but I don't want the angle brackets returned.
For example, using re.findall on the string >ABCD<>1234<>ABC1<>ABC2 should return ['ABCD', 'ABC1'].
1234 - doesn't have a letter
ABC2 - is not wrapped with angle brackets
Upvotes: 2
Views: 4107
Reputation: 11
import re
sentence = ">ABCD<>1234<>ABC1<>ABC2"
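# Raw string avoids invalid-escape warnings. The lookahead requires a letter anywhere
# in the 4 characters, and [a-zA-Z0-9] replaces "." so only alphanumerics match.
pattern = r">((?=[0-9]*[a-zA-Z])[a-zA-Z0-9]{4})<"
matches = re.findall(pattern, sentence)
print(matches)  # outputs ['ABCD', 'ABC1']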
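If you would rather not rely on a capture group to drop the angle brackets, one possible variant is to assert them with lookarounds so they are never part of the match. This is only a sketch of that idea, not a replacement for the approach above; the name `lookaround_pattern` is made up for illustration.

```python
import re

sentence = ">ABCD<>1234<>ABC1<>ABC2"

# (?<=>)  : the match must be preceded by ">" (asserted, not consumed)
# (?=\d*[a-z]) : at least one letter after zero or more digits
# [a-z\d]{4}   : exactly 4 alphanumeric characters
# (?=<)   : the match must be followed by "<" (asserted, not consumed)
lookaround_pattern = re.compile(r"(?<=>)(?=\d*[a-z])[a-z\d]{4}(?=<)", re.I)

# With no capture group, findall returns the whole match for each hit.
result = lookaround_pattern.findall(sentence)
print(result)
```

Because there is no group in the pattern, `findall` returns plain strings directly, so no post-processing of tuples is needed.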
Upvotes: 0
Reputation: 784898
You may use this lookahead-based regex in Python with findall:
(?i)>((?=\d*[a-z])[a-z\d]{4})<
Code:
>>> import re
>>> regex = re.compile(r">((?=\d*[a-z])[a-z\d]{4})<", re.I)
>>> s = ">ABCD<>1234<>ABC1<>ABC2"
>>> print (regex.findall(s))
['ABCD', 'ABC1']
RegEx Details:
re.I: Enable ignore case modifier
>: Match literal character >
(: Start capture group
(?=\d*[a-z]): Lookahead to assert we have at least one letter after 0 or more digits
[a-z\d]{4}: Match 4 alphanumeric characters
): End capture group
<: Match literal character <
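To see how the pieces described above interact, here is a small sketch that runs the same pattern against a few extra inputs. The `samples` string is an assumption made up for illustration and is not from the question; it covers a letter in a non-first position, a non-alphanumeric character, a value that is too long, an all-digit value, and lowercase input.

```python
import re

regex = re.compile(r">((?=\d*[a-z])[a-z\d]{4})<", re.I)

# Hypothetical inputs chosen to exercise each part of the pattern:
#   >1ABC<  letter is not the first character  -> matches
#   >12a4<  single lowercase letter            -> matches (re.I)
#   >AB!D<  contains a non-alphanumeric        -> rejected by [a-z\d]{4}
#   >ABCDE< five characters                    -> rejected, "<" must follow the 4th
#   >9999<  digits only                        -> rejected by the lookahead
#   >abcd<  all lowercase                      -> matches (re.I)
samples = ">1ABC<>12a4<>AB!D<>ABCDE<>9999<>abcd<"

found = regex.findall(samples)
print(found)  # ['1ABC', '12a4', 'abcd']
```

The lookahead only has to find one letter after any run of digits, so the position of the letter within the four characters does not matter.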
Upvotes: 5