Reputation: 11
I have a string, which can contain 10 or more characters ([0-9a-zA-Z]), e.g.: abcdefghij12345
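I want to capture the following characters in groups:
Group 1: Character position "1 and 2"
Group 2: Character position "3 and 4"
Group 3: Character position "5 - 10"
Group 4: Character position "6 - last position of string"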
Groups 1-3 work, but how can I get position "6 - last position of string" in group 4?
What I already have:
r'^([0-9a-zA-Z]{2})([0-9a-zA-Z]{2})([0-9a-zA-Z]{6})'
I expect to get all four groups with a single regular expression. How can I extend my expression so that it also captures group 4?
Edit: Additionally, the following regex is needed for a string of 72 or more characters.
I want to capture the following characters in groups:
Group 1: Character position "1 and 2"
Group 2: Character position "3 and 4"
Group 3: Character position "5 and 6" ...
Group 16: Character position "31 and 32"
Group 17: Character position "33 - 40"
Group 18: Character position "41 and 42"
Group 19: Character position "33 - 40"
Group 20: Character position "12 - Last position of string"
String (72 char): 294592522929354526532268626626426854242342362676256672666267626726672667
r'^([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{2})([\da-zA-Z]{8})([\da-zA-Z]{2})([\da-zA-Z]{8})'
Upvotes: 1
Views: 126
Reputation: 37847
Since it's an index/position issue, why not just use classic slicing with a generator expression?
S = "abcdefghij12345"
g1, g2, g3, g4 = (S[i:j] for i, j in [(0, 2), (2, 4), (4, 10), (5, None)])
Output:
ab # <- group1
cd # <- group2
efghij # <- group3
fghij12345 # <- group4
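For longer layouts, such as the 72-character case in the question, the slicing idea could be wrapped in a small helper. This is only a sketch: `slice_groups`, `layout` and `long_layout` are hypothetical names, positions are assumed to be 1-based and inclusive as written in the question, and an end of `None` stands for "last position of string".

```python
def slice_groups(s, positions):
    """Return substrings of s for 1-based, inclusive (start, end) pairs.

    An end of None means "up to the last character".
    """
    return [s[start - 1:end] for start, end in positions]


S = "abcdefghij12345"

# Assumed layout for the short example: 1-2, 3-4, 5-10 and 6-end.
layout = [(1, 2), (3, 4), (5, 10), (6, None)]
groups = slice_groups(S, layout)
print(groups)  # ['ab', 'cd', 'efghij', 'fghij12345']

# The position table can be built with a loop for the long case:
# sixteen two-character groups (1-2 ... 31-32), then 33-40 and 41-42.
long_layout = [(i, i + 1) for i in range(1, 33, 2)] + [(33, 40), (41, 42)]
```

Unlike a regular expression, slicing checks neither the length nor the character set: a string that is too short silently yields shorter or empty groups, so validate the input separately if that matters.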
Upvotes: 0
Reputation: 117812
You could use a positive lookahead:
^([\da-zA-Z]{2})([\da-zA-Z]{2})(?=([\da-zA-Z]{6})).([\da-zA-Z].*)$
^ - start of line anchor
([\da-zA-Z]{2}) - first capture group, pos 1-2
([\da-zA-Z]{2}) - second capture group, pos 3-4
(?=([\da-zA-Z]{6})) - positive lookahead, third capture, pos 5-10
.([\da-zA-Z].*) - discard one character and capture the rest as fourth capture, pos 6-end
$ - end of line anchor
Upvotes: 0
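A minimal sketch of how the lookahead pattern could be tried in Python with the standard `re` module, using the sample string from the question. The variable names are illustrative. Note that `.*` in the last group accepts any character, not only `[\da-zA-Z]`; the `strict` pattern is a variant that is not part of the answer and assumes the tail should be limited to the same character class.

```python
import re

S = "abcdefghij12345"

# Pattern from the answer: group 3 sits inside a lookahead, so it consumes
# nothing and group 4 can start again from position 6.
pattern = re.compile(
    r"^([\da-zA-Z]{2})([\da-zA-Z]{2})(?=([\da-zA-Z]{6})).([\da-zA-Z].*)$"
)
m = pattern.match(S)
print(m.groups())  # ('ab', 'cd', 'efghij', 'fghij12345')

# Assumed stricter variant: the tail may only contain [\da-zA-Z].
strict = re.compile(
    r"^([\da-zA-Z]{2})([\da-zA-Z]{2})(?=([\da-zA-Z]{6}))[\da-zA-Z]([\da-zA-Z]+)$"
)
print(strict.match(S).groups())         # same four groups as above
print(strict.match("abcdefghij!!"))     # None: "!" is outside the class
```

Because the third group is captured inside the lookahead, both patterns still require at least 10 characters, which matches the minimum length stated in the question.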