Reputation: 955
Suppose I have the following data:
some_string = """
Dave Martin
615-555-7164
173 Main St., Springfield RI 559241122
[email protected]
Charles Harris
800-555-5669
969 High St., Atlantis VA 340750509
[email protected]
"""
I used the following to find a pattern:
import re
pattern = re.compile(r'\d\d\d(-|\.)\d\d\d(-|\.)\d\d\d\d')
matches = pattern.finditer(some_string)
Printing the re match objects shows:
for match in matches:
    print(match)
<re.Match object; span=(21, 33), match='615-555-7164'>
<re.Match object; span=(131, 143), match='800-555-5669'>
I want to extract the span and match fields. I found this link, Extract part of a regex match, that shows how to use group():
nums = []
for match in matches:
    nums.append(match.group(0))
I get the following result:
print(nums)
['615-555-7164', '800-555-5669']
Similar to the other Stack Overflow thread above, how can I extract the span?
This thread was marked for deletion by someone and then it was deleted. The justification for deletion was that I was seeking advice on software... which I was not. https://i.imgur.com/sbCfekf.png
Upvotes: 0
Views: 3075
Reputation: 2380
If you are just looking for the tuple holding the start and end index of each match, use span(). The parameter of span() works the same way as that of group(): both take the match group index, and index 0 refers to the entire match (in your case, indexes 1 and 2 refer to the two (-|\.) groups).
for match in matches:
    print(match.span(0))
Output:
(13, 25)
(103, 115)
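To illustrate how the group index changes what span() returns, here is a minimal sketch. The sample string and the variable names are made up for the illustration and are not the data from the question; start() and end() are the standard match-object methods that return the two halves of the span separately.

```python
import re

# Hypothetical sample text, not the data from the question.
text = "Call 615-555-7164 now"
pattern = re.compile(r'\d\d\d(-|\.)\d\d\d(-|\.)\d\d\d\d')

match = pattern.search(text)

whole_span = match.span()        # same as match.span(0): the entire match
first_sep_span = match.span(1)   # span of the first (-|\.) group
second_sep_span = match.span(2)  # span of the second (-|\.) group

# start() and end() give the two halves of the span individually,
# which is handy for slicing the original string.
start, end = match.start(), match.end()

print(whole_span)        # (5, 17)
print(first_sep_span)    # (8, 9)
print(second_sep_span)   # (12, 13)
print(text[start:end])   # 615-555-7164
```

Slicing the original string with the span should always give back the same text as group(0).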
As for extracting the match fields, yes, your approach works just fine. It is better to collect both the match text and the span in the same loop, since the iterator can only be consumed once:
nums = []
spans = []
for match in matches:
    nums.append(match.group(0))
    spans.append(match.span(0))
Also, be aware that finditer gives you an iterator, which means that once it reaches the end, it is exhausted. You will need to call finditer again to get a new one if you want to iterate over the matches again.
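A small sketch of that exhaustion behaviour, and one possible workaround: materialize the matches into a list so they can be reused. The sample string and names such as match_list are assumptions for illustration only.

```python
import re

# Hypothetical sample text, not the data from the question.
text = "a: 615-555-7164, b: 800.555.5669"
pattern = re.compile(r'\d\d\d(-|\.)\d\d\d(-|\.)\d\d\d\d')

matches = pattern.finditer(text)
first_pass = [m.group(0) for m in matches]
second_pass = [m.group(0) for m in matches]  # iterator already consumed

print(first_pass)   # ['615-555-7164', '800.555.5669']
print(second_pass)  # []

# Storing the match objects in a list lets you loop over them repeatedly.
match_list = list(pattern.finditer(text))
nums = [m.group(0) for m in match_list]
spans = [m.span() for m in match_list]

print(nums)   # ['615-555-7164', '800.555.5669']
print(spans)  # [(3, 15), (20, 32)]
```

Keeping the list costs memory proportional to the number of matches, so for very large inputs a single pass over the iterator may still be preferable.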
Upvotes: 3