Reputation: 22440
I've written a Python script to pull certain lines from a block of text, using the re
module. However, it gives me unwanted output along with the lines I need.
How can I modify my expression so that it matches only the lines I want to grab?
This is my attempt:
import re
content = """
A Gross exaggeration,
-- Gross 5 90,630,08,
Gross 4 13,360,023,
Gross 2 70,940,02,
Luke gross is an actor
"""
for item in re.finditer(r'Gross(?:[\d\s,]*)',content):
    print(item.group().strip())
Output I'm getting:
Gross
Gross 5 90,630,08,
Gross 4 13,360,023,
Gross 2 70,940,02,
Output I wish to have:
Gross 4 13,360,023
Gross 2 70,940,02
Upvotes: 0
Views: 22
Reputation: 195573
I changed the regex string to r'(?:^\s*?)Gross[\d\s,]*?(?=,$)'
and added the multiline flag (re.M), so that ^ and $ match at the start and end of each line:
import re
content = """
A Gross exaggeration,
-- Gross 5 90,630,08,
Gross 4 13,360,023,
Gross 2 70,940,02,
Luke gross is an actor
"""
for item in re.finditer(r'(?:^\s*?)Gross[\d\s,]*?(?=,$)',content, flags=re.M):
    print(item.group().strip())
Output is:
Gross 4 13,360,023
Gross 2 70,940,02
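To show why the multiline flag matters here, the following sketch runs the same pattern with and without re.M on the sample text from the question. The variable names without_flag and with_flag are just illustrative.

```python
import re

content = """
A Gross exaggeration,
-- Gross 5 90,630,08,
Gross 4 13,360,023,
Gross 2 70,940,02,
Luke gross is an actor
"""

pattern = r'(?:^\s*?)Gross[\d\s,]*?(?=,$)'

# Without re.M, ^ matches only at the very start of the string and
# $ only at the very end (or just before a final newline).
without_flag = [m.group().strip() for m in re.finditer(pattern, content)]

# With re.M, ^ and $ also match at the start and end of every line.
with_flag = [m.group().strip() for m in re.finditer(pattern, content, flags=re.M)]

print(without_flag)
print(with_flag)
```

Without the flag the pattern finds nothing in this sample, because no line other than the first can satisfy ^.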
Upvotes: 1
Reputation: 1778
^\s*Gross[\d ,]*(?=,)
will capture what you want, as long as you pass the re.M flag so that ^ matches at the start of each line.
I added ^ to anchor the match at the start of the line, used \s* to allow optional
whitespace before "Gross", and used the lookahead (?=,) to trim the trailing , from the match.
I also removed \s from your character class because it matches newlines as well, and
replaced it with a literal space.
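As a minimal sketch of how this pattern could be dropped into the loop from the question (assuming the same sample text, and assuming re.M is passed so that ^ applies per line):

```python
import re

content = """
A Gross exaggeration,
-- Gross 5 90,630,08,
Gross 4 13,360,023,
Gross 2 70,940,02,
Luke gross is an actor
"""

# ^       start of a line (because of re.M)
# \s*     optional leading whitespace
# [\d ,]* digits, spaces and commas only, so the match cannot run onto the next line
# (?=,)   the match must be followed by a comma, which is left out of the result
pattern = re.compile(r'^\s*Gross[\d ,]*(?=,)', flags=re.M)

matches = [m.group().strip() for m in pattern.finditer(content)]
for line in matches:
    print(line)
```

Because the character class holds a literal space instead of \s, each match stays on a single line.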
Upvotes: 0