Sha

Reputation: 117

python: extracting (regex) pattern in a file without going through line by line (multiline search)

I can extract a particular pattern by reading the mystring.txt file line by line and checking each line with re.search(r'pattern', line_txt).

The following is the content of mystring.txt:

`

Client: //home/SCM/dev/applications/build_system/test_suite_linux/unit_testing



Stream: //MainStream/testing_branch

Options:    dir, norm accel, ddl



SubmitOptions:  vis, dir, cas, cat

`

Using Python, I can get the stream name as //MainStream/testing_branch:

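import re

with open("mystring.txt", 'r') as f:
    mystring = f.readlines()
    for line in mystring:
        # Only the line that starts with "Stream:" is of interest
        if re.search(r'^Stream:', line):
            # Split on the tab and take the field after the label
            stream_name = line.split('\t')[1]
            print(stream_name)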

Instead of looping over the file line by line, how can I extract the same information using only the re module?

Upvotes: 3

Views: 10457

Answers (3)

Ishaq Khan

Reputation: 173

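Here is a solution:

import re

# Read the whole file into one string
with open("mystring.txt") as f:
    content = f.read()

# Collect every "Stream: ..." match up to the end of its line
got = re.findall("Stream: .+\n", content)

# Keep the first match and drop the trailing newline
got = got[0].strip()

# Print the value after the "Stream: " label
print(got.split(": ")[1])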
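A note on the pattern above: because it is not anchored, it also matches any longer label that merely ends in "Stream: ". The sketch below illustrates the difference; the "ParentStream" line is invented for the example and is not part of the question's file.

```python
import re

# Hypothetical content: the "ParentStream" line is made up for illustration.
text = (
    "ParentStream: //MainStream/other\n"
    "Stream: //MainStream/testing_branch\n"
)

# Unanchored: matches "Stream: " anywhere in a line, including inside "ParentStream: "
unanchored = re.findall(r"Stream: (.+)", text)

# Anchored with re.MULTILINE: "^" matches at the start of every line
anchored = re.findall(r"^Stream: (.+)", text, re.MULTILINE)

print(unanchored)  # ['//MainStream/other', '//MainStream/testing_branch']
print(anchored)    # ['//MainStream/testing_branch']
```

Using a capture group also removes the need for the separate strip() and split() steps.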

Upvotes: 0

rock321987

Reputation: 11032

You can read the file in one go and use re.findall. Beware: if the file is very large, loading it all into memory is not a good idea.

import re
content = open("input_file").read()
print(re.findall("^Stream: (.*)", content, re.M))
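If the file really is too large to read comfortably, one possible workaround is to memory-map it and let re search the mapping directly. This is only a sketch: find_stream is a hypothetical helper name, the pattern must be a bytes pattern because the mapping holds bytes, and mmap raises ValueError on an empty file.

```python
import mmap
import re

def find_stream(path):
    """Return the value after 'Stream:' in the file at path, or None if absent."""
    with open(path, "rb") as f:
        # Map the file read-only; the OS pages data in on demand
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Bytes pattern: "^" and "$" match at line boundaries with re.MULTILINE
            match = re.search(rb"^Stream:[ \t]*(.*)$", mm, re.MULTILINE)
            if match is None:
                return None
            # Decode inside the with block, while the mapping is still open
            return match.group(1).decode().strip()
```

Calling find_stream("mystring.txt") on the file from the question should return the stream path as a str.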

Upvotes: 4

UltraInstinct

Reputation: 44434

Yes, you can use re.MULTILINE with re.search(...):

>>> import re
>>> re.search(r'^Stream\:\s([^\n]+)', f.read(), re.MULTILINE).group(1)
'//MainStream/testing_branch'
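One thing to watch for: re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError on a file without a Stream line. A minimal sketch of a guarded version follows; extract_stream and SAMPLE are hypothetical names, and SAMPLE is abridged from the question's file.

```python
import re

# Abridged copy of the question's file content, used only for illustration
SAMPLE = (
    "Client: //home/SCM/dev/applications/build_system/test_suite_linux/unit_testing\n"
    "\n"
    "Stream: //MainStream/testing_branch\n"
    "\n"
    "Options:    dir, norm accel, ddl\n"
)

def extract_stream(text):
    """Return the stream name, or None when there is no 'Stream:' line."""
    match = re.search(r"^Stream:\s([^\n]+)", text, re.MULTILINE)
    return match.group(1) if match else None

print(extract_stream(SAMPLE))            # //MainStream/testing_branch
print(extract_stream("no stream here"))  # None
```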

Upvotes: 2
