Reputation: 33

Regex to find substring starting with [ ]

The sample below is a substring of a much larger string (detaildesc_final) that I have obtained. I need a regex search over the string that retrieves all the lines beginning with "[]" (two square brackets) in the [Data] section. Lines should be retrieved from the [Data] section until the [Logs] line is encountered.

[Data]

[] some text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[Logs]

I'm using Python and have tried the following command (which is clearly incorrect).

re.findall(r'\b\\[\\]\w*', detaildesc_final)

I need the result to be in the following format:

some text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

I have already searched a lot online, but I could only figure out how to find lines starting with a single character rather than two ([] in this case). Any help would be greatly appreciated. Thank you.

Upvotes: 0

Answers (4)

Aaditya Ura

Reputation: 12679

You need a positive lookbehind:

import re

pattern=r'(?<=\[\])(.\w.+)'

string_1="""[Data]

[] some text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[Logs]"""


match=re.finditer(pattern,string_1,re.M)
for item in match:
    print(item.group(1))

output:

 some text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text
 some_other_text

Regex explanation :

Positive Lookbehind (?<=\[\])

It tells the regex engine to temporarily step backwards in the string, to check if the text inside the lookbehind can be matched there.

\[ matches the character [ literally (case sensitive)
\] matches the character ] literally (case sensitive)
. matches any character (except for line terminators)
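\w matches any word character (equal to [a-zA-Z0-9_] for ASCII text; for str patterns in Python 3 it also matches Unicode letters and digits unless re.ASCII is set)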
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
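
The output above keeps the space that follows "[]". As a minimal sketch of one way to avoid that, assuming every wanted line has exactly one space after the brackets, the space can be moved into the lookbehind. The shortened `sample` string and the name `items` are only for illustration:

```python
import re

# Shortened, hypothetical sample in the same shape as the question's text.
sample = "[Data]\n\n[] some text\n\n[] some_other_text\n\n[Logs]"

# The lookbehind now covers "[] " (brackets plus one space), so the match
# itself starts at the first character of the wanted text.
items = re.findall(r"(?<=\[\] ).+", sample)

for item in items:
    print(item)
```

Python requires a lookbehind to have a fixed width, which "[] " satisfies (three characters). If the amount of whitespace varies, calling `.strip()` on each result is a simpler option.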

Upvotes: 1

Transhuman

Reputation: 3547

import re
re.findall(r'\[\] (.*)\n\n', detaildesc_final)

Output:

['some text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text',
 'some_other_text']
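
The question also asks that matching stop at the [Logs] line. A possible sketch of that, assuming the markers [Data] and [Logs] appear literally in the text, is to cut out the section first and then search only inside it. The helper name `data_items` and the `sample` string are hypothetical:

```python
import re

# Hypothetical sample with "[] " lines outside the [Data] section as well.
sample = (
    "[Intro]\n\n[] not wanted\n\n"
    "[Data]\n\n[] some text\n\n[] some_other_text\n\n"
    "[Logs]\n\n[] also not wanted\n"
)

def data_items(text):
    # Capture everything between the first [Data] and the next [Logs];
    # re.S lets '.' span newlines, and '*?' stops at the first [Logs].
    section = re.search(r"\[Data\](.*?)\[Logs\]", text, re.S)
    if section is None:
        return []
    # Inside the section, take the rest of each line that starts with "[] ";
    # re.M makes '^' and '$' match at every line boundary.
    return re.findall(r"^\[\] (.*)$", section.group(1), re.M)

print(data_items(sample))
```

Unlike the `\n\n` pattern above, this version does not depend on a blank line following each entry.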

Upvotes: 1

Mahesh Karia

Reputation: 2055

import re

str = """
[Data]

[] some text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[] some_other_text

[Logs]
"""


# Python 3 print call; the literal '[' inside the class is escaped to avoid a FutureWarning
print(re.sub(r"([\[a-zA-Z ]*\] ?)", '', str))

output:

some text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

some_other_text

Upvotes: 1

Krzysztof Szularz

Reputation: 5249

Don't over-complicate things.

for line in detaildesc_final.split('\n'):
    if line.startswith('[]'):
        do_something()
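
To make the loop above concrete, here is a small sketch that also respects the [Data] and [Logs] boundaries from the question. The function name `data_lines`, the `in_data` flag, and the `sample` string are assumptions for illustration, not part of the original answer:

```python
def data_lines(text):
    """Collect the text of '[]' lines between the [Data] and [Logs] markers."""
    results = []
    in_data = False
    for line in text.split('\n'):
        if line.startswith('[Data]'):
            in_data = True
        elif in_data and line.startswith('[Logs]'):
            break  # stop at the first [Logs] after [Data]
        elif in_data and line.startswith('[]'):
            # Drop the '[]' prefix and any surrounding whitespace.
            results.append(line[2:].strip())
    return results

# Hypothetical sample in the same shape as the question's text.
sample = "[] skipped\n[Data]\n\n[] some text\n\n[] some_other_text\n\n[Logs]\n[] skipped too"
print(data_lines(sample))
```

No regex is involved, so there is nothing to escape, and the section boundaries are plain to read.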

Upvotes: 1
