Bhanu

Reputation: 1

How to get data from optional group in Python regexp?

data 1

Hi there: first

Hello: second

welcome: third

data 2

Hi there: first

welcome: third

My goal is to write a regex that extracts the bold text above. In data 2, the Hello: line is missing. How can I handle both cases with a single regex?

My code is:

import re

mat = re.search(r"Hi there:\n(.*)\n(Hello:\n(.*))?\nwelcome:\n(.*)", data1, re.DOTALL)
print(mat)
print(mat.group(1))
print(mat.group(2))
print(mat.group(3))

Output I'm getting:

<_sre.SRE_Match object at 0x10694aca8>
first   -> 

Hello: second None None

Upvotes: 0

Views: 54

Answers (1)

The fourth bird

Reputation: 163577

You could use 3 capturing groups and make the second one optional by wrapping it in an optional non-capturing group (?:...)?. You can omit re.DOTALL and instead match zero or more whitespace characters with \s* after matching each newline.

(Hi there:)\r?\n\s*(?:(Hello:)\r?\n\s*)?(welcome:)
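As a rough illustration of why the optional part is wrapped in a non-capturing group: a capturing wrapper takes a group number of its own, which shifts every group after it. The sketch below assumes each value sits on the line after its label, as the pattern in the question implies; the sample string and the variable names are made up for the demonstration.

```python
import re

# Hypothetical sample: each value is on the line after its label.
text = "Hi there:\nfirst\nHello:\nsecond\nwelcome:\nthird"

# Capturing wrapper: the whole optional block becomes group 2,
# so the inner value moves to group 3 and the last value to group 4.
capturing = re.search(
    r"Hi there:\n(.*)\n(Hello:\n(.*)\n)?welcome:\n(.*)", text
)

# Non-capturing wrapper (?:...)?: only the three values are numbered.
non_capturing = re.search(
    r"Hi there:\n(.*)\n(?:Hello:\n(.*)\n)?welcome:\n(.*)", text
)

print(capturing.groups())      # ('first', 'Hello:\nsecond\n', 'second', 'third')
print(non_capturing.groups())  # ('first', 'second', 'third')
```

Without re.DOTALL, each .* stops at the end of its line, so no value can swallow the labels that follow it.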


In the code, you could, for example, check whether group 2 is not None:

import re

regex = r"(Hi there:)\r?\n\s*(?:(Hello:)\r?\n\s*)?(welcome:)"

data1 = ("Hi there:\n\n"
    "Hello:\n\n"
    "welcome:")

mat = re.search(regex, data1)

if mat:
    print(mat.group(1))
    if mat.group(2) is not None:
        print(mat.group(2))
    print(mat.group(3))
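If the goal is the values rather than the labels, the same idea might be adapted as in the sketch below. It assumes each label is followed by its value with only whitespace (spaces or newlines) in between, which is a guess at the real data layout; extract_values and PATTERN are illustrative names, not part of the original code.

```python
import re

# Assumption: a label is followed by optional whitespace and then its value.
# Without re.DOTALL, (.*) stops at the end of the line.
PATTERN = re.compile(
    r"Hi there:\s*(.*)\s*(?:Hello:\s*(.*)\s*)?welcome:\s*(.*)"
)

def extract_values(text):
    """Return (first, second, third); second is None if 'Hello:' is absent."""
    mat = PATTERN.search(text)
    if mat is None:
        return None  # required labels not found
    return mat.groups()

data1 = "Hi there: first\n\nHello: second\n\nwelcome: third"
data2 = "Hi there: first\n\nwelcome: third"

print(extract_values(data1))  # ('first', 'second', 'third')
print(extract_values(data2))  # ('first', None, 'third')
```

Checking the result of search() for None before reading groups avoids an AttributeError when the input contains neither layout.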

Upvotes: 0
