sonali

Reputation: 229

Capturing two different lines using regex

I want to capture two lines in one variable. This is my input:

Rose 0 82
ABC 0 0
ABC (Backup) 0 0
ABC XYZ 637 2021
ABC XYZ (Backup) 0 0
ABC EXYZ 0 0

I want to capture the lines which are in bold, that is, the ABC 0 0 and ABC XYZ 637 2021 lines.

I tried this code:

var = re.search("ABC\s+\d+\s+ .*\n(.*)\nABC XYZ .*",file_name)

but it is giving me output like this:

ABC                           0                        0
ABC (Backup)                  0                        0
ABC XYZ                       637                      2021

and my expected output is this:

ABC                           0                        0
ABC XYZ                       637                      2021

Can someone please suggest what modification is needed?

Upvotes: 3

Views: 71

Answers (3)

Wiktor Stribiżew

Reputation: 626738

You may use

re.search(r"^(ABC[ \t]+\d+[ \t].*\n).*\n(ABC[ \t]+XYZ[ \t].*)", s, re.MULTILINE)

The regex will find the match you need and capture 2 lines into separate capturing groups. Then, check if there was a match and, if yes, join the two capturing group values.

See the Python demo

import re
s="""Rose                          0                        82
ABC                           0                        0
ABC (Backup)                  0                        0
ABC XYZ                       637                      2021
ABC XYZ (Backup)              0                        0
ABC EXYZ                      0                        0"""

v = re.search(r"^(ABC[ \t]+\d+[ \t].*\n).*\n(ABC[ \t]+XYZ[ \t].*)", s, re.MULTILINE)
if v:
    print("{}{}".format(v.group(1), v.group(2)))

Output:

ABC                           0                        0
ABC XYZ                       637                      2021

Pattern details

  • ^ - start of a line (due to re.MULTILINE)
  • (ABC[ \t]+\d+[ \t].*\n) - Capturing group 1: ABC, 1+ spaces or tabs, 1+ digits, a space or tab and then the rest of the line with the newline
  • .*\n - whole next line
  • (ABC[ \t]+XYZ[ \t].*) - Capturing group 2: ABC, 1+ spaces or tabs, XYZ, a space or tab and then the rest of the line.
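
The pattern above expects exactly one line between the two target rows (that is what the middle .*\n consumes). If that cannot be relied on, a possible alternative is to select each row by its own shape and join the results. This is only a sketch: the names line_re and captured are made up for illustration, and it assumes the wanted rows are exactly those where ABC (optionally followed by XYZ) is followed directly by a number.

```python
import re

s = """Rose                          0                        82
ABC                           0                        0
ABC (Backup)                  0                        0
ABC XYZ                       637                      2021
ABC XYZ (Backup)              0                        0
ABC EXYZ                      0                        0"""

# Assumption: a wanted row is "ABC", an optional "XYZ", then a number.
# Requiring \d+ right after the label rejects the "(Backup)" and "EXYZ" rows.
line_re = re.compile(r"^ABC(?:[ \t]+XYZ)?[ \t]+\d+[ \t].*$", re.MULTILINE)

# findall returns whole matches here because the only group is non-capturing.
captured = "\n".join(line_re.findall(s))
print(captured)
```

Because each row is matched independently, the number of lines between the two rows no longer matters.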

Upvotes: 1

PeeteKeesel

Reputation: 772

If the syntax marks a comment with two stars at its start and end, then you can use this (but it will not separate two comments if they are on one line).

^[\*]{2}(.*)[\*]{2}

If you want to find any comment of the form **comment**, use this:

[\*]{2}[^\*]+[\*]{2}
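
To see how the two patterns differ, here is a small sketch in Python. The sample text is hypothetical, and re.MULTILINE is added to the first call (an assumption) so that ^ can match at the start of every line.

```python
import re

# Hypothetical sample: two **...** spans on one line, then a plain line.
text = "**first** plain **second**\nno markers here"

# Anchored, greedy pattern: runs from the first ** to the last ** on the line,
# so the two spans are returned as one captured string.
anchored = re.findall(r"^[\*]{2}(.*)[\*]{2}", text, re.MULTILINE)

# [^\*]+ cannot cross a star, so each span is matched on its own.
separate = re.findall(r"[\*]{2}[^\*]+[\*]{2}", text)

print(anchored)
print(separate)
```

The first call returns a single string spanning both comments, while the second returns one match per comment.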

Upvotes: 0

user5809739

Reputation:

You can use "^" and "$" to match the start and end of a line.

^\*\*.*\*\*

This will give you 2 matches to iterate through. Each match is a bold line, identified by the two * at the beginning and end of the line.
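
A minimal sketch of iterating over those matches in Python follows. It assumes the input still carries the ** markers around the bold lines (the sample text below is hypothetical), and it adds a trailing $ and a capturing group to the pattern so the markers can be dropped; both are additions for illustration.

```python
import re

# Hypothetical input: assumes bold lines are wrapped in ** in the raw text.
text = """Rose 0 82
**ABC 0 0**
ABC (Backup) 0 0
**ABC XYZ 637 2021**
ABC XYZ (Backup) 0 0
ABC EXYZ 0 0"""

# re.MULTILINE lets ^ and $ match at every line boundary, not just the string ends.
bold_lines = [m.group(1) for m in re.finditer(r"^\*\*(.*)\*\*$", text, re.MULTILINE)]

# Join the two captured lines into one variable.
result = "\n".join(bold_lines)
print(result)
```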

Upvotes: 0
