Raul

Reputation: 11

Reading a text file and combining 2 lines into one using a regular expression

I am fairly new to Python. I am trying to use regular expressions to match specific text in a file.

I can extract the data, but only with one regular expression at a time: the two values are on different lines, and I am struggling to put them together. These several lines repeat throughout the file.

[06/05/2020 08:30:16] 
othertext           <000.000.000.000>    xx s 
example           <000.000.000.000>      xx s  

I managed to print the match for one regular expression or the other:

[06/05/2020 08:30:16] 

or

example           <000.000.000.000>      xx s 

But not combined into something like this:

(timestamp)             (text) 
[06/05/2020 08:30:16]   example           <000.000.000.000>      xx s

These are the regular expressions:

regex = r"^\[\d\d\/\d\d\/\d\d\d\d\s\d\d\:\d\d\:\d\d\]" #Timestamp
regex = r"(^example\s+.*\<000\.000\.000\.000\>\s+.*$)" # line that contain the text

This is the code so far. I have tried a secondary for loop with another condition, but it seems to match only one of the regular expressions at a time.

Any pointers will be greatly appreciated.

import re
filename = input("Enter the file: ")
regex = r"^\[\d\d\/\d\d\/\d\d\d\d\s\d\d\:\d\d\:\d\d\]" #Timestamp


with open(filename, "r") as file:
    matches = []
    for line in file:
        for match in re.finditer(regex, line, re.S):
            match_text = match.group()
            matches.append(match_text)
            print(match_text)

Upvotes: 1

Views: 39

Answers (1)

dawg

Reputation: 103874

You can match blocks of text similar to this in one go with a regex of this type:

(^\[\d\d\/\d\d\/\d\d\d\d[ ]+\d\d:\d\d:\d\d\])\s+[\s\S]*?(^example.*)


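To do so, however, all of the file's text needs to be 'gulped' into a single string, and the pattern needs the multiline flag (re.M in Python) so that ^ matches at the start of every line rather than only at the start of the string.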
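As a rough sketch of how the pattern above might be applied in Python: the names BLOCK_RE, combine_lines and combine_from_file are made up for illustration, the sample text is copied from the question, and re.MULTILINE is assumed so that ^ matches at each line start inside the gulped string.

```python
import re

# Pattern from the answer. re.MULTILINE makes ^ match at the start of
# every line, not only at the start of the whole string.
BLOCK_RE = re.compile(
    r"(^\[\d\d/\d\d/\d\d\d\d[ ]+\d\d:\d\d:\d\d\])\s+[\s\S]*?(^example.*)",
    re.MULTILINE,
)


def combine_lines(text):
    """Return one 'timestamp   example-line' string per matched block."""
    return [
        f"{m.group(1)}   {m.group(2).rstrip()}"
        for m in BLOCK_RE.finditer(text)
    ]


def combine_from_file(path):
    """Hypothetical helper: read the whole file at once, then match."""
    with open(path, "r") as handle:
        return combine_lines(handle.read())


# Sample block taken from the question.
sample = (
    "[06/05/2020 08:30:16] \n"
    "othertext           <000.000.000.000>    xx s \n"
    "example           <000.000.000.000>      xx s  \n"
)

for row in combine_lines(sample):
    print(row)
```

With a real file, something like combine_from_file(filename) would replace the line-by-line loop in the question. Reading the whole file is fine for logs that fit comfortably in memory; for very large files a different approach may be needed.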

The key elements of the regex:

[\s\S]*?
   ^                idiomatically, this matches any character, including newlines
      ^             zero or more
       ^            not greedily; otherwise the match runs on to the last
                    (^example.*) line in the text, skipping the earlier ones
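To see why the non-greedy ? matters, here is a small comparison, assuming a file with two repeated blocks. The "11 s" and "22 s" values are invented so the two example lines can be told apart, and STAMP, lazy and greedy are illustrative names only.

```python
import re

# Two repeated blocks; the 11/22 values are made up for this demonstration.
text = (
    "[06/05/2020 08:30:16] \n"
    "othertext <000.000.000.000> xx s\n"
    "example <000.000.000.000> 11 s\n"
    "[06/05/2020 08:31:16] \n"
    "othertext <000.000.000.000> xx s\n"
    "example <000.000.000.000> 22 s\n"
)

STAMP = r"(^\[\d\d/\d\d/\d\d\d\d[ ]+\d\d:\d\d:\d\d\])\s+"
lazy = re.compile(STAMP + r"[\s\S]*?(^example.*)", re.M)   # as in the answer
greedy = re.compile(STAMP + r"[\s\S]*(^example.*)", re.M)  # without the ?

lazy_pairs = [m.groups() for m in lazy.finditer(text)]
greedy_pairs = [m.groups() for m in greedy.finditer(text)]

# Lazy: each timestamp is paired with the example line of its own block.
print(lazy_pairs)
# Greedy: a single match that pairs the first timestamp with the last
# example line, swallowing everything in between.
print(greedy_pairs)
```

The lazy version yields two pairs, one per block; the greedy version yields a single pair that joins the 08:30:16 timestamp to the "22 s" line.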

Upvotes: 1
