Catherine

Reputation: 747

Python: Get specific text in a line of a file using Regex

I am using Python to search through a text log file line by line, and I want to save a certain part of each line as a variable. I am using a regex, but I don't think I am using it correctly because I always get None for my variable string_I_want. I looked at other regex questions on here and saw people adding .group() to the end of their re.search call, but that gives me an error. I am not very familiar with regex and can't figure out where I am going wrong.

Sample log file:

2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165

My script:

def get_data(log_file):

    #Read file line by line
    with open(log_file) as f:
        f = f.readlines()

        for line in f:
            date = line[0:10]
            time = line[11:19]

            string_I_want=re.search(r'/m=\w*/g',line)

            print date, time, string_I_want

Upvotes: 2

Views: 106

Answers (3)

mR.aTA

Reputation: 314

Here is what you need:

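import re
def get_data(logfile):
    # The with block closes the file even if an error occurs
    with open(logfile, "r") as f:
        for line in f:
            # re.search returns None when the line has no " m=" field,
            # so check the match object before calling .group()
            match = re.search(r'(?<=\sm=)\S+', line)
            if match:
                print(match.group())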

Upvotes: 0

Wiktor Stribiżew

Reputation: 626689

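You need to remove the /.../ delimiters and the g (global) flag: Python regex patterns are plain strings, so r'/m=\w*/g' looks for literal slashes and a literal g, never matches, and re.search returns None (which is also why calling .group() on the result raises an error). Then use a capturing group to extract only the value: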

mObj = re.search(r'm=(\w+)',line)
if mObj:
    string_I_want = mObj.group(1)

Here is a complete Python demo:

import re
p = r'm=(\w+)'              # Init the regex with a raw string literal (so, no need to use \\w, just \w is enough)
s = "2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165"
mObj = re.search(p, s)      # Execute a regex-based search
if mObj:                    # Check if we got a match
    print(mObj.group(1))    # DEMO: Print the Group 1 value
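
Applied to the asker's get_data function, the same pattern could look roughly like the sketch below. It assumes every log line starts with the date and time in the fixed positions shown in the sample, and it yields results instead of printing them; the M_VALUE name and the generator design are illustrative choices, not part of the original script.

```python
import re

# Compiled once because it is reused for every line (name is illustrative).
M_VALUE = re.compile(r'm=(\w+)')

def get_data(log_file):
    """Yield (date, time, value) for each line that contains an m=... field."""
    with open(log_file) as f:
        for line in f:
            match = M_VALUE.search(line)
            if match:  # skip lines without an m= field instead of failing
                # Fixed-width slices, as in the question's script
                yield line[0:10], line[11:19], match.group(1)
```

Calling list(get_data(path)) on a file containing only the sample line gives one tuple with the date, the time, and string_I_want.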

Pattern details:

  • m= - matches the literal character sequence m= (add a space or \b before it if m must be matched as a whole word)
  • (\w+) - Group 1, capturing one or more alphanumeric or underscore characters. We can retrieve this value with the .group(1) method.
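
To illustrate the note about \b, here is a small sketch. The helper name extract_m_value and the "num=5" input are made up for the example; the log line is the sample from the question.

```python
import re

def extract_m_value(line):
    """Return the value after a standalone 'm=' key, or None if it is absent."""
    # \b requires a word boundary before 'm', so 'num=5' is not treated as 'm=5'
    match = re.search(r'\bm=(\w+)', line)
    return match.group(1) if match else None

sample = ("2016-03-08 11:23:25  test_data:0317: m=string_I_want "
          "max_count: 17655, avg_size: 320, avg_rate: 165")

print(extract_m_value(sample))   # string_I_want
print(extract_m_value("num=5"))  # None
```

Without the \b, the pattern m=(\w+) would find m=5 inside num=5 and return 5.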

Upvotes: 2

heemayl

Reputation: 41987

Use a lookbehind, so that only the text after m= is matched:

(?<=\sm=)\S+

Example:

In [135]: s = '2016-03-08 11:23:25  test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165'

In [136]: re.search(r'(?<=\sm=)\S+', s).group()
Out[136]: 'string_I_want'

Upvotes: 0
