Python RegExp retrieve value from matching string

Question

Deal all

I faced some not trivial problem for me to parse log.

I need to go through a file and check if the line matches the patter : if YES then get ClientID specified in this line.

The line looks like :

17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'

So I need to get 99071901.

I tried to construct regexp search pattern, but it is not complete..stuck at 'TRACE':

regex = '(^[(\d\.)]+) ([(\d\:)]+) ([\bTRACE\b]+) ([(\d)]+) ([\bGDS\b:)]+) ([\ClientID\b])'

Script code is :

log=open('t.log','r')
for i in log:
    key=re.search(regex,i)
    print(key.group()) #print string matching 
    for g in key:
        client_id=re.seach(????,g) # find ClientIt    
log.close()

Appreciate if you give me a hint how to solve this challenge.

Thank you.

Wiktor Stribiżew · Accepted Answer

You may use capturing parentheses around those parts in the pattern that you are interested in, and then access those parts using group(n) where n is the corresponding group ID:

import re
s = "17.02.09 10:42:31.242 TRACE [1245] GDS:     someText(SomeText).ClientID: '' -> '99071901'"
regex = r"^([\d.]+)\s+([\d.:]+)\s+(TRACE)\s+\[(\d+)] GDS:.*?ClientID:\s*''\s*->\s*'(\d+)'$"
m = re.search(regex, s)
if m:
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
    print(m.group(4))
    print(m.group(5))

See the Python online demo

The pattern is

^([\d.]+)\s+([\d.:]+)\s+(TRACE)\s+\[(\d+)] GDS:.*?ClientID:\s*''\s*->\s*'(\d+)'$

See its online demo here.

Note that you have messed the character classes with groups: (...) groups subpatterns and captures them while [...] defines character classes that match single characters.

Python RegExp retrieve value from matching string

Answers (2)

Related Questions