Reputation: 1190
I am trying to extract a substring from a string using a regex. My function takes a word as a parameter, and the goal is to extract the very next word (by my definition of a word) after that match. I have tried a lookbehind and some other approaches, but I could not get the expected results, so any help is welcome.
As an example, in the first case the input to my function is **THttpServer**:
23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)
23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)
Expected result: `transportTCPChanged` and `transportUDPOpened`, respectively.
In another case, the input is **CurrentUserConnection**:
23:25:16.622: INFO: CurrentUserConnection#1:RQ : subscribed(userID: 1)
23:25:16.622: INFO: CurrentUserConnection#8:RP : disconnected
Expected result: `subscribed` and `disconnected`, respectively.
Things I have tried (the lookbehind changes depending on the example) on Notepad++:
`(?<=THttpServer)(\w+)`: no matches
`(?<=THttpServer)(.*)`: obviously returns the whole rest of the line, which is not the expected match
I am a bit confused. Maybe it's not even possible? Or do I need some pre-processing?
Upvotes: 1
Views: 2074
Reputation: 627607
You need to match the `:` after `THttpServer` and any non-word chars up to the word, and then match and capture that word with `(\w+)`.
E.g. you may use
THttpServer:\W*(\w+)
See the regex demo.
Details
`THttpServer:` - a literal substring
`\W*` - any 0+ non-word chars
`(\w+)` - Capturing group 1 (later accessible via `m.group(1)`): 1 or more word chars.
See the Python demo:
import re

strs = ['23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)',
        '23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)']
# Group 1 captures the first word after "THttpServer:" and any non-word chars
rx = re.compile(r'THttpServer:\W*(\w+)')
for s in strs:
    m = rx.search(s)
    if m:
        print("Found '{}' in '{}'.".format(m.group(1), s))
Output:
Found 'transportTCPChanged' in '23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)'.
Found 'transportUDPOpened' in '23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)'.
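The pattern above hardcodes `THttpServer:`, so it would not match the `CurrentUserConnection#1:RQ : subscribed` lines from the question, where extra text sits between the word and the colon. Below is a minimal sketch of one way to generalize it, assuming the tag always has the form "word, optional non-space suffix, optional spaces, colon". The function name `next_word_after` and the pattern `\S*\s*:\s*(\w+)` are illustrative choices, not something taken from the question, so adjust them if your log format differs.

```python
import re

def next_word_after(word, line):
    """Return the first word following `word`'s tag in `line`, or None."""
    # re.escape keeps any regex metacharacters in `word` literal.
    # Assumed tag shape: word + optional non-space suffix (e.g. "#1:RQ")
    # + optional spaces + ":" + optional spaces, then the word to capture.
    pattern = re.escape(word) + r'\S*\s*:\s*(\w+)'
    m = re.search(pattern, line)
    return m.group(1) if m else None

lines = [
    ('THttpServer', '23:25:04.805: INFO: THttpServer: transportTCPChanged(state: DISCONNECTED 2)'),
    ('THttpServer', '23:25:13.120: INFO: THttpServer: transportUDPOpened(state: Port 54)'),
    ('CurrentUserConnection', '23:25:16.622: INFO: CurrentUserConnection#1:RQ : subscribed(userID: 1)'),
    ('CurrentUserConnection', '23:25:16.622: INFO: CurrentUserConnection#8:RP : disconnected'),
]
for word, line in lines:
    print(word, '->', next_word_after(word, line))
```

For the four sample lines this yields `transportTCPChanged`, `transportUDPOpened`, `subscribed` and `disconnected`. The greedy `\S*` first consumes any suffix up to the next space and backtracks when the colon directly follows the word, which is why the same pattern covers both log shapes.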
Upvotes: 1